[jira] Commented: (HIVE-287) support count(*) and count distinct on multiple columns
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891889#action_12891889 ] Arvind Prabhakar commented on HIVE-287: --- Updated the wiki in all the above places. > support count(*) and count distinct on multiple columns > --- > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0 >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0, 0.7.0 > > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch, > HIVE-287-6-branch-0.6.patch, HIVE-287-6-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) support count(*) and count distinct on multiple columns
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888106#action_12888106 ] Arvind Prabhakar commented on HIVE-287: --- @John: I updated the UDAF documentation on the wiki at http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF and also added a short blurb regarding the various interface changes on the UDAF tutorial page http://wiki.apache.org/hadoop/Hive/GenericUDAFCaseStudy#Writing_the_source. Please let me know if there are other places that need to be updated as well. > support count(*) and count distinct on multiple columns > --- > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0 >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0, 0.7.0 > > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch, > HIVE-287-6-branch-0.6.patch, HIVE-287-6-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886950#action_12886950 ] Arvind Prabhakar commented on HIVE-287: --- bq. 1. Change the comments for the 2 new fields. It's easy for UDAF writers to assume that the UDAF itself needs to handle whether it's distinct or whether it's all columns. I updated the javadocs in various places to make this clear. bq. 2. Deprecate the old interface, and move all existing GenericUDAF to inherit from the new one. No changes necessary for this - the previously submitted patch also did it. All existing generic UDAFs now extend from the abstract class that implements the new interface. If you see any problems with that let me know. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch, > HIVE-287-6-branch-0.6.patch, HIVE-287-6-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Attachment: HIVE-287-6-trunk.patch HIVE-287-6-branch-0.6.patch > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch, > HIVE-287-6-branch-0.6.patch, HIVE-287-6-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886804#action_12886804 ] Arvind Prabhakar commented on HIVE-287: --- I think keeping two different interfaces for UDAFs will lead to confusion in the long run. That's why the current patch deprecates the old interface in favor of the new one. But if all agree that it is a good idea, then I will go with that. Also - can you suggest an alternate name for the new interface? > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886769#action_12886769 ] Arvind Prabhakar commented on HIVE-287: --- I vote for a meeting to hash this out face-to-face. I am willing to modify the patch provided we all are in agreement as to how it should be changed. It will be a much better use of everyone's time to avoid numerous deltas to the patch before settling on the final solution. Please let me know what you think. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886339#action_12886339 ] Arvind Prabhakar commented on HIVE-287: --- @Zheng: Welcome to the party. bq. Why do we put the DISTINCT in the information? DISTINCT is currently done by the framework, instead of individual UDAF. This is good because the logic of removing duplicates are common for all UDAFs. We do support SUM(DISTINCT val). Providing the information in the parameter specification is not the same as enforcing its interpretation. This is provided primarily to ensure that UDAFs that rely on this information can make appropriate decisions. For example, we wanted to disallow the invocation {{COUNT( EXPR1, EXPR2 ...)}} in favor of {{COUNT(*DISTINCT* EXPR1, EXPR2 ...)}}. Without this information, the count UDAF will not be able to enforce the latter syntax. bq. Why do we special-case "\*"? It seems to me that "\*" is just a short-cut. Hive already supports regex-based multi-column specification, so that we can say `abc.*` for all columns with name starting with abc. The compiler should just expand * and give all the columns to the UDAF. If you wish to use \* as a regular expression, you would have to quote it as a string - {{COUNT('\*')}}. This is different from the invocation as specified in SQL which treats \* as a terminal symbol. So if it is OK to deviate from the standard representation, the user can easily use the quoted string representation to achieve the effect similar to {{COUNT(col1, col2 ..)}}. The semantics of this should be more like {{COUNT(DISTINCT EXPR1, EXPR2 ...)}} as opposed to {{COUNT(\*)}}. bq. Since COUNT(\*) is a special-case in the SQL standard (COUNT(\*) is different from COUNT(col) even if the table has a single column col), I think we should just special-case that and replace that with count(1) at some place. 
Are you suggesting that we allow the grammar to express {{COUNT(\*)}} syntax, but in the lexical analysis stage turn it into a {{COUNT(1)}}? I can see how that may work - but personally I am not a fan of such an approach. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
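The difference between {{COUNT(\*)}}, {{COUNT(col)}} and {{COUNT(DISTINCT EXPR1, EXPR2 ...)}} discussed in the comment above can be illustrated with a small sketch. This is only a Python model of the usual SQL semantics (rows with a NULL in any counted column are skipped, except for the star form); the function names and the sample rows are made up for illustration and are not part of Hive.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical two-column rows (col1, col2); None stands in for SQL NULL.
Row = Tuple[Optional[int], Optional[int]]


def count_star(rows: Sequence[Row]) -> int:
    """COUNT(*): every row counts, including rows that contain NULLs."""
    return len(rows)


def count_col(rows: Sequence[Row], idx: int) -> int:
    """COUNT(col): only rows where the given column is not NULL."""
    return sum(1 for row in rows if row[idx] is not None)


def count_distinct(rows: Sequence[Row], *idxs: int) -> int:
    """COUNT(DISTINCT c1, c2, ...): distinct combinations with no NULL column."""
    seen = {
        tuple(row[i] for i in idxs)
        for row in rows
        if all(row[i] is not None for i in idxs)
    }
    return len(seen)


rows = [(1, 10), (1, 10), (1, 20), (None, 20), (2, None)]
print(count_star(rows), count_col(rows, 0), count_distinct(rows, 0, 1))
```

Even on a single-column table the star form and the column form can disagree as soon as a NULL is present, which is why the star form cannot simply be expanded into the list of columns.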
[jira] Commented: (HIVE-1453) Build configuration changes introduced regression in launch configurations
[ https://issues.apache.org/jira/browse/HIVE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886131#action_12886131 ] Arvind Prabhakar commented on HIVE-1453: Review posted: http://review.hbase.org/r/280/ > Build configuration changes introduced regression in launch configurations > -- > > Key: HIVE-1453 > URL: https://issues.apache.org/jira/browse/HIVE-1453 > Project: Hadoop Hive > Issue Type: Bug > Environment: All Eclipse environments >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Attachments: HIVE-1453.patch > > > The changes to prepare for branching out 0.6.0 required [changes to build > configuration|http://svn.apache.org/viewvc/hadoop/hive/trunk/build.properties?r1=952877&r2=956430] > which caused the launch configurations to break as the jars they referred to > were renamed automatically. As a result, none of the launch configurations > are working at this point. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1453) Build configuration changes introduced regression in launch configurations
[ https://issues.apache.org/jira/browse/HIVE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1453: --- Status: Patch Available (was: Open) Modified the launch configurations to use parameterized version suffix for jars instead of hardcoding them to a specific jar version. > Build configuration changes introduced regression in launch configurations > -- > > Key: HIVE-1453 > URL: https://issues.apache.org/jira/browse/HIVE-1453 > Project: Hadoop Hive > Issue Type: Bug > Environment: All Eclipse environments >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Attachments: HIVE-1453.patch > > > The changes to prepare for branching out 0.6.0 required [changes to build > configuration|http://svn.apache.org/viewvc/hadoop/hive/trunk/build.properties?r1=952877&r2=956430] > which caused the launch configurations to break as the jars they referred to > were renamed automatically. As a result, none of the launch configurations > are working at this point. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1453) Build configuration changes introduced regression in launch configurations
[ https://issues.apache.org/jira/browse/HIVE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1453: --- Attachment: HIVE-1453.patch > Build configuration changes introduced regression in launch configurations > -- > > Key: HIVE-1453 > URL: https://issues.apache.org/jira/browse/HIVE-1453 > Project: Hadoop Hive > Issue Type: Bug > Environment: All Eclipse environments >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Attachments: HIVE-1453.patch > > > The changes to prepare for branching out 0.6.0 required [changes to build > configuration|http://svn.apache.org/viewvc/hadoop/hive/trunk/build.properties?r1=952877&r2=956430] > which caused the launch configurations to break as the jars they referred to > were renamed automatically. As a result, none of the launch configurations > are working at this point. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1453) Build configuration changes introduced regression in launch configurations
Build configuration changes introduced regression in launch configurations -- Key: HIVE-1453 URL: https://issues.apache.org/jira/browse/HIVE-1453 Project: Hadoop Hive Issue Type: Bug Environment: All Eclipse environments Reporter: Arvind Prabhakar Assignee: Arvind Prabhakar The changes to prepare for branching out 0.6.0 required [changes to build configuration|http://svn.apache.org/viewvc/hadoop/hive/trunk/build.properties?r1=952877&r2=956430] which caused the launch configurations to break as the jars they referred to were renamed automatically. As a result, none of the launch configurations are working at this point. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1432) Create a test case for case sensitive comparison done during field comparison
[ https://issues.apache.org/jira/browse/HIVE-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885773#action_12885773 ] Arvind Prabhakar commented on HIVE-1432: Review posted: http://review.hbase.org/r/276/ > Create a test case for case sensitive comparison done during field comparison > - > > Key: HIVE-1432 > URL: https://issues.apache.org/jira/browse/HIVE-1432 > Project: Hadoop Hive > Issue Type: Task > Components: Query Processor >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Fix For: 0.7.0 > > Attachments: HIVE-1432.patch > > > See HIVE-1271. This jira tracks the creation of a test case to test this fix > specifically. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1451) Creating a table stores the full address of namenode in the metadata. This leads to problems when the namenode address changes.
Creating a table stores the full address of namenode in the metadata. This leads to problems when the namenode address changes. --- Key: HIVE-1451 URL: https://issues.apache.org/jira/browse/HIVE-1451 Project: Hadoop Hive Issue Type: Bug Components: Metastore, Query Processor Affects Versions: 0.5.0 Environment: Any Reporter: Arvind Prabhakar Assignee: Arvind Prabhakar Here is an excerpt from table metadata for an arbitrary table {{table1}}: {noformat} hive> describe extended table1; OK ... Detailed Table Information ... location:hdfs://localhost:9000/user/arvind/hive/warehouse/table1, ... {noformat} As can be seen, the full address of namenode is captured in the location information for the table. This information is later used to run any queries on the table - thus making it impossible to change the namenode location once the table has been created. For example, for the above table, a query will fail if the namenode is migrated from port 9000 to 8020: {noformat} hive> select * from table1; OK Failed with exception java.io.IOException:java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused Time taken: 10.78 seconds hive> {noformat} It should be possible to change the namenode location regardless of when the tables are created. Also, any query execution should work with the configured namenode at that point in time rather than requiring the configuration to be exactly the same at the time when the tables were created. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
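One way to picture the behaviour requested in HIVE-1451 is to keep only the path part of the stored location and resolve it against whatever filesystem is configured at query time. The sketch below is a Python illustration of that idea only, not the fix adopted in Hive; {{rebase_location}} and its {{default_fs}} parameter are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit


def rebase_location(stored_location: str, default_fs: str) -> str:
    """Keep the path of the stored table location, but take the scheme and
    the namenode host:port from the currently configured filesystem URI."""
    stored = urlsplit(stored_location)
    current = urlsplit(default_fs)
    return urlunsplit((current.scheme, current.netloc, stored.path, "", ""))


# The location captured at table creation time (from the report above),
# resolved after the namenode has moved from port 9000 to 8020.
print(rebase_location(
    "hdfs://localhost:9000/user/arvind/hive/warehouse/table1",
    "hdfs://localhost:8020",
))
```

A real change would also have to decide what to do with tables that deliberately live on a different filesystem, which this sketch ignores.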
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885766#action_12885766 ] Arvind Prabhakar commented on HIVE-287: --- Review board review posted: http://review.hbase.org/r/275/ > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Status: Patch Available (was: Open) Uploaded patch for trunk and branch-0.6. Ran all the tests on trunk and did spot testing on branch-0.6. *Changes from Previous patch:* * Modified the implementation of {{AbstractGenericUDAFResolver}} to raise an exception when invoked with the {{UDAF(STAR)}} syntax. * Added negative test cases to assert that the current UDAFs present in the code other than {{COUNT}} do not accept the {{UDAF(STAR)}} syntax. * Added {{EXPLAIN}} directives for the queries run in {{udf_count.q}} test file. Will attempt to post the patch on review board as well. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
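The first change listed in the update above (a base resolver that rejects the {{UDAF(STAR)}} syntax unless a function opts in) can be sketched as follows. This is a Python analogue written purely for illustration; the class and method names are hypothetical and do not mirror the Java signatures of {{AbstractGenericUDAFResolver}}. The multi-argument check in the count resolver reflects the intent stated elsewhere on this issue to require {{DISTINCT}} for {{COUNT(EXPR1, EXPR2 ...)}}.

```python
class SemanticError(Exception):
    """Raised when an aggregate call is rejected during resolution."""


class BaseAggregateResolver:
    """Hypothetical base resolver: the star syntax is rejected by default."""

    name = "udaf"

    def resolve(self, arg_types, is_distinct=False, is_all_columns=False):
        if is_all_columns:
            raise SemanticError(f"{self.name}(*) is not supported")
        return (self.name, tuple(arg_types), is_distinct)


class SumResolver(BaseAggregateResolver):
    """Inherits the default, so sum(*) fails during resolution."""

    name = "sum"


class CountResolver(BaseAggregateResolver):
    """Overrides the default to accept count(*)."""

    name = "count"

    def resolve(self, arg_types, is_distinct=False, is_all_columns=False):
        if is_all_columns:
            return (self.name, (), False)
        if len(arg_types) > 1 and not is_distinct:
            raise SemanticError("count with multiple arguments requires DISTINCT")
        return super().resolve(arg_types, is_distinct)


print(CountResolver().resolve([], is_all_columns=True))
print(SumResolver().resolve(["int"], is_distinct=True))
```

With this shape the decision to support the star form stays with each function rather than with the grammar.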
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Attachment: HIVE-287-5-trunk.patch HIVE-287-5-branch-0.6.patch > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1432) Create a test case for case sensitive comparison done during field comparison
[ https://issues.apache.org/jira/browse/HIVE-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884875#action_12884875 ] Arvind Prabhakar commented on HIVE-1432: Adding the test case to exercise the fix for HIVE-1271. This test was run manually on trunk revision prior to commit of HIVE-1271 and produced the following error: {noformat} 2010-07-02 17:13:05,085 DEBUG lazy.LazySimpleSerDe (LazySimpleSerDe.java:initialize(212)) - LazySimpleSerDe initialized with: columnNames=[info] columnTypes=[struct] separator=...@dbf2988] nullstring=\N lastColumnTakesRest=false 2010-07-02 17:13:05,089 ERROR ql.Driver (SessionState.java:printError(277)) - FAILED: Error in semantic analysis: line 4:23 Cannot insert into target table because column number/types are different table2: Cannot convert column 0 from struct to struct. org.apache.hadoop.hive.ql.parse.SemanticException: line 4:23 Cannot insert into target table because column number/types are different table2: Cannot convert column 0 from struct to struct. at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:3573) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:3434) {noformat} This test passes on the current trunk. > Create a test case for case sensitive comparison done during field comparison > - > > Key: HIVE-1432 > URL: https://issues.apache.org/jira/browse/HIVE-1432 > Project: Hadoop Hive > Issue Type: Task > Components: Query Processor >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1432.patch > > > See HIVE-1271. This jira tracks the creation of a test case to test this fix > specifically. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1432) Create a test case for case sensitive comparison done during field comparison
[ https://issues.apache.org/jira/browse/HIVE-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1432: --- Status: Patch Available (was: Open) Patch available. Please review. > Create a test case for case sensitive comparison done during field comparison > - > > Key: HIVE-1432 > URL: https://issues.apache.org/jira/browse/HIVE-1432 > Project: Hadoop Hive > Issue Type: Task > Components: Query Processor >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1432.patch > > > See HIVE-1271. This jira tracks the creation of a test case to test this fix > specifically. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1432) Create a test case for case sensitive comparison done during field comparison
[ https://issues.apache.org/jira/browse/HIVE-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1432: --- Attachment: HIVE-1432.patch > Create a test case for case sensitive comparison done during field comparison > - > > Key: HIVE-1432 > URL: https://issues.apache.org/jira/browse/HIVE-1432 > Project: Hadoop Hive > Issue Type: Task > Components: Query Processor >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1432.patch > > > See HIVE-1271. This jira tracks the creation of a test case to test this fix > specifically. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884580#action_12884580 ] Arvind Prabhakar commented on HIVE-287: --- Thanks for the explanation John. The SQL BNF that you pointed out is the normative SQL specification. I do not think any SQL implementations use this grammar though. The parallel is that of an interface and its implementation. While the interface can be short and precise, the implementations may choose to delegate interface methods to other implementation specific methods. Similarly, most databases deal with their own SQL grammar that is compliant with the SQL standard at specific levels. More to the point in Hive - my key concern is that by modifying the grammar to make an exception for {{COUNT}}, we will be introducing a brittle coupling between the parser and semantic analyzer. Right now the count aggregate function is treated like any other function and is thus part of the general framework. By making this change, we will be modifying it to be specifically associated with the grammar directives. 
This is the current function definition in Hive QL grammar (*A*):
{noformat}
-->[ functionName ]-->[ LPAREN ]--+-->[ KW_DISTINCT ]--+--+--+-->[ expression ]--+--+-->[ RPAREN ]-->
                                  |                    |  |  |                   |  |
                                  +------------------->+  |  +-----[ COMMA ]<----+  |
                                                          |                         |
                                                          +------------>------------+
{noformat}
The patch that I have supplied already on this Jira modifies this definition as follows (*B*):
{noformat}
-->[ functionName ]-->[ LPAREN ]--+----------------------->[ STAR ]-----------------------+-->[ RPAREN ]-->
                                  |                                                       |
                                  +--+-->[ KW_DISTINCT ]--+--+--+-->[ expression ]--+--+--+
                                     |                    |  |  |                   |  |
                                     +------------------->+  |  +-----[ COMMA ]<----+  |
                                                             |                         |
                                                             +------------>------------+
{noformat}
If I were to modify the grammar to make an exception for {{COUNT}} it will likely be changed to something like this (*C*):
{noformat}
--+-->[ KW_COUNT ]-->[ LPAREN ]-->[ STAR ]-->[ RPAREN ]------------------------------------------------+-->
  |                                                                                                    |
  +-->[ functionName ]-->[ LPAREN ]--+--+-->[ KW_DISTINCT ]--+--+-->[ expression ]--+--+-->[ RPAREN ]--+
                                     |  |                    |  |                   |  |
                                     |  +------------------->+  +-----[ COMMA ]<----+  |
                                     |                                                 |
                                     +------------------------>------------------------+
{noformat}
Consider the *C* approach closely: The production that matches a {{COUNT}} invocation can be directly matched via the top branch using {{KW_COUNT}} token, or it could follow the branch below where {{functionName}} could match {{COUNT}}. On the semantic analyzer side, it makes the matching logic more complex and less intuitive since now the {{COUNT}} can be invoked via two branch conditions. For example - there would be one invocation that would directly delegate to the {{COUNT}} aggregate function, whereas another that will use the current resolver mechanism to invoke it.

Instead, the approach *B* keeps the grammar consistent with the regular function invocation. It does not favor any one function over the other and simply establishes matching rules for function production. That way, the call is then delegated to the semantic analyzer which in turn matches the appropriate handling function based on the name and parameter type using the generic resolver mechanism without regard to what function is being invoked.
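One reading of production (*B*) can be written as a small hand-rolled parser, which makes it easy to see which invocations the grammar itself accepts. This is a Python sketch under simplifying assumptions: tokens are plain strings, every expression is a single token, and {{parse_function_call}} is a hypothetical helper rather than anything in the Hive parser.

```python
class ParseError(Exception):
    """Raised when the token list does not match production (B)."""


RESERVED = {"(", ")", ",", "*", "DISTINCT"}


def parse_function_call(tokens):
    """functionName LPAREN ( STAR | [DISTINCT] [expr (COMMA expr)*] ) RPAREN"""
    toks = list(tokens)
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def expect(value=None):
        nonlocal pos
        tok = peek()
        if tok is None or (value is not None and tok != value):
            raise ParseError(f"expected {value or 'a token'}, got {tok!r}")
        pos += 1
        return tok

    def expression():
        tok = expect()
        if tok in RESERVED:
            raise ParseError(f"expected an expression, got {tok!r}")
        return tok

    name = expression()
    expect("(")
    star, distinct, args = False, False, []
    if peek() == "*":
        expect("*")                      # top branch: STAR on its own
        star = True
    else:
        if peek() == "DISTINCT":         # optional DISTINCT
            expect("DISTINCT")
            distinct = True
        if peek() != ")":                # optional expression list
            args.append(expression())
            while peek() == ",":
                expect(",")
                args.append(expression())
    expect(")")
    if peek() is not None:
        raise ParseError(f"unexpected trailing token {peek()!r}")
    return {"name": name, "star": star, "distinct": distinct, "args": args}


print(parse_function_call(["count", "(", "*", ")"]))
print(parse_function_call(["count", "(", "DISTINCT", "col1", ",", "col2", ")"]))
```

Because STAR sits on its own branch, token sequences such as {{f(DISTINCT \*)}} or {{f(\*, col1)}} are rejected by the parser before the semantic analyzer is ever involved, and no function name is singled out.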
The changes supplied in the current patch also allow individual function handlers to decide if they would like to support the {{functionName(STAR)}} syntax. Since you feel strongly about not supporting this syntax by default for any function, I can perhaps modify the {{AbstractGenericUDAFResolver}} class to raise an exception if invoked with this syntax. That way, only the functions that choose to overwrite that behavior will be able to support it. Also as you can see in the syntax diagram for (B), there is no production that will match things like {{functionName(DISTINCT STAR)}} or {{functionName(STAR, EX
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884125#action_12884125 ] Arvind Prabhakar commented on HIVE-287: --- @John - are you suggesting that the grammar be updated to restrict the single star argument to the specific function {{COUNT}}? If not in the grammar, where else do you think these restrictions should be coded? In either case, what other subsystems do you think will be impacted by this change, and what downstream changes do you suggest to accommodate it? P.S. I ask these questions to make the best use of both our time and to reduce the number of back-and-forths to the extent possible. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883214#action_12883214 ] Arvind Prabhakar commented on HIVE-1271: @Zheng: Please see the second comment. This patch uses C2 method - comparing field names in a case insensitive manner. > Case sensitiveness of type information specified when using custom reducer > causes type mismatch > --- > > Key: HIVE-1271 > URL: https://issues.apache.org/jira/browse/HIVE-1271 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: Dilip Joseph >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1271-1.patch, HIVE-1271.patch > > > Type information specified while using a custom reduce script is converted > to lower case, and causes type mismatch during query semantic analysis . The > following REDUCE query where field name = "userId" failed. > hive> CREATE TABLE SS ( >> a INT, >> b INT, >> vals ARRAY> >> ); > OK > hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s >> INSERT OVERWRITE TABLE SS >> REDUCE * >> USING 'myreduce.py' >> AS >> (a INT, >> b INT, >> vals ARRAY> >> ) >> ; > FAILED: Error in semantic analysis: line 2:27 Cannot insert into > target table because column number/types are different SS: Cannot > convert column 2 from array> to > array>. > The same query worked fine after changing "userId" to "userid". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
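The approach named in the comment above (comparing field names in a case insensitive manner) can be sketched like this. It is a Python illustration only: the {{StructType}} class, the {{value}} field and the type names are hypothetical, since the struct definitions in the issue description did not survive in this archive; only the {{userId}} / {{userid}} pair comes from the report. Treating type names as case insensitive as well is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StructType:
    """Hypothetical stand-in for a struct type: ordered (field name, type) pairs."""

    fields: Tuple[Tuple[str, str], ...]


def same_type_ignoring_case(a: StructType, b: StructType) -> bool:
    """True when both structs have the same fields in the same order,
    comparing field names and type names without regard to case."""
    if len(a.fields) != len(b.fields):
        return False
    return all(
        a_name.lower() == b_name.lower() and a_type.lower() == b_type.lower()
        for (a_name, a_type), (b_name, b_type) in zip(a.fields, b.fields)
    )


declared = StructType((("userId", "int"), ("value", "string")))
stored = StructType((("userid", "int"), ("value", "string")))

print(declared == stored)                         # exact comparison differs
print(same_type_ignoring_case(declared, stored))  # case-insensitive comparison matches
```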
[jira] Commented: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882048#action_12882048 ] Arvind Prabhakar commented on HIVE-1271: @Ashish: I created HIVE-1432 to track the test case creation. I will be submitting a patch for that soon. Thanks for pointing this out. > Case sensitiveness of type information specified when using custom reducer > causes type mismatch > --- > > Key: HIVE-1271 > URL: https://issues.apache.org/jira/browse/HIVE-1271 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: Dilip Joseph >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1271-1.patch, HIVE-1271.patch > > > Type information specified while using a custom reduce script is converted > to lower case, and causes type mismatch during query semantic analysis . The > following REDUCE query where field name = "userId" failed. > hive> CREATE TABLE SS ( >> a INT, >> b INT, >> vals ARRAY> >> ); > OK > hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s >> INSERT OVERWRITE TABLE SS >> REDUCE * >> USING 'myreduce.py' >> AS >> (a INT, >> b INT, >> vals ARRAY> >> ) >> ; > FAILED: Error in semantic analysis: line 2:27 Cannot insert into > target table because column number/types are different SS: Cannot > convert column 2 from array> to > array>. > The same query worked fine after changing "userId" to "userid". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1432) Create a test case for case sensitive comparison done during field comparison
Create a test case for case sensitive comparison done during field comparison
Key: HIVE-1432
URL: https://issues.apache.org/jira/browse/HIVE-1432
Project: Hadoop Hive
Issue Type: Task
Components: Query Processor
Reporter: Arvind Prabhakar
Assignee: Arvind Prabhakar
Fix For: 0.6.0
See HIVE-1271. This jira tracks the creation of a test case to test this fix specifically.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arvind Prabhakar updated HIVE-1176:
Attachment: HIVE-1176-6.patch
> 'create if not exists' fails for a table name with 'select' in it
>
> Key: HIVE-1176
> URL: https://issues.apache.org/jira/browse/HIVE-1176
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Metastore, Query Processor
> Reporter: Prasad Chakka
> Assignee: Arvind Prabhakar
> Fix For: 0.6.0
>
> Attachments: HIVE-1176-1.patch, HIVE-1176-2.patch, HIVE-1176-3.patch, HIVE-1176-4.patch, HIVE-1176-5.patch, HIVE-1176-6.patch, HIVE-1176.lib-files.tar.gz, HIVE-1176.patch
>
> hive> create table if not exists tmp_select(s string, c string, n int);
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: javax.jdo.JDOUserException JDOQL Single-String query should always start with SELECT)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:441)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:423)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:5538)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5192)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
>     at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: MetaException(message:Got exception: javax.jdo.JDOUserException JDOQL Single-String query should always start with SELECT)
>     at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:612)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:450)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:439)
>     ... 15 more
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881956#action_12881956 ]
Arvind Prabhakar commented on HIVE-1176:
Yes - that's what my intention was. Thanks for catching it.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881946#action_12881946 ]
Arvind Prabhakar commented on HIVE-1176:
@John: done. Please see the new patch attachment - HIVE-1176-5.patch.
Since a lot of good points came out of the discussion on this jira, I took the liberty of adding them to the Hive wiki for posterity. You can find it [here|http://wiki.apache.org/hadoop/Hive/TipsForAddingNewTests]. Please add to it any other points that you feel contributors should take into consideration while adding new tests.
[jira] Updated: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arvind Prabhakar updated HIVE-1176:
Attachment: HIVE-1176-5.patch
[jira] Updated: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arvind Prabhakar updated HIVE-1176:
Attachment: HIVE-1176-4.patch
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881566#action_12881566 ]
Arvind Prabhakar commented on HIVE-1176:
@Paul: Your suggestions are fair enough. I have incorporated all the changes you suggested except for the pre-drop, which I left out based on @John's response. Let me know if you guys need any further tweaking of this patch.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881538#action_12881538 ]
Arvind Prabhakar commented on HIVE-1176:
Updated patch with a test case attached. Please use HIVE-1176-3.patch. The changed files in this patch are as follows:
# modified: build.properties
# modified: build.xml
# new file: data/files/simple.txt
# modified: eclipse-templates/.classpath
# modified: ivy/ivysettings.xml
# deleted: lib/datanucleus-core-1.1.2.LICENSE
# deleted: lib/datanucleus-core-1.1.2.jar
# deleted: lib/datanucleus-enhancer-1.1.2.LICENSE
# deleted: lib/datanucleus-enhancer-1.1.2.jar
# deleted: lib/datanucleus-rdbms-1.1.2.LICENSE
# deleted: lib/datanucleus-rdbms-1.1.2.jar
# deleted: lib/jdo2-api-2.3-SNAPSHOT.LICENSE
# deleted: lib/jdo2-api-2.3-SNAPSHOT.jar
# modified: metastore/ivy.xml
# modified: metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
# new file: ql/src/test/queries/clientpositive/hive_1176.q
# new file: ql/src/test/results/clientpositive/hive_1176.q.out
[jira] Updated: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arvind Prabhakar updated HIVE-1176:
Attachment: HIVE-1176-3.patch
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881499#action_12881499 ]
Arvind Prabhakar commented on HIVE-1176:
Makes sense. Will add a test case and update the patch soon. Sorry for the misunderstanding.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881488#action_12881488 ]
Arvind Prabhakar commented on HIVE-1176:
Also, for the specific change to {{HiveMetaStoreClient.java}} - the tests under {{metastore}} validate that the new libraries are working correctly.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881486#action_12881486 ]
Arvind Prabhakar commented on HIVE-1176:
Sorry - it is not clear to me what unit test I should be writing. Can you give an example perhaps?
From my perspective, any test that uses the metastore exercises this change. And together, all the tests form an exhaustive layer that ensures that there is no regression seeping into the system. Note that this is not a functionality change, only a change of underlying libraries.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881472#action_12881472 ] Arvind Prabhakar commented on HIVE-1176: Yes - it appears that the change in behavior can be attributed to the difference in major versions.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881381#action_12881381 ] Arvind Prabhakar commented on HIVE-1176: @Paul: I just tested the patch (HIVE-1176-2.patch) on latest trunk and it seems to apply cleanly. Can you please try again and see if it works? Also, can you post the errors that you are seeing? If necessary, I can break down the patch into single-file units to help with applying it. Just let me know either way.
[jira] Commented: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881368#action_12881368 ] Arvind Prabhakar commented on HIVE-1271: @Ashish: Thanks for looking at the patch. bq. why remove the check on Category? I modified all the specialized type infos to be {{final}}, which in turn ensures that if the {{instanceof}} test succeeds, both objects must be of the same category. The check on category was therefore redundant. bq. Also why drop the default implementation of the equals method for TypeInfo? I did this for two main reasons. First, the class implemented {{equals()}} but not {{hashCode()}}, which could lead to unexpected behavior when {{TypeInfo}} instances are put in hash-based collections. Second, I made both {{equals()}} and {{hashCode()}} abstract in order to force any (new) child classes to implement both consistently. Let me know if you would like this change tweaked. > Case sensitiveness of type information specified when using custom reducer > causes type mismatch > --- > > Key: HIVE-1271 > URL: https://issues.apache.org/jira/browse/HIVE-1271 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: Dilip Joseph >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1271-1.patch, HIVE-1271.patch > > > Type information specified while using a custom reduce script is converted > to lower case, and causes type mismatch during query semantic analysis. The > following REDUCE query where field name = "userId" failed. 
> hive> CREATE TABLE SS ( >> a INT, >> b INT, >> vals ARRAY> >> ); > OK > hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s >> INSERT OVERWRITE TABLE SS >> REDUCE * >> USING 'myreduce.py' >> AS >> (a INT, >> b INT, >> vals ARRAY> >> ) >> ; > FAILED: Error in semantic analysis: line 2:27 Cannot insert into > target table because column number/types are different SS: Cannot > convert column 2 from array> to > array>. > The same query worked fine after changing "userId" to "userid". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
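The {{equals()}}/{{hashCode()}} reasoning in the HIVE-1271 comment above can be illustrated with a small, self-contained sketch. The class names below ({{TypeDesc}}, {{PrimitiveDesc}}, {{ListDesc}}) are hypothetical stand-ins and not Hive's actual {{TypeInfo}} hierarchy; the sketch only assumes the two design points named in the comment: abstract {{equals()}}/{{hashCode()}} on the base class and {{final}} subclasses.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a type-descriptor base class (not Hive's TypeInfo).
abstract class TypeDesc {
    // Redeclaring both methods as abstract forces every concrete subclass
    // to implement them together, so they cannot drift out of sync.
    @Override
    public abstract boolean equals(Object other);

    @Override
    public abstract int hashCode();
}

// final: no subclass can exist, so a successful instanceof test already
// implies "same kind of type"; no separate category check is needed.
final class PrimitiveDesc extends TypeDesc {
    private final String name;

    PrimitiveDesc(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof PrimitiveDesc
                && name.equals(((PrimitiveDesc) other).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

final class ListDesc extends TypeDesc {
    private final TypeDesc element;

    ListDesc(TypeDesc element) {
        this.element = element;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ListDesc
                && element.equals(((ListDesc) other).element);
    }

    @Override
    public int hashCode() {
        return 31 * element.hashCode() + 1;
    }
}

final class TypeDescDemo {
    // Counts distinct descriptors using a hash-based collection, which only
    // works correctly when equals() and hashCode() agree.
    static int distinctCount(TypeDesc... descs) {
        Set<TypeDesc> seen = new HashSet<>();
        for (TypeDesc d : descs) {
            seen.add(d);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        TypeDesc a = new PrimitiveDesc("int");
        TypeDesc b = new PrimitiveDesc("int");
        TypeDesc c = new ListDesc(new PrimitiveDesc("int"));
        System.out.println(distinctCount(a, b, c));
    }
}
```

If {{hashCode()}} were left at the {{Object}} default while {{equals()}} compared by value, the two equal {{PrimitiveDesc}} instances would most likely land in different hash buckets and both be retained by the set, which is the kind of unexpected behavior the comment refers to.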
[jira] Commented: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881024#action_12881024 ] Arvind Prabhakar commented on HIVE-1271: Is anyone reviewing this change? Thanks.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881023#action_12881023 ] Arvind Prabhakar commented on HIVE-1176: @Paul: Any updates on this from your end? Thanks.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881020#action_12881020 ] Arvind Prabhakar commented on HIVE-287: --- @John: Can you please take a look at the updated patch? Let me know if you have any feedback for further tweaking this change as necessary. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Attachment: HIVE-287-4.patch applies cleanly on trunk and branch-0.6
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1288#action_1288 ] Arvind Prabhakar commented on HIVE-287: --- @John: I agree with your assessment above. Regarding count(*), my earlier comment was not meant to imply that such a UDAF exists today, but that one might exist in the future. More importantly, though, using an empty parameter list as an indicator for * would blur the distinction between UDAF(*) and UDAF() invocations. This is perhaps one of many ways in which parameter overloading could lead to confusion and hard-to-understand code. I think introducing a {{GenericUDAFResolver2}} interface is a great idea. I also like the idea of using a callback to decouple the invocation from the parameter list, but I am concerned that this could lead to redundant method calls and object creation. I am not sure whether that would add any significant performance penalty in the long run. I would love to know what others interested in this issue think about this route. If all agree that adding a new interface with a callback for parameter discovery is acceptable, I can start working on that patch. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
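To make the "callback for parameter discovery" idea in the comment above more concrete, here is a minimal sketch. It is not the {{GenericUDAFResolver2}} API itself: {{ParamInfo}}, {{Resolver2}} and {{CountResolver}} are hypothetical names, parameter types are plain strings, and the validation rules for count are assumptions chosen for illustration. The point it tries to show is that UDAF(*) and UDAF() stay distinguishable because "all columns" is an explicit flag rather than an empty parameter list.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical callback object: the resolver asks it how the function was
// invoked instead of inferring that from an overloaded parameter array.
interface ParamInfo {
    List<String> parameterTypes();

    boolean isDistinct();

    boolean isAllColumns();
}

// Hypothetical second-generation resolver that receives the callback.
interface Resolver2 {
    String resolve(ParamInfo info);
}

final class CountResolver implements Resolver2 {
    @Override
    public String resolve(ParamInfo info) {
        if (info.isAllColumns()) {
            return "COUNT_ALL_ROWS"; // count(*)
        }
        int n = info.parameterTypes().size();
        if (n == 0) {
            // count() is not the same thing as count(*)
            throw new IllegalArgumentException("count() needs * or at least one argument");
        }
        if (n > 1 && !info.isDistinct()) {
            // Assumed rule for this sketch: several columns only with DISTINCT.
            throw new IllegalArgumentException("multiple arguments require DISTINCT");
        }
        return info.isDistinct() ? "COUNT_DISTINCT" : "COUNT";
    }
}

final class ParamInfoDemo {
    // Builds a ParamInfo on the fly and resolves it with CountResolver.
    static String describe(final boolean distinct, final boolean allColumns,
                           final String... types) {
        ParamInfo info = new ParamInfo() {
            @Override
            public List<String> parameterTypes() {
                return Arrays.asList(types);
            }

            @Override
            public boolean isDistinct() {
                return distinct;
            }

            @Override
            public boolean isAllColumns() {
                return allColumns;
            }
        };
        return new CountResolver().resolve(info);
    }

    public static void main(String[] args) {
        System.out.println(describe(false, true));                  // count(*)
        System.out.println(describe(true, false, "string", "int")); // count(distinct a, b)
    }
}
```

The cost concern raised in the comment is visible here as well: every resolution creates one extra object and goes through a few interface calls. Whether that matters in practice would need measuring.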
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879983#action_12879983 ] Arvind Prabhakar commented on HIVE-287: --- @John: Thanks for reviewing this change. I have some follow-up comments and suggestions:

bq. isDistinct: this doesn't actually modify the choice of evaluator implementation at all, since the actual duplicate elimination takes place upstream of the UDAF invocation. So instead of adding this parameter, can we instead add a new method supportsDistinct() on GenericUDAFEvaluator?

While the evaluation may be happening upstream, I was concerned that this does not rule out cases where the information is relevant to the function invocation itself. For example, the implementation of {{count}} requires that if there is a valid argument list, it must be qualified with {{DISTINCT}}.

bq. isAllColumns: COUNT is probably the only function which is ever even going to care about this one. Couldn't we just use an empty array of TypeInfo to indicate all columns?

I had a similar idea, but after some consideration opted for a simpler design. I felt that overloading arguments to indicate special cases might lead to confusion, and to eventual problems when a use case emerges that invalidates this assumption. I do agree with your point that it would be good to stay compatible if possible. One way to do it would be as follows:

# Revert {{GenericUDAFResolver}} to its previous state, but deprecate the interface in favor of the abstract base class.
# Push the newly introduced method into the {{AbstractGenericUDAFResolver}} implementation.
# Modify the {{FunctionRegistry.getGenericUDAFEvaluator()}} method to test whether the resolver instance is type compatible with {{AbstractGenericUDAFResolver}} and, if so, invoke the new method. Otherwise fall back to the old mechanism.

What do you think about this approach?
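The three-step compatibility approach proposed in the comment above (keep the old interface but deprecate it, put the new method on an abstract base class, and have the registry test the resolver's type) could look roughly like the following sketch. All names here are hypothetical and simplified; this is not the code from the attached patches, and string arrays stand in for real type information.

```java
final class ResolverDispatchDemo {

    // Step 1: the original interface stays as it was, but is deprecated.
    @Deprecated
    interface LegacyResolver {
        String getEvaluator(String[] paramTypes);
    }

    // Step 2: the new, richer method lives on an abstract base class. Its
    // default behavior delegates to the legacy method, so subclasses that
    // do not care about DISTINCT or * need no extra code.
    abstract static class AbstractResolver implements LegacyResolver {
        String getEvaluator(String[] paramTypes, boolean isDistinct, boolean isAllColumns) {
            return getEvaluator(paramTypes);
        }
    }

    // A resolver written against the old interface only.
    static final class OldSum implements LegacyResolver {
        @Override
        public String getEvaluator(String[] paramTypes) {
            return "sum/legacy";
        }
    }

    // A resolver that takes advantage of the new method.
    static final class NewCount extends AbstractResolver {
        @Override
        public String getEvaluator(String[] paramTypes) {
            return "count/legacy";
        }

        @Override
        String getEvaluator(String[] paramTypes, boolean isDistinct, boolean isAllColumns) {
            if (isAllColumns) {
                return "count/all-columns";
            }
            return isDistinct ? "count/distinct" : "count/plain";
        }
    }

    // Step 3: the registry checks the resolver type and invokes the new
    // method when available; otherwise it falls back to the old mechanism.
    static String dispatch(LegacyResolver resolver, String[] paramTypes,
                           boolean isDistinct, boolean isAllColumns) {
        if (resolver instanceof AbstractResolver) {
            return ((AbstractResolver) resolver)
                    .getEvaluator(paramTypes, isDistinct, isAllColumns);
        }
        return resolver.getEvaluator(paramTypes);
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new OldSum(), new String[] {"int"}, true, false));
        System.out.println(dispatch(new NewCount(), new String[0], false, true));
    }
}
```

With this shape, existing third-party resolvers keep compiling and running unchanged, at the price of one {{instanceof}} test per resolution.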
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879441#action_12879441 ] Arvind Prabhakar commented on HIVE-1176: bq. Can you elaborate on what you mean by 'some collections were being fetched as semi-populated proxies with missing session context leading to NPEs'? Is there something I can do to reproduce this?

@Paul: Here are the steps to reproduce this problem:
# Start out with a clean workspace checkout and apply the updated patch HIVE-1176-2.patch.
# Manually revert the file {{metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java}} to its previous state.
# Run {{ant package}} from the root of the workspace.
# Run {{ant test}} from within metastore.

You should see failures like the following:
{code}
[junit] testPartition() failed.
[junit] java.lang.NullPointerException
[junit] at org.datanucleus.store.mapped.scostore.AbstractMapStore.validateKeyForWriting(AbstractMapStore.java:333)
[junit] at org.datanucleus.store.mapped.scostore.JoinMapStore.put(JoinMapStore.java:252)
[junit] at org.datanucleus.sco.backed.Map.put(Map.java:640)
[junit] at org.apache.hadoop.hive.metastore.api.Table.putToParameters(Table.java:359)
[junit] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table(HiveMetaStore.java:1281)
[junit] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:140)
[junit] at org.apache.hadoop.hive.metastore.TestHiveMetaStore.testAlterTable(TestHiveMetaStore.java:728)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...
{code}
If you look at {{src/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java}}, you will notice that the map being written to on the line causing this exception should ideally be a {{HashMap}} and not an {{org.datanucleus.store.mapped.scostore.AbstractMapStore}}, as indicated by the stack trace.
This happens because the datanucleus JDO framework replaces collections with its own implementation in order to allow lazy-dereferencing and optimize for database connections/queries/memory consumption etc. Lazy loading of collections (and second class objects in general) can be disabled at a global level or at entity level. Disabling this globally is generally not recommended unless there is evidence backed by extensive testing that supports that change. Disabling at an entity level is still OK provided the entity object graph is fully dereferenced at all times. This could lead to extensive memory consumption in the system in case the entity graph is huge. My approach towards fixing the problem was to *not* change the default behavior in the general case. Instead I felt that it was better to circumvent this problem in the case of a remote metastore by creating a copy explicitly. If you have other suggestions on how to address this, please let me know. Also - more information on the lazy dereferencing mechanism used by datanucleus framework can be found [here|http://www.datanucleus.org/plugins/core/sco.html]. 
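The "create a copy explicitly" workaround described above can be sketched as follows. This is only an illustration under stated assumptions: {{detach}} is a hypothetical helper, not the code in HIVE-1176-2.patch, and an unmodifiable map is used purely as a stand-in for a store-backed collection that misbehaves when written to outside its session (the real failure was the NPE shown in the stack trace).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class DetachCopyDemo {

    // Hypothetical helper: copies a possibly framework-backed map into a
    // plain HashMap so later writes never touch the backing store.
    static Map<String, String> detach(Map<String, String> maybeBacked) {
        if (maybeBacked == null) {
            return null;
        }
        // The copy constructor reads every entry once, while the source is
        // still usable, and the result has no link back to the source.
        return new HashMap<>(maybeBacked);
    }

    public static void main(String[] args) {
        Map<String, String> stored = new HashMap<>();
        stored.put("owner", "hive");

        // Stand-in for a store-backed map that rejects writes.
        Map<String, String> backed = Collections.unmodifiableMap(stored);

        Map<String, String> copy = detach(backed);
        copy.put("comment", "safe to modify"); // works on the detached copy
        System.out.println(copy.size() + " entries in copy, "
                + backed.size() + " in original");
    }
}
```

Copying only at the boundary where objects leave the persistence layer keeps lazy loading enabled for everything else, which matches the stated goal of not changing the default behavior in the general case.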
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879432#action_12879432 ] Arvind Prabhakar commented on HIVE-1176: The updated patch HIVE-1176-2.patch contains the following changes:
# modified: build.properties
# modified: build.xml
# modified: eclipse-templates/.classpath
# modified: ivy/ivysettings.xml
# deleted: lib/datanucleus-core-1.1.2.LICENSE
# deleted: lib/datanucleus-core-1.1.2.jar
# deleted: lib/datanucleus-enhancer-1.1.2.LICENSE
# deleted: lib/datanucleus-enhancer-1.1.2.jar
# deleted: lib/datanucleus-rdbms-1.1.2.LICENSE
# deleted: lib/datanucleus-rdbms-1.1.2.jar
# deleted: lib/jdo2-api-2.3-SNAPSHOT.LICENSE
# deleted: lib/jdo2-api-2.3-SNAPSHOT.jar
# modified: metastore/ivy.xml
# modified: metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
[jira] Updated: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1176: --- Attachment: HIVE-1176-2.patch Updating the patch with latest trunk image. This is necessary since HIVE-1373 updated the eclipse classpath with connection pool libraries which will be outdated with the application of this patch. The updated version of the patch takes care of this problem by updating eclipse classpath to use the updated libraries instead. Tested out launch configuration via eclipse to make sure it is working.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Status: Patch Available (was: Open) Submitting the regenerated patch with latest trunk image. Patch file is HIVE-287-3.patch.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Attachment: HIVE-287-3.patch
[jira] Commented: (HIVE-1139) GroupByOperator sometimes throws OutOfMemory error when there are too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877239#action_12877239 ]

Arvind Prabhakar commented on HIVE-1139:

Ashish - no problem - let me explain:

The problem being addressed by this JIRA is that {{GroupByOperator}} and possibly other aggregation operators use in-memory maps to store intermediate keys, which could lead to an {{OutOfMemoryError}} when the number of such keys is large. The suggested workaround is to use the {{HashMapWrapper}} class, which would alleviate the memory concern since it is capable of spilling the excess data to disk.

The {{HashMapWrapper}}, however, uses Java serialization to write out the excess data. This does not work when the data contains non-serializable objects such as {{Writable}} types - {{Text}} etc. What I have done so far is to modify the {{HashMapWrapper}} to support the full {{java.util.Map}} interface. However, when I tried updating the {{GroupByOperator}} to use it, I ran into the said serialization problem. That is why I was suggesting that perhaps we should decouple the serialization problem from enhancing the {{HashMapWrapper}} and let the latter be checked in independently.

> GroupByOperator sometimes throws OutOfMemory error when there are too many distinct keys
>
> Key: HIVE-1139
> URL: https://issues.apache.org/jira/browse/HIVE-1139
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Ning Zhang
> Assignee: Arvind Prabhakar
>
> When a partial aggregation is performed on a mapper, a HashMap is created to
> keep all distinct keys in main memory. This could lead to an OOM exception when
> there are too many distinct keys for a particular mapper. A workaround is to
> set the map split size smaller so that each mapper takes fewer rows.
> A better solution is to use the persistent HashMapWrapper (currently used in
> CommonJoinOperator) to spill overflow rows to disk.
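The serialization problem described in this comment can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: {{TextLike}} is a hypothetical stand-in for a {{Writable}} type such as {{Text}} (it is not the Hadoop class), and {{canSpill}} is a made-up helper that mimics a spill step by writing a map with plain Java serialization. It does not use or reproduce {{HashMapWrapper}}.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Writable type such as Text: it carries data
// but deliberately does not implement java.io.Serializable.
class TextLike {
    final String value;

    TextLike(String value) {
        this.value = value;
    }
}

class JavaSerializationSpillDemo {
    // Tries to write the map with plain Java serialization, the way a
    // spill-to-disk step might. Returns false when some key or value is
    // not serializable.
    static boolean canSpill(Map<String, Object> rows) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new HashMap<>(rows));
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> plain = new HashMap<>();
        plain.put("k1", 42L);

        Map<String, Object> mixed = new HashMap<>(plain);
        mixed.put("k2", new TextLike("abc"));

        System.out.println("plain map spills: " + canSpill(plain));
        System.out.println("mixed map spills: " + canSpill(mixed));
    }
}
```

A map holding only serializable values goes through, while a single non-serializable value makes the whole write fail, which is why the map interface work and the serialization work can be treated as separate problems.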
[jira] Commented: (HIVE-1139) GroupByOperator sometimes throws OutOfMemory error when there are too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877222#action_12877222 ]

Arvind Prabhakar commented on HIVE-1139:

If there is interest, I can file a separate JIRA for modifying {{HashMapWrapper}} to support the {{java.util.Map}} interface and decouple that work from this JIRA. I think there is a lot of benefit in doing just that. Also, we could have this JIRA depend upon that as a prerequisite.
[jira] Commented: (HIVE-1139) GroupByOperator sometimes throws OutOfMemory error when there are too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876979#action_12876979 ]

Arvind Prabhakar commented on HIVE-1139:

I did some preliminary analysis for this JIRA and converted the {{HashMapWrapper}} to implement the {{java.util.Map}} interface. This required some changes all the way down to the underlying JDBM classes. However, this alone is not sufficient to plug it into the {{GroupByOperator}} implementation because the data stored in the {{HashMap}} is a mix of serializable Java objects and {{Writable}}s. Since {{Writable}}s cannot be written directly with Java serialization, it follows that in order to use this for fixing the memory problem we need _an external serialization_ mechanism that can handle arbitrary mixed-type object graphs.

A trivial approach to address this would be to implement custom serialization using Java reflection, but that would incur the cost of excessive reflection and byte handling/marshaling. If you have any other ideas regarding this, please add them to the comments of this issue for consideration.
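One possible shape for the external serialization mechanism mentioned in this comment is a codec that writes a type tag followed by a type-specific payload, so that no reflection is needed. This is a minimal sketch, not a proposal from the thread: {{TextValue}} is a hypothetical class that only mirrors the write/readFields style of a {{Writable}}, and {{MixedValueCodec}} and its tag values are invented for illustration. It handles just two value types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical Writable-style value: it can write itself to a stream and
// read itself back, but it is not java.io.Serializable.
class TextValue {
    String value = "";

    TextValue() { }

    TextValue(String value) {
        this.value = value;
    }

    void write(DataOutputStream out) throws IOException {
        out.writeUTF(value);
    }

    void readFields(DataInputStream in) throws IOException {
        value = in.readUTF();
    }
}

class MixedValueCodec {
    // Tag values are arbitrary; they only need to match between encode and decode.
    private static final byte TAG_LONG = 1;
    private static final byte TAG_TEXT = 2;

    // Writes one tag byte, then the payload for that type.
    static byte[] encode(Object v) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            if (v instanceof Long) {
                out.writeByte(TAG_LONG);
                out.writeLong((Long) v);
            } else if (v instanceof TextValue) {
                out.writeByte(TAG_TEXT);
                ((TextValue) v).write(out);
            } else {
                throw new IllegalArgumentException("unsupported value: " + v);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the tag byte and dispatches to the matching reader.
    static Object decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte tag = in.readByte();
            switch (tag) {
                case TAG_LONG:
                    return in.readLong();
                case TAG_TEXT:
                    TextValue t = new TextValue();
                    t.readFields(in);
                    return t;
                default:
                    throw new IllegalStateException("unknown tag: " + tag);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is that every supported type needs its own tag and branch, so this approach covers a fixed set of types rather than the arbitrary object graphs the comment asks about.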
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876127#action_12876127 ]

Arvind Prabhakar commented on HIVE-1176:

I think the difference is more likely a bug in the Mac OS X version of {{sed}}. Specifically, it fails to process directives with an escaped tab sequence and instead treats the character as unescaped. For example, the command to replace the first occurrence of *b* in the string *abc* with a tab character *\t* fails as shown below:

{code}
$ echo "abc" | /usr/bin/sed "s@b@\t@"
atc
{code}

Whereas this works fine with the GNU distribution of sed:

{code}
$ echo "abc" | /opt/local/bin/sed "s@b@\t@"
a	c
{code}

> 'create if not exists' fails for a table name with 'select' in it
>
> Key: HIVE-1176
> URL: https://issues.apache.org/jira/browse/HIVE-1176
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Metastore, Query Processor
> Reporter: Prasad Chakka
> Assignee: Arvind Prabhakar
> Attachments: HIVE-1176-1.patch, HIVE-1176.lib-files.tar.gz, HIVE-1176.patch
>
> hive> create table if not exists tmp_select(s string, c string, n int);
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: javax.jdo.JDOUserException JDOQL Single-String query should always start with SELECT)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:441)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:423)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:5538)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5192)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
> at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: MetaException(message:Got exception: javax.jdo.JDOUserException JDOQL Single-String query should always start with SELECT)
> at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:612)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:450)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:439)
> ... 15 more
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875960#action_12875960 ]

Arvind Prabhakar commented on HIVE-1176:

John - I debugged the failures that I was seeing for input20 and input33, and they turn out to be caused by a subtle difference in the way the stream editor {{sed}} works on Mac vs. a regular Linux distribution. I installed the GNU port of {{sed}} and the failures no longer occur. I don't think this is related to the sporadic failures that are reported on Hudson.
[jira] Commented: (HIVE-365) Create Table to support multiple levels of delimiters
[ https://issues.apache.org/jira/browse/HIVE-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875677#action_12875677 ]

Arvind Prabhakar commented on HIVE-365:

For the table "nested" as defined above, a row that contains the following data:

[ [1,2,3],[10,20,30] ], { {foo:{1:1} }, {bar:{2:2} } }

would be represented as:

1 \003 2 \003 3 \002 10 \003 20 \003 30 \001 foo \003 1 \004 1 \002 bar \003 2 \004 2

note: spaces added for readability

> Create Table to support multiple levels of delimiters
>
> Key: HIVE-365
> URL: https://issues.apache.org/jira/browse/HIVE-365
> Project: Hadoop Hive
> Issue Type: New Feature
> Reporter: Zheng Shao
>
> From HIVE-337, the SerDe layer now supports multiple levels of delimiters,
> for the purpose of supporting nested map/array/struct.
> Array (the same as List) and struct consume a single level of separator, and
> Map consumes 2 levels.
> DDL (Create Table) needs to allow users to specify multiple levels of
> delimiters in order to take advantage of this new feature.
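The array column of the example row in this comment can be reproduced with a small recursive join in which each nesting level uses the next control character as its separator. This is a sketch under stated assumptions: {{NestedDelimiterSketch}} and its methods are hypothetical names, it is not the Hive SerDe code, and it covers nested lists only. Map columns are left out because they consume separators differently.

```java
import java.util.Arrays;
import java.util.List;

class NestedDelimiterSketch {
    // Assumption: the separator for nesting level n (1-based) is the control
    // character with code n, so level 1 is \001, level 2 is \002, and so on.
    static char separator(int level) {
        return (char) level;
    }

    // Joins a possibly nested list. Elements of this list are separated by
    // the separator for 'level'; any nested list uses the next level.
    static String encode(Object value, int level) {
        if (!(value instanceof List)) {
            return String.valueOf(value);
        }
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Object element : (List<?>) value) {
            if (!first) {
                sb.append(separator(level));
            }
            sb.append(encode(element, level + 1));
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<List<Integer>> column =
            Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(10, 20, 30));
        // Level 1 separates the columns of a row, so a column value starts at level 2.
        String encoded = encode(column, 2);
        // Show the control characters as \00N escapes for readability.
        System.out.println(encoded.replace("\u0002", " \\002 ").replace("\u0003", " \\003 "));
    }
}
```

Calling {{encode}} with level 2 for a column value matches the example row, where \001 is reserved for separating the two columns.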
[jira] Commented: (HIVE-1139) GroupByOperator sometimes throws OutOfMemory error when there are too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875650#action_12875650 ]

Arvind Prabhakar commented on HIVE-1139:

Soundararajan, Ning - Yes, I am planning to start working on it next week. I expect it to take until at least the middle or end of that week to get a patch ready. However, if that schedule does not work for you, please feel free to take this issue into your queue and go ahead. It would be great if you could confirm either way first.

Arvind
[jira] Updated: (HIVE-802) Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it
[ https://issues.apache.org/jira/browse/HIVE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Prabhakar updated HIVE-802:

Status: Patch Available (was: Open)

A patch for HIVE-1176 has been submitted which addresses this problem by updating the DataNucleus plugin as well as dependent libraries for Hive. Marking this JIRA as patch-submitted.

> Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it
>
> Key: HIVE-802
> URL: https://issues.apache.org/jira/browse/HIVE-802
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Build Infrastructure
> Reporter: Todd Lipcon
> Assignee: Arvind Prabhakar
>
> There's a bug in DataNucleus that causes this issue:
> http://www.jpox.org/servlet/jira/browse/NUCCORE-371
> To reproduce, simply put your hive source tree in a directory that contains a
> '+' character.
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873557#action_12873557 ]

Arvind Prabhakar commented on HIVE-1176:

Updated the patch so that it applies cleanly to the trunk.

*Changes from Previous Patch:*
* This patch uses ivy to download the updated DataNucleus plugins and other dependent libraries. There is no need to use the previously supplied tar.gz anymore.
* At the time the previous patch was written, the enhancer plugin version was 2.0.1, which by default would enable annotation processing. Since then version 2.0.3 has been released, which [disables this behavior|http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-56]. Hence the previously submitted changes to {{javac.args}} in the build.properties file are no longer necessary. This patch uses the updated version of the DataNucleus enhancer plugin.
* The JDO2 API library used by the DataNucleus plugin is distributed through DataNucleus's public Maven repository. This repository has been added to the ivy configuration to automate the download.
* The connection pool libraries have been updated to work with the newer DataNucleus plugins. Library {{commons-dbcp}} has been updated from 1.2.2 to 1.4, {{commons-pool}} from 1.2 to 1.5.4, and {{datanucleus-connectionpool}} from 1.0.2 to 2.0.1.
* As with the previously submitted patch, the {{HiveMetaStoreClient}} implementation has been modified to create deep copies of non-primitive objects returned from the thrift server, in order to avoid semi-populated proxies causing NPEs downstream.

*Testing Done:*
Built and ran all tests. Only two failures were reported, as before - the clientpositive tests for input20.q and input33.q. These tests appear to be failing on the trunk as well.
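The deep-copy change described in this comment follows a general pattern that can be sketched independently of Hive. Everything below is hypothetical: {{TableInfo}} is an invented metadata holder rather than the Thrift-generated class, and {{DeepCopySketch.deepCopy}} is only meant to show the idea of handing callers a fully populated object that shares no mutable state with the one received from the server.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical metadata object; it is not Hive's Thrift-generated Table class.
class TableInfo {
    String name;
    List<String> columns;

    TableInfo(String name, List<String> columns) {
        this.name = name;
        this.columns = columns;
    }

    // Copy constructor: builds a new list so the copy shares no mutable
    // state with the source object.
    TableInfo(TableInfo other) {
        this.name = other.name;
        this.columns = other.columns == null ? null : new ArrayList<>(other.columns);
    }
}

class DeepCopySketch {
    // Returns an independent copy, or null when there is nothing to copy.
    static TableInfo deepCopy(TableInfo source) {
        return source == null ? null : new TableInfo(source);
    }

    public static void main(String[] args) {
        List<String> columns = new ArrayList<>();
        columns.add("s");
        TableInfo original = new TableInfo("tmp_select", columns);
        TableInfo copy = deepCopy(original);
        columns.add("c");
        // The copy keeps the column list as it was when the copy was made.
        System.out.println(copy.columns.size() + " vs " + original.columns.size());
    }
}
```

Because the element type here is an immutable {{String}}, copying the list is enough; a list of mutable objects would need each element copied as well.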
[jira] Updated: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Prabhakar updated HIVE-1176:

Attachment: HIVE-1176-1.patch

This patch replaces the patch submitted before (HIVE-1176.lib-files.tar.gz and HIVE-1176.patch).

{code}
HIVE-1176-1.patch:
# modified:   build.properties
# modified:   build.xml
# modified:   eclipse-templates/.classpath
# modified:   ivy/ivysettings.xml
# deleted:    lib/datanucleus-core-1.1.2.LICENSE
# deleted:    lib/datanucleus-core-1.1.2.jar
# deleted:    lib/datanucleus-enhancer-1.1.2.LICENSE
# deleted:    lib/datanucleus-enhancer-1.1.2.jar
# deleted:    lib/datanucleus-rdbms-1.1.2.LICENSE
# deleted:    lib/datanucleus-rdbms-1.1.2.jar
# deleted:    lib/jdo2-api-2.3-SNAPSHOT.LICENSE
# deleted:    lib/jdo2-api-2.3-SNAPSHOT.jar
# modified:   metastore/ivy.xml
# modified:   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
{code}
[jira] Assigned: (HIVE-80) Allow Hive Server to run multiple queries simulteneously
[ https://issues.apache.org/jira/browse/HIVE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Prabhakar reassigned HIVE-80:

Assignee: Arvind Prabhakar (was: Neil Conway)

> Allow Hive Server to run multiple queries simulteneously
>
> Key: HIVE-80
> URL: https://issues.apache.org/jira/browse/HIVE-80
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Server Infrastructure
> Reporter: Raghotham Murthy
> Assignee: Arvind Prabhakar
> Priority: Critical
> Attachments: hive_input_format_race-2.patch
>
> Can use one driver object per query.
[jira] Commented: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872296#action_12872296 ]

Arvind Prabhakar commented on HIVE-1198:

Ning - I have attached an updated patch (hive-1198-2.patch). The key difference in this patch is that it does not activate checkstyle by default. If you import the hive project and wish to activate checkstyle, you will have to right-click on the project and select Checkstyle > Activate Checkstyle from the context menu. So in case checkstyle is causing problems on your workbench, you can choose not to activate it. The steps to activate the checkstyle plugin in Eclipse are also documented in the README.txt file, right below the section on setting up Eclipse.

Can you give this patch a try and see if it resolves the problem you were facing?

> When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
>
> Key: HIVE-1198
> URL: https://issues.apache.org/jira/browse/HIVE-1198
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Build Infrastructure
> Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin 5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010)
> Reporter: Arvind Prabhakar
> Assignee: Arvind Prabhakar
> Priority: Minor
> Attachments: HIVE-1198-1.patch, HIVE-1198-2.patch, HIVE-1198.patch
>
> As of now, checkstyle plugin reports all problems as errors. This causes an
> overwhelming number of errors to show up (3000+) which masks real errors that
> might be there. Since all the checkstyle violations are not going to be fixed
> in one shot, it is desirable to lower the severity of checkstyle violations
> to warnings so that the plugin can be kept enabled. This will encourage
> developers to spot checkstyle violations in the files they touch and
> potentially fix them as they go along, along with pointing out violations as
> they code.
[jira] Updated: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Prabhakar updated HIVE-1198:

Attachment: HIVE-1198-2.patch
[jira] Commented: (HIVE-80) Allow Hive Server to run multiple queries simulteneously
[ https://issues.apache.org/jira/browse/HIVE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872282#action_12872282 ]

Arvind Prabhakar commented on HIVE-80:

This sounds like a good plan. If Neil is not actively working on this issue, I can move this to my queue and start working on it.

> Allow Hive Server to run multiple queries simulteneously
>
> Key: HIVE-80
> URL: https://issues.apache.org/jira/browse/HIVE-80
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Server Infrastructure
> Reporter: Raghotham Murthy
> Assignee: Neil Conway
> Priority: Critical
> Attachments: hive_input_format_race-2.patch
>
> Can use one driver object per query.
[jira] Assigned: (HIVE-802) Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it
[ https://issues.apache.org/jira/browse/HIVE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arvind Prabhakar reassigned HIVE-802:

Assignee: Arvind Prabhakar
[jira] Commented: (HIVE-802) Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it
[ https://issues.apache.org/jira/browse/HIVE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872029#action_12872029 ]

Arvind Prabhakar commented on HIVE-802:

The patch submitted for HIVE-1176 upgrades the DataNucleus plugin to the latest stable version, which does have a fix for this issue.

> Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it
>
> Key: HIVE-802
> URL: https://issues.apache.org/jira/browse/HIVE-802
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Build Infrastructure
> Reporter: Todd Lipcon
>
> There's a bug in DataNucleus that causes this issue:
> http://www.jpox.org/servlet/jira/browse/NUCCORE-371
> To reproduce, simply put your hive source tree in a directory that contains a
> '+' character.
[jira] Commented: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871968#action_12871968 ]

Arvind Prabhakar commented on HIVE-1198:

I just installed a freshly downloaded Eclipse on Ubuntu desktop 9.10, with Java 1.6.0_20 and checkstyle 5.1. The Eclipse build is 20100218-1602 (the latest Galileo SR2 build). I was able to import the project in under 45 seconds.

Since you are using version 3.6 of Eclipse, which is not yet released, perhaps that is why you are seeing this problem. Can you try reproducing this issue with a stable release of Eclipse such as Galileo SR2?

> When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
>
> Key: HIVE-1198
> URL: https://issues.apache.org/jira/browse/HIVE-1198
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Build Infrastructure
> Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin 5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010)
> Reporter: Arvind Prabhakar
> Assignee: Arvind Prabhakar
> Priority: Minor
> Attachments: HIVE-1198-1.patch, HIVE-1198.patch
>
> As of now, checkstyle plugin reports all problems as errors. This causes an
> overwhelming number of errors to show up (3000+) which masks real errors that
> might be there. Since all the checkstyle violations are not going to be fixed
> in one shot, it is desirable to lower the severity of checkstyle violations
> to warnings so that the plugin can be kept enabled. This will encourage
> developers to spot checkstyle violations in the files they touch and
> potentially fix them as they go along, along with pointing out violations as
> they code.
[jira] Commented: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871952#action_12871952 ] Arvind Prabhakar commented on HIVE-1198: I will try to reproduce this on a linux box and note any findings in the comments. > When checkstyle is activated for Hive in Eclipse environment, it shows all > checkstyle problems as errors. > - > > Key: HIVE-1198 > URL: https://issues.apache.org/jira/browse/HIVE-1198 > Project: Hadoop Hive > Issue Type: Improvement > Components: Build Infrastructure > Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin > 5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010) >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar >Priority: Minor > Attachments: HIVE-1198-1.patch, HIVE-1198.patch > > > As of now, checkstyle plugin reports all problems as errors. This causes an > overwhelming number of errors to show up (3000+) which masks real errors that > might be there. Since all the checkstyle violations are not going to be fixed > in one shot, it is desirable to lower the severity of checkstyle violations > to warnings so that the plugin can be kept enabled. This will encourage > developers to spot checkstyle violations in the files they touch and > potentially fix them as they go along, along with pointing out violations as > they code. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871882#action_12871882 ] Arvind Prabhakar commented on HIVE-1198: I do not see any slowdown, Ning. I tested it just now and the project imports and builds in under 40 seconds. Did you run ant package, model-jar and gen-test before importing the project into Eclipse? Without doing that, Eclipse will not find the necessary classpath entries, and that could lead to a slowdown. > When checkstyle is activated for Hive in Eclipse environment, it shows all > checkstyle problems as errors. > - > > Key: HIVE-1198 > URL: https://issues.apache.org/jira/browse/HIVE-1198 > Project: Hadoop Hive > Issue Type: Improvement > Components: Build Infrastructure > Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin > 5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010) >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar >Priority: Minor > Attachments: HIVE-1198-1.patch, HIVE-1198.patch > > > As of now, checkstyle plugin reports all problems as errors. This causes an > overwhelming number of errors to show up (3000+) which masks real errors that > might be there. Since all the checkstyle violations are not going to be fixed > in one shot, it is desirable to lower the severity of checkstyle violations > to warnings so that the plugin can be kept enabled. This will encourage > developers to spot checkstyle violations in the files they touch and > potentially fix them as they go along, along with pointing out violations as > they code. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871830#action_12871830 ] Arvind Prabhakar commented on HIVE-1198: Updated the patch so that it cleanly applies to the trunk. It will be great to have this patch committed as it really helps in using eclipse effectively. > When checkstyle is activated for Hive in Eclipse environment, it shows all > checkstyle problems as errors. > - > > Key: HIVE-1198 > URL: https://issues.apache.org/jira/browse/HIVE-1198 > Project: Hadoop Hive > Issue Type: Improvement > Components: Build Infrastructure > Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin > 5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010) >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar >Priority: Minor > Attachments: HIVE-1198-1.patch, HIVE-1198.patch > > > As of now, checkstyle plugin reports all problems as errors. This causes an > overwhelming number of errors to show up (3000+) which masks real errors that > might be there. Since all the checkstyle violations are not going to be fixed > in one shot, it is desirable to lower the severity of checkstyle violations > to warnings so that the plugin can be kept enabled. This will encourage > developers to spot checkstyle violations in the files they touch and > potentially fix them as they go along, along with pointing out violations as > they code. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1198: --- Attachment: HIVE-1198-1.patch > When checkstyle is activated for Hive in Eclipse environment, it shows all > checkstyle problems as errors. > - > > Key: HIVE-1198 > URL: https://issues.apache.org/jira/browse/HIVE-1198 > Project: Hadoop Hive > Issue Type: Improvement > Components: Build Infrastructure > Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin > 5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010) >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar >Priority: Minor > Attachments: HIVE-1198-1.patch, HIVE-1198.patch > > > As of now, checkstyle plugin reports all problems as errors. This causes an > overwhelming number of errors to show up (3000+) which masks real errors that > might be there. Since all the checkstyle violations are not going to be fixed > in one shot, it is desirable to lower the severity of checkstyle violations > to warnings so that the plugin can be kept enabled. This will encourage > developers to spot checkstyle violations in the files they touch and > potentially fix them as they go along, along with pointing out violations as > they code. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871494#action_12871494 ] Arvind Prabhakar commented on HIVE-287: --- Modified the implementation as per review feedback - Introduced an abstract base class and reverted the resolvers to extend that instead of directly implementing the new functionality. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-287-1.patch, HIVE-287-2.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Status: Patch Available (was: Open) > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-287-1.patch, HIVE-287-2.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Attachment: HIVE-287-2.patch > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-287-1.patch, HIVE-287-2.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1179) Add UDF array_contains
[ https://issues.apache.org/jira/browse/HIVE-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871331#action_12871331 ] Arvind Prabhakar commented on HIVE-1179: I took a quick look at the various Operator implementations and found that the ones that store evaluated expression results end up creating copies anyway, via {{ObjectInspectorUtils.copyToStandardObject()}}. So although the system appears to work normally when the object instance is reused at the UDF level, code elsewhere in the system is forced to do the defensive copying. To clarify, my concern is not about a problem that currently exists, but about the potential problems that could occur from not making defensive copies of mutable objects. If you are certain that this does not apply to the Hive implementation, then the updated patch should be fine for pushing in. > Add UDF array_contains > -- > > Key: HIVE-1179 > URL: https://issues.apache.org/jira/browse/HIVE-1179 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Zheng Shao >Assignee: Arvind Prabhakar > Attachments: HIVE-1179-1.patch, HIVE-1179-2.patch, HIVE-1179-3.patch, > HIVE-1179.patch > > > Returns true or false, depending on whether an element is in an array. > {{array_contains(T element, array theArray)}} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1179) Add UDF array_contains
[ https://issues.apache.org/jira/browse/HIVE-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1179: --- Status: Patch Available (was: Open) Paul - I have updated the patch. I do not have any examples of queries that will produce this failure today as no UDAF that can be applied to boolean input today does batch processing. My concern was primarily for creating defensive objects to guard against inadvertent mutation. > Add UDF array_contains > -- > > Key: HIVE-1179 > URL: https://issues.apache.org/jira/browse/HIVE-1179 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Zheng Shao >Assignee: Arvind Prabhakar > Attachments: HIVE-1179-1.patch, HIVE-1179-2.patch, HIVE-1179-3.patch, > HIVE-1179.patch > > > Returns true or false, depending on whether an element is in an array. > {{array_contains(T element, array theArray)}} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1179) Add UDF array_contains
[ https://issues.apache.org/jira/browse/HIVE-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1179: --- Attachment: HIVE-1179-3.patch > Add UDF array_contains > -- > > Key: HIVE-1179 > URL: https://issues.apache.org/jira/browse/HIVE-1179 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Zheng Shao >Assignee: Arvind Prabhakar > Attachments: HIVE-1179-1.patch, HIVE-1179-2.patch, HIVE-1179-3.patch, > HIVE-1179.patch > > > Returns true or false, depending on whether an element is in an array. > {{array_contains(T element, array theArray)}} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870860#action_12870860 ] Arvind Prabhakar commented on HIVE-287: ---
Thanks for taking a look at this patch, Namit. I have some questions and clarifications regarding your feedback:

bq. 1. This should be independent of COUNT - so, all basically all aggregation functions should be supported with DISTINCT. For eg: select avg(distinct c1,c2) from T

I am not sure how this relates to the change I made. Even before this change, the DISTINCT qualifier was allowed for any function invocation. Can you elaborate on what you mean by this? Specifically, which part of the patch needs to change in order to accommodate this request?

bq. 2. It would be a good idea to maintain some compatibility for the existing interface - so, can we add another method to UDAFResolver, which has the new API - and a common class which invokes the default implementation, that would be better.

Here is how I understand your suggestion: Add a new method to the GenericUDAFResolver interface while maintaining the old method. Create an abstract base class that implements the new interface method and invokes the old method by dropping the isDistinct/isAllColumn arguments. Extend the current resolvers to override this method. Will this address your concern? If not, can you provide a concrete example?

bq. 3. Follows from 1 - more tests are needed

Are you suggesting more tests for array_contains UDF or to add more tests for other UDFs? Please clarify with examples if possible.
> count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-287-1.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
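The compatibility approach discussed in the comment above (a new resolver method, plus an abstract base class that falls back to the old one) could be sketched roughly as follows. This is only an illustration under stated assumptions: `LegacyResolver`, `ExtendedResolver`, `AbstractResolver`, `SumResolver` and `CountResolver` are hypothetical stand-ins and not the actual Hive interfaces, and the "evaluator" is reduced to a descriptive string so the example has no Hive dependency.

```java
import java.util.Arrays;

// Hypothetical stand-in for the existing resolver interface (NOT the Hive API).
interface LegacyResolver {
    // Old-style entry point: only sees the parameter types.
    String getEvaluator(String[] paramTypes);
}

// Hypothetical extended interface that keeps the old method and adds a new one.
interface ExtendedResolver extends LegacyResolver {
    // New-style entry point: also told whether DISTINCT or "*" was used.
    String getEvaluator(String[] paramTypes, boolean isDistinct, boolean isAllColumns);
}

// Base class: implements the new method by dropping the extra flags
// and delegating to the old method.
abstract class AbstractResolver implements ExtendedResolver {
    @Override
    public String getEvaluator(String[] paramTypes, boolean isDistinct, boolean isAllColumns) {
        return getEvaluator(paramTypes);
    }
}

// An existing resolver only needs to extend the base class; its logic is unchanged.
class SumResolver extends AbstractResolver {
    @Override
    public String getEvaluator(String[] paramTypes) {
        return "sum" + Arrays.toString(paramTypes);
    }
}

// A resolver that cares about the new flags overrides the new method as well.
class CountResolver extends AbstractResolver {
    @Override
    public String getEvaluator(String[] paramTypes) {
        return "count" + Arrays.toString(paramTypes);
    }

    @Override
    public String getEvaluator(String[] paramTypes, boolean isDistinct, boolean isAllColumns) {
        if (isAllColumns) {
            return "count(*)";
        }
        return (isDistinct ? "count distinct" : "count") + Arrays.toString(paramTypes);
    }
}

class ResolverSketch {
    public static void main(String[] args) {
        ExtendedResolver sum = new SumResolver();
        ExtendedResolver count = new CountResolver();
        // The flags are ignored by SumResolver, which falls back to the old method.
        System.out.println(sum.getEvaluator(new String[] {"int"}, true, false));
        System.out.println(count.getEvaluator(new String[0], false, true));
        System.out.println(count.getEvaluator(new String[] {"int", "string"}, true, false));
    }
}
```

With this shape, callers can always use the new method, while resolvers written against the old method keep working without modification.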
[jira] Commented: (HIVE-1179) Add UDF array_contains
[ https://issues.apache.org/jira/browse/HIVE-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870475#action_12870475 ] Arvind Prabhakar commented on HIVE-1179: bq. One minor point - can you make result a member variable of GenericUDFArrayContains? This will reduce object creation. While this will reduce object creation, it will cause correctness problems when this UDF is used in an aggregate operation. Using a member variable for {{result}} would then mean that all values of aggregated output will always reflect the evaluated value of the last row. A similar problem would occur if there is a lag between collecting and processing of output values. Hence my preference is to keep the implementation as is (stateless). If you still would like to make it a member variable, please let me know and I can make that change. > Add UDF array_contains > -- > > Key: HIVE-1179 > URL: https://issues.apache.org/jira/browse/HIVE-1179 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Zheng Shao >Assignee: Arvind Prabhakar > Attachments: HIVE-1179-1.patch, HIVE-1179-2.patch, HIVE-1179.patch > > > Returns true or false, depending on whether an element is in an array. > {{array_contains(T element, array theArray)}} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
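The concern raised in the comment above, that a reused member-variable result can be overwritten before a consumer reads it, can be illustrated with a small self-contained sketch. It does not use the Hive UDF API: `MutableBool`, `ReusingContains` and `StatelessContains` are hypothetical names, and the "consumer" is simply a loop that buffers returned objects without copying them, which is an assumption about how a lag between collecting and processing might look.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal mutable holder, standing in for a writable boolean wrapper.
class MutableBool {
    boolean value;

    MutableBool(boolean value) {
        this.value = value;
    }
}

// Variant 1: reuses a single member-variable result across calls.
class ReusingContains {
    private final MutableBool result = new MutableBool(false);

    MutableBool evaluate(List<Integer> array, int element) {
        result.value = array.contains(element);
        return result; // the same instance is returned every time
    }
}

// Variant 2: stateless, allocates a fresh result per call.
class StatelessContains {
    MutableBool evaluate(List<Integer> array, int element) {
        return new MutableBool(array.contains(element));
    }
}

class ResultReuseSketch {
    // Buffers the returned objects for all rows first and reads them afterwards.
    static List<Boolean> collectReusing(List<List<Integer>> rows, int element) {
        ReusingContains udf = new ReusingContains();
        List<MutableBool> buffered = new ArrayList<>();
        for (List<Integer> row : rows) {
            buffered.add(udf.evaluate(row, element));
        }
        List<Boolean> out = new ArrayList<>();
        for (MutableBool b : buffered) {
            out.add(b.value);
        }
        return out;
    }

    static List<Boolean> collectStateless(List<List<Integer>> rows, int element) {
        StatelessContains udf = new StatelessContains();
        List<MutableBool> buffered = new ArrayList<>();
        for (List<Integer> row : rows) {
            buffered.add(udf.evaluate(row, element));
        }
        List<Boolean> out = new ArrayList<>();
        for (MutableBool b : buffered) {
            out.add(b.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> rows =
            Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5));
        // Every buffered entry aliases one object, so all entries show the last row's value.
        System.out.println(collectReusing(rows, 1));   // [false, false, false]
        // Each row keeps its own result.
        System.out.println(collectStateless(rows, 1)); // [true, false, false]
    }
}
```

The trade-off is the one debated in the thread: the reusing variant avoids allocations but is only safe if every consumer copies the value before the next call.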
[jira] Updated: (HIVE-1179) Add UDF array_contains
[ https://issues.apache.org/jira/browse/HIVE-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1179: --- Attachment: HIVE-1179-2.patch Updated the patch to work with current trunk. > Add UDF array_contains > -- > > Key: HIVE-1179 > URL: https://issues.apache.org/jira/browse/HIVE-1179 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Zheng Shao >Assignee: Arvind Prabhakar > Attachments: HIVE-1179-1.patch, HIVE-1179-2.patch, HIVE-1179.patch > > > Returns true or false, depending on whether an element is in an array. > {{array_contains(T element, array theArray)}} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1198) When checkstyle is activated for Hive in Eclipse environment, it shows all checkstyle problems as errors.
[ https://issues.apache.org/jira/browse/HIVE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar reassigned HIVE-1198: -- Assignee: Arvind Prabhakar > When checkstyle is activated for Hive in Eclipse environment, it shows all > checkstyle problems as errors. > - > > Key: HIVE-1198 > URL: https://issues.apache.org/jira/browse/HIVE-1198 > Project: Hadoop Hive > Issue Type: Improvement > Components: Build Infrastructure > Environment: Mac OS X (10.6.2), Eclipse 3.5.1.R35, Checkstyle Plugin > 5.1.0.201002232103 (latest eclipse and checkstyle build as of 02/2010) >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar >Priority: Minor > Attachments: HIVE-1198.patch > > > As of now, checkstyle plugin reports all problems as errors. This causes an > overwhelming number of errors to show up (3000+) which masks real errors that > might be there. Since all the checkstyle violations are not going to be fixed > in one shot, it is desirable to lower the severity of checkstyle violations > to warnings so that the plugin can be kept enabled. This will encourage > developers to spot checkstyle violations in the files they touch and > potentially fix them as they go along, along with pointing out violations as > they code. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Status: Patch Available (was: Open) Fix Version/s: 0.6.0

*Summary*
This patch fixes the {{count()}} aggregate function to be consistent with SQL. Specifically:
* Provides support for {{SELECT count(*) FROM table}} queries, which return the total number of rows in the table.
* Extends {{count()}} to accept a list of multiple expressions: {{count(DISTINCT expr1, expr2, ...)}} returns the number of rows for which the evaluated expressions are non-NULL and distinct.

*Details*
* Modified the HiveQL grammar to allow function invocation with a single * in place of the parameter list.
* Propagated the presence of * as a parameter, or of the {{DISTINCT}} keyword, through the UDF resolver framework so that UDFs that behave differently in these cases can use it.
* Modified the {{count()}} UDAF to handle NULL values with the same semantics as regular SQL.
* Added a test case that specifically exercises the newly introduced semantics of the count UDAF.

*Testing*
Ran all tests. Noted only two failures (input20.q, input33.q), both of which also fail on the local trunk image.

If and when this patch is committed to the trunk, I will update the Hive Wiki with details and examples of using the {{count()}} UDAF in its various forms.

> count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-287-1.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
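The {{count()}} semantics summarized in the patch description above can be modelled in a few lines of plain Java. This is a sketch of the intended behaviour, not the UDAF code from the patch: `CountSemanticsSketch` and its methods are hypothetical names, and it assumes that a row is skipped by the DISTINCT form when any of its expressions evaluates to NULL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CountSemanticsSketch {
    // count(*): every row counts, regardless of NULLs.
    static long countStar(List<Object[]> rows) {
        return rows.size();
    }

    // count(DISTINCT e1, e2, ...): counts distinct tuples, skipping any
    // tuple that contains a NULL (assumption stated in the lead-in).
    static long countDistinct(List<Object[]> rows) {
        Set<List<Object>> seen = new HashSet<>();
        for (Object[] row : rows) {
            boolean hasNull = false;
            for (Object value : row) {
                if (value == null) {
                    hasNull = true;
                    break;
                }
            }
            if (!hasNull) {
                // List equality is element-wise, so duplicate tuples collapse.
                seen.add(Arrays.asList(row));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"a", 1});
        rows.add(new Object[] {"a", 1});    // duplicate of the first row
        rows.add(new Object[] {"a", 2});
        rows.add(new Object[] {null, 3});   // skipped by DISTINCT: NULL in col1
        rows.add(new Object[] {"b", null}); // skipped by DISTINCT: NULL in col2
        System.out.println(countStar(rows));     // 5
        System.out.println(countDistinct(rows)); // 2
    }
}
```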
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Attachment: HIVE-287-1.patch > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1139) GroupByOperator sometimes throws OutOfMemory error when there are too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar reassigned HIVE-1139: -- Assignee: Arvind Prabhakar (was: Ning Zhang) > GroupByOperator sometimes throws OutOfMemory error when there are too many > distinct keys > > > Key: HIVE-1139 > URL: https://issues.apache.org/jira/browse/HIVE-1139 > Project: Hadoop Hive > Issue Type: Bug >Reporter: Ning Zhang >Assignee: Arvind Prabhakar > > When a partial aggregation is performed on a mapper, a HashMap is created to > keep all distinct keys in main memory. This could lead to an OOM exception when > there are too many distinct keys for a particular mapper. A workaround is to > set the map split size smaller so that each mapper takes fewer rows. > A better solution is to use the persistent HashMapWrapper (currently used in > CommonJoinOperator) to spill overflow rows to disk. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar reassigned HIVE-287: - Assignee: Arvind Prabhakar > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1029) typedbytes does not support nulls
[ https://issues.apache.org/jira/browse/HIVE-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1029: --- Status: Patch Available (was: Open) This patch adds support for NULL types in TypedBytesSerDe. > typedbytes does not support nulls > - > > Key: HIVE-1029 > URL: https://issues.apache.org/jira/browse/HIVE-1029 > Project: Hadoop Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1029.patch > > > typedbytes does not support nulls -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1029) typedbytes does not support nulls
[ https://issues.apache.org/jira/browse/HIVE-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1029: --- Attachment: HIVE-1029.patch > typedbytes does not support nulls > - > > Key: HIVE-1029 > URL: https://issues.apache.org/jira/browse/HIVE-1029 > Project: Hadoop Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1029.patch > > > typedbytes does not support nulls -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-1029) typedbytes does not support nulls
[ https://issues.apache.org/jira/browse/HIVE-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar reassigned HIVE-1029: -- Assignee: Arvind Prabhakar > typedbytes does not support nulls > - > > Key: HIVE-1029 > URL: https://issues.apache.org/jira/browse/HIVE-1029 > Project: Hadoop Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > > typedbytes does not support nulls -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1345) TypedBytesSerDe fails to create table with multiple columns.
[ https://issues.apache.org/jira/browse/HIVE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1345: --- Status: Patch Available (was: Open) The problem was due to incorrect parsing of the {{columnTypeProperty}} during the initialization of {{TypedBytesSerDe}}. This patch fixes the problem by delegating the parsing logic to the standard routine used by other SerDes - {{TypeInfoUtils.getTypeInfosFromTypeString()}}. Also included in this patch is a test case that exercises this change and validates that multi-column tables can be created when using this SerDe. > TypedBytesSerDe fails to create table with multiple columns. > > > Key: HIVE-1345 > URL: https://issues.apache.org/jira/browse/HIVE-1345 > Project: Hadoop Hive > Issue Type: Bug > Components: Contrib >Affects Versions: 0.5.0 > Environment: JDK 6 (1.6.0_17) on Mac OSX 10.6.3, Hadoop 0.20.2, Hive > 0.5.0 >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1345-1.patch > > > Creating a table with more than one columns fails when the row format SerDe > is TypedBytesSerDe. > {code} > hive> CREATE TABLE test (a STRING, b STRING) ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'; > Found class for org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe > > FAILED: Error in metadata: java.lang.IndexOutOfBoundsException: Index: 1, > Size: 1 > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > > hive> > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
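One way a type-string parsing mistake can produce the reported {{Index: 1, Size: 1}} failure is sketched below. The details are assumptions rather than a description of the actual patch: the sketch supposes that the column-type property arrives colon-delimited (for example {{string:string}}) and that the faulty code split it on commas. `TypeStringSketch`, `naiveSplit` and `tolerantSplit` are hypothetical helpers; `tolerantSplit` is only a stand-in for a shared parsing routine and handles primitive type names only.

```java
import java.util.Arrays;
import java.util.List;

class TypeStringSketch {
    // Assumed faulty behaviour: splits only on commas.
    static List<String> naiveSplit(String columnTypes) {
        return Arrays.asList(columnTypes.split(","));
    }

    // Stand-in for a shared routine: accepts ':', ',' or ';' as separators.
    // Primitive type names only; nested types such as map<string,int> are out of scope.
    static List<String> tolerantSplit(String columnTypes) {
        return Arrays.asList(columnTypes.split("[:,;]"));
    }

    // Looks up the type of the column at the given position.
    static String typeOfColumn(List<String> types, int columnIndex) {
        return types.get(columnIndex);
    }

    public static void main(String[] args) {
        String columnTypes = "string:string"; // assumed property value for (a STRING, b STRING)

        List<String> good = tolerantSplit(columnTypes);
        System.out.println(typeOfColumn(good, 1)); // string

        List<String> bad = naiveSplit(columnTypes); // a single entry: "string:string"
        try {
            typeOfColumn(bad, 1);
        } catch (IndexOutOfBoundsException e) {
            // The second column has no matching type entry.
            System.out.println("lookup failed for column 1 of " + bad.size() + " parsed type(s)");
        }
    }
}
```

Delegating to one shared parser, as the patch description says it does, avoids each SerDe carrying its own copy of this kind of splitting logic.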
[jira] Updated: (HIVE-1345) TypedBytesSerDe fails to create table with multiple columns.
[ https://issues.apache.org/jira/browse/HIVE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1345: --- Attachment: HIVE-1345-1.patch > TypedBytesSerDe fails to create table with multiple columns. > > > Key: HIVE-1345 > URL: https://issues.apache.org/jira/browse/HIVE-1345 > Project: Hadoop Hive > Issue Type: Bug > Components: Contrib >Affects Versions: 0.5.0 > Environment: JDK 6 (1.6.0_17) on Mac OSX 10.6.3, Hadoop 0.20.2, Hive > 0.5.0 >Reporter: Arvind Prabhakar >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1345-1.patch > > > Creating a table with more than one columns fails when the row format SerDe > is TypedBytesSerDe. > {code} > hive> CREATE TABLE test (a STRING, b STRING) ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'; > Found class for org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe > > FAILED: Error in metadata: java.lang.IndexOutOfBoundsException: Index: 1, > Size: 1 > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > > hive> > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1345) TypedBytesSerDe fails to create table with multiple columns.
TypedBytesSerDe fails to create table with multiple columns.

Key: HIVE-1345
URL: https://issues.apache.org/jira/browse/HIVE-1345
Project: Hadoop Hive
Issue Type: Bug
Components: Contrib
Affects Versions: 0.5.0
Environment: JDK 6 (1.6.0_17) on Mac OSX 10.6.3, Hadoop 0.20.2, Hive 0.5.0
Reporter: Arvind Prabhakar
Assignee: Arvind Prabhakar
Fix For: 0.6.0

Creating a table with more than one column fails when the row format SerDe is TypedBytesSerDe.
{code}
hive> CREATE TABLE test (a STRING, b STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe';
Found class for org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe
FAILED: Error in metadata: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
hive>
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-80) Allow Hive Server to run multiple queries simulteneously
[ https://issues.apache.org/jira/browse/HIVE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867693#action_12867693 ] Arvind Prabhakar commented on HIVE-80: --
I wanted to fix this JIRA and so started looking at it. From what I have observed, it appears that the {{HiveServer}} *is* multi-thread capable. Specifically:
* The {{HiveServer}} uses a {{TThreadPoolServer}}, which is multi-threaded.
* The {{ThriftHiveProcessorFactory}} overrides the {{getProcessor()}} call and returns a new instance of {{HiveServerHandler}} on every invocation.
* Every instance of {{HiveServerHandler}} has its own thread-local session state and a private driver instance.
* Query execution is thread safe thanks to HIVE-77.

Given the above, I believe that this JIRA should be marked closed and resolved. If you think I missed something in my analysis, can you please point that out?

> Allow Hive Server to run multiple queries simulteneously > > > Key: HIVE-80 > URL: https://issues.apache.org/jira/browse/HIVE-80 > Project: Hadoop Hive > Issue Type: Improvement > Components: Server Infrastructure >Reporter: Raghotham Murthy >Assignee: Neil Conway >Priority: Critical > Attachments: hive_input_format_race-2.patch > > > Can use one driver object per query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
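The pattern described in the comment above, a thread pool in which each request gets a freshly created handler with private state, can be sketched without any Thrift or Hive dependency. All names here (`QueryHandler`, `HandlerFactory`, `PerConnectionHandlerSketch`) are hypothetical, and the per-handler `history` list merely stands in for a private driver instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Handler with private, unshared state (stand-in for a per-handler driver).
class QueryHandler {
    private final List<String> history = new ArrayList<>();

    String execute(String query) {
        history.add(query);
        return "ran: " + query + " (#" + history.size() + ")";
    }
}

// Factory that returns a new handler on every invocation.
class HandlerFactory {
    QueryHandler getHandler() {
        return new QueryHandler();
    }
}

class PerConnectionHandlerSketch {
    // Runs each query on a pool thread, each with its own handler instance.
    static List<String> serve(List<String> queries) {
        final HandlerFactory factory = new HandlerFactory();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (final String query : queries) {
                Callable<String> task = () -> factory.getHandler().execute(query);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // preserves submission order
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Each result ends in "(#1)" because no handler state is shared between queries.
        System.out.println(serve(Arrays.asList("q1", "q2", "q3")));
    }
}
```

Because no handler is ever shared between threads, the handlers themselves need no synchronization; only components shared across handlers would need to be thread safe.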
[jira] Updated: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1176: --- Status: Patch Available (was: Open) Assignee: Arvind Prabhakar

This problem is due to [a bug in the DataNucleus JDOQL implementation|http://www.jpox.org/servlet/jira/browse/NUCCORE-427], which has been fixed in version 2.0.x. The fix is therefore to upgrade the DataNucleus plugins to the latest stable release.

*Details:*
- Replaced the old DataNucleus plugins, version 1.1.2, with the latest stable release.
- Updated the jdo2-api library to the version required by DataNucleus, 2.3-ec (Apache licensed), available from the DataNucleus Maven repository at http://www.datanucleus.org/downloads/maven2/javax/jdo/jdo2-api/2.3-ec/
- Modified the build files to suppress auto-enhancement of all compiled classes, a new feature introduced in the latest version.
- Modified the HiveMetaStoreClient implementation to create deep copies of non-primitive objects returned from the Thrift server. Without this change, some collections were being fetched as semi-populated proxies with missing session context, leading to NPEs.

*Testing Done:*
Built and ran all tests. Only two failures were reported: the clientpositive tests for input20.q and input33.q. These tests appear to be failing on the trunk as well.
> 'create if not exists' fails for a table name with 'select' in it > - > > Key: HIVE-1176 > URL: https://issues.apache.org/jira/browse/HIVE-1176 > Project: Hadoop Hive > Issue Type: Bug > Components: Metastore, Query Processor >Reporter: Prasad Chakka >Assignee: Arvind Prabhakar > Attachments: HIVE-1176.lib-files.tar.gz, HIVE-1176.patch > > > hive> create table if not exists tmp_select(s string, c string, n int); > org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got > exception: javax.jdo.JDOUserException JDOQL Single-String query should always > start with SELECT) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:441) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:423) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:5538) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5192) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275) > at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:156) > Caused by: MetaException(message:Got exception: javax.jdo.JDOUserException > JDOQL Single-String query should always start with SELECT) > at > 
org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:612) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:450) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:439) > ... 15 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
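The HIVE-1176 update above mentions creating deep copies of non-primitive objects returned from the thrift server, so that callers never hold proxy-backed collections. A minimal sketch of that idea, assuming a hypothetical {{FieldInfo}} value class (this is not the metastore API and not the code in the attached patch), might look like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical metadata object used only for illustration; NOT a Hive metastore class.
final class FieldInfo {
    String name;
    String type;

    FieldInfo(String name, String type) {
        this.name = name;
        this.type = type;
    }

    // Copy constructor: the new object shares no mutable state with the source.
    FieldInfo(FieldInfo other) {
        this(other.name, other.type);
    }
}

final class DeepCopySketch {
    // Returns a plain ArrayList holding copies of every element, so the caller
    // does not depend on whatever list implementation the source happened to be.
    static List<FieldInfo> deepCopy(List<FieldInfo> source) {
        if (source == null) {
            return null;
        }
        List<FieldInfo> copy = new ArrayList<>(source.size());
        for (FieldInfo f : source) {
            copy.add(f == null ? null : new FieldInfo(f));
        }
        return copy;
    }
}
```

Iterating the source eagerly is the point of the sketch: every element is read while the source is still usable, and the result is detached from it afterwards.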
[jira] Updated: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1176: --- Attachment: HIVE-1176.patch HIVE-1176.lib-files.tar.gz > 'create if not exists' fails for a table name with 'select' in it > - > > Key: HIVE-1176 > URL: https://issues.apache.org/jira/browse/HIVE-1176 > Project: Hadoop Hive > Issue Type: Bug > Components: Metastore, Query Processor >Reporter: Prasad Chakka > Attachments: HIVE-1176.lib-files.tar.gz, HIVE-1176.patch > > > hive> create table if not exists tmp_select(s string, c string, n int); > org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got > exception: javax.jdo.JDOUserException JDOQL Single-String query should always > start with SELECT) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:441) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:423) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:5538) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5192) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275) > at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:156) 
> Caused by: MetaException(message:Got exception: javax.jdo.JDOUserException > JDOQL Single-String query should always start with SELECT) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:612) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:450) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:439) > ... 15 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1287) Struct datatype should not use field names for type equivalence.
[ https://issues.apache.org/jira/browse/HIVE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851166#action_12851166 ] Arvind Prabhakar commented on HIVE-1287: I think I understand your point of view. Let me explain mine: Right now there is no consistent type checking. What we have is implicit type conversion where possible - such as converting a struct to string but not the other way around. In other places this implicit type conversion leads to an internal error. In the case of struct-to-struct conversion, however, the check is rigidly tied to the field names. This is not consistent. My suggestion is to provide type equivalence semantics within the query language framework. Doing this will help in the following ways: - Implicit type conversion would not be allowed and would require an explicit CAST to convert to another type. - The query compiler would ensure that the data types are equivalent and therefore allow data to flow without having to invoke any UDF for every row. This should help us gain performance relative to the current approach. - Providing type equivalence checks will also be fundamental to building higher-level UD*Fs which would otherwise have to deal with cast semantics. > Struct datatype should not use field names for type equivalence. > > > Key: HIVE-1287 > URL: https://issues.apache.org/jira/browse/HIVE-1287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor > Environment: Mac OS X (10.6.2) Java SE 6 ( 1.6.0_17) >Reporter: Arvind Prabhakar > > The field names for {{Struct}} types are currently being matched for testing > type equivalence. 
This is readily seen by running the following example: > {noformat} > hive> create table source ( foo struct < x : string > ); > OK > Time taken: 3.094 seconds > hive> load data local inpath '/path/to/sample/data.txt' overwrite into table > source; > Copying data from file:/path/to/sample/data.txt > Loading data to table source > OK > Time taken: 0.593 seconds > hive> create table sink ( bar struct < y : string >); > OK > Time taken: 0.11 seconds > hive> insert overwrite table sink select foo from source; > FAILED: Error in semantic analysis: line 1:23 Cannot insert into target table > because column number/types are different sink: Cannot convert column 0 > from struct to struct. > {noformat} > Since both {{source.foo}} and {{sink.bar}} are similar in definition with > only field names being different, data movement between these two should be > allowed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1287) Struct datatype should not use field names for type equivalence.
[ https://issues.apache.org/jira/browse/HIVE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851097#action_12851097 ] Arvind Prabhakar commented on HIVE-1287: Thanks for your comment Zheng. I can see how the {{CAST}} would work, but believe that we need a stronger type checking semantic. Traditionally, a {{CAST}} is used to bypass compile time checks. While this is a very powerful concept, it can lead to data corruption if not used with caution. An alternative to using the {{CAST}} approach would be to use compile time type checking without regard to the field names. This is similar to function signatures in, say, Java - where it does not matter what the parameter names are, as long as they are specified in the correct order. This can be achieved by thinking of field names as aliases for the datatypes of that field. For example - the columns defined as {{struct < a : string >}} and {{struct < b : string >}} are type-equivalent because they are both of the type {{struct < ? : string >}}. > Struct datatype should not use field names for type equivalence. > > > Key: HIVE-1287 > URL: https://issues.apache.org/jira/browse/HIVE-1287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor > Environment: Mac OS X (10.6.2) Java SE 6 ( 1.6.0_17) >Reporter: Arvind Prabhakar > > The field names for {{Struct}} types are currently being matched for testing > type equivalence. 
This is readily seen by running the following example: > {noformat} > hive> create table source ( foo struct < x : string > ); > OK > Time taken: 3.094 seconds > hive> load data local inpath '/path/to/sample/data.txt' overwrite into table > source; > Copying data from file:/path/to/sample/data.txt > Loading data to table source > OK > Time taken: 0.593 seconds > hive> create table sink ( bar struct < y : string >); > OK > Time taken: 0.11 seconds > hive> insert overwrite table sink select foo from source; > FAILED: Error in semantic analysis: line 1:23 Cannot insert into target table > because column number/types are different sink: Cannot convert column 0 > from struct to struct. > {noformat} > Since both {{source.foo}} and {{sink.bar}} are similar in definition with > only field names being different, data movement between these two should be > allowed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
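The proposal in the HIVE-1287 comment above, treating field names as aliases so that {{struct < a : string >}} and {{struct < b : string >}} are equivalent, can be illustrated with a small positional comparison. The type model below ({{SketchType}}, {{PrimitiveSketchType}}, {{StructSketchType}}) is hypothetical and deliberately simplified; it is not Hive's {{TypeInfo}} hierarchy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified type model; NOT Hive's TypeInfo classes.
abstract class SketchType {
    abstract boolean equivalentTo(SketchType other);
}

final class PrimitiveSketchType extends SketchType {
    final String name;

    PrimitiveSketchType(String name) {
        this.name = name;
    }

    @Override
    boolean equivalentTo(SketchType other) {
        return other instanceof PrimitiveSketchType
                && name.equals(((PrimitiveSketchType) other).name);
    }
}

final class StructSketchType extends SketchType {
    final List<String> fieldNames;      // kept for display only
    final List<SketchType> fieldTypes;  // the part that defines the type

    StructSketchType(List<String> fieldNames, List<SketchType> fieldTypes) {
        this.fieldNames = new ArrayList<>(fieldNames);
        this.fieldTypes = new ArrayList<>(fieldTypes);
    }

    // Two structs are equivalent when their field types match position by
    // position; field names are ignored, like parameter names in a signature.
    @Override
    boolean equivalentTo(SketchType other) {
        if (!(other instanceof StructSketchType)) {
            return false;
        }
        StructSketchType that = (StructSketchType) other;
        if (fieldTypes.size() != that.fieldTypes.size()) {
            return false;
        }
        for (int i = 0; i < fieldTypes.size(); i++) {
            if (!fieldTypes.get(i).equivalentTo(that.fieldTypes.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Under this sketch the {{source.foo}} and {{sink.bar}} columns from the issue description would compare as equivalent, while a struct with a differently typed or extra field would not.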
[jira] Created: (HIVE-1287) Struct datatype should not use field names for type equivalence.
Struct datatype should not use field names for type equivalence. Key: HIVE-1287 URL: https://issues.apache.org/jira/browse/HIVE-1287 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Environment: Mac OS X (10.6.2) Java SE 6 ( 1.6.0_17) Reporter: Arvind Prabhakar The field names for {{Struct}} types are currently being matched for testing type equivalence. This is readily seen by running the following example: {noformat} hive> create table source ( foo struct < x : string > ); OK Time taken: 3.094 seconds hive> load data local inpath '/path/to/sample/data.txt' overwrite into table source; Copying data from file:/path/to/sample/data.txt Loading data to table source OK Time taken: 0.593 seconds hive> create table sink ( bar struct < y : string >); OK Time taken: 0.11 seconds hive> insert overwrite table sink select foo from source; FAILED: Error in semantic analysis: line 1:23 Cannot insert into target table because column number/types are different sink: Cannot convert column 0 from struct to struct. {noformat} Since both {{source.foo}} and {{sink.bar}} are similar in definition with only field names being different, data movement between these two should be allowed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1271: --- Attachment: HIVE-1271-1.patch > Case sensitiveness of type information specified when using custom reducer > causes type mismatch > --- > > Key: HIVE-1271 > URL: https://issues.apache.org/jira/browse/HIVE-1271 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: Dilip Joseph >Assignee: Arvind Prabhakar > Attachments: HIVE-1271-1.patch, HIVE-1271.patch > > > Type information specified while using a custom reduce script is converted > to lower case, and causes type mismatch during query semantic analysis . The > following REDUCE query where field name = "userId" failed. > hive> CREATE TABLE SS ( >> a INT, >> b INT, >> vals ARRAY> >> ); > OK > hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s >> INSERT OVERWRITE TABLE SS >> REDUCE * >> USING 'myreduce.py' >> AS >> (a INT, >> b INT, >> vals ARRAY> >> ) >> ; > FAILED: Error in semantic analysis: line 2:27 Cannot insert into > target table because column number/types are different SS: Cannot > convert column 2 from array> to > array>. > The same query worked fine after changing "userId" to "userid". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850785#action_12850785 ] Arvind Prabhakar commented on HIVE-1271: Changes for HIVE-1271 (patch updated) *Summary:* The previously submitted patch removed the dependence of {{StructTypeInfo}} on field names for equivalence comparison. This patch reverts that change and addresses type equivalence by comparing field names in a case-insensitive manner. *Details:* The changes to the {{TypeInfo}} hierarchy made by the previous patch assumed that the field names should not be considered part of the {{StructTypeInfo}} for testing equivalence. This conflicts with the implementation of {{LazyBinarySerDe}} (and perhaps others), which rely on field name distinction for caching purposes. This update changes the implementation so that field names are used as before, but are compared using case-insensitive comparison when testing the equivalence of two {{StructTypeInfo}}s. *Testing Done:* - Built and tested the use case identified in this issue - it works now. - Ran the complete set of tests; only the previously reported unrelated failures occurred. > Case sensitiveness of type information specified when using custom reducer > causes type mismatch > --- > > Key: HIVE-1271 > URL: https://issues.apache.org/jira/browse/HIVE-1271 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: Dilip Joseph >Assignee: Arvind Prabhakar > Attachments: HIVE-1271-1.patch, HIVE-1271.patch > > > Type information specified while using a custom reduce script is converted > to lower case, and causes type mismatch during query semantic analysis . The > following REDUCE query where field name = "userId" failed. 
> hive> CREATE TABLE SS ( >> a INT, >> b INT, >> vals ARRAY> >> ); > OK > hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s >> INSERT OVERWRITE TABLE SS >> REDUCE * >> USING 'myreduce.py' >> AS >> (a INT, >> b INT, >> vals ARRAY> >> ) >> ; > FAILED: Error in semantic analysis: line 2:27 Cannot insert into > target table because column number/types are different SS: Cannot > convert column 2 from array> to > array>. > The same query worked fine after changing "userId" to "userid". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
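The updated HIVE-1271 approach described above keeps field names in the equivalence test but compares them without regard to case. A minimal sketch of such an {{equals()}}/{{hashCode()}} pair follows; {{CaseInsensitiveStruct}} is a hypothetical class written for illustration, field types are reduced to plain strings, and none of this is taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Objects;

// Hypothetical struct descriptor; NOT Hive's StructTypeInfo.
final class CaseInsensitiveStruct {
    private final List<String> fieldNames;       // names as written by the user
    private final List<String> normalizedNames;  // lower-cased names used for comparison
    private final List<String> fieldTypes;       // simplified: type names as strings

    CaseInsensitiveStruct(List<String> fieldNames, List<String> fieldTypes) {
        if (fieldNames.size() != fieldTypes.size()) {
            throw new IllegalArgumentException("names and types must have the same length");
        }
        this.fieldNames = new ArrayList<>(fieldNames);
        this.fieldTypes = new ArrayList<>(fieldTypes);
        this.normalizedNames = new ArrayList<>(fieldNames.size());
        for (String name : fieldNames) {
            normalizedNames.add(name.toLowerCase(Locale.ROOT));
        }
    }

    List<String> getFieldNames() {
        return fieldNames;
    }

    // Names still take part in equality, but only in their normalized form.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CaseInsensitiveStruct)) {
            return false;
        }
        CaseInsensitiveStruct that = (CaseInsensitiveStruct) o;
        return normalizedNames.equals(that.normalizedNames)
                && fieldTypes.equals(that.fieldTypes);
    }

    // Hash the same normalized values that equals() compares, so that equal
    // objects always share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(normalizedNames, fieldTypes);
    }
}
```

Normalizing once in the constructor keeps {{equals()}} and {{hashCode()}} consistent with each other, which matters if such objects are used as cache keys, as the comment suggests some SerDe implementations do.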
[jira] Updated: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1271: --- Status: Patch Available (was: Open) *Summary* The implementation of the {{equals()}} method of {{StructTypeInfo}} was comparing field names as part of the comparison. This is not valid since field names do not constitute the definition of a type. This patch refactors the {{TypeInfo}} hierarchy to address this issue. *Implementation Details* - Modified {{TypeInfo}} and removed its implementation of the {{equals()}} method. - Modified all specialized subclasses to make them {{final}}. - Modified all subclass implementations of {{equals()}} to skip category comparison. - Modified the {{StructTypeInfo}} implementation of {{equals()}} to not compare field names. *Testing Done* - Built and tested the use case identified in this issue. It works now. - Ran the full set of tests. Of these, two tests (clientpositive for input20.q and input33.q) failed for unrelated reasons (these tests are failing on the trunk as well). All other tests passed. > Case sensitiveness of type information specified when using custom reducer > causes type mismatch > --- > > Key: HIVE-1271 > URL: https://issues.apache.org/jira/browse/HIVE-1271 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: Dilip Joseph >Assignee: Arvind Prabhakar > Attachments: HIVE-1271.patch > > > Type information specified while using a custom reduce script is converted > to lower case, and causes type mismatch during query semantic analysis . The > following REDUCE query where field name = "userId" failed. 
> hive> CREATE TABLE SS ( >> a INT, >> b INT, >> vals ARRAY> >> ); > OK > hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s >> INSERT OVERWRITE TABLE SS >> REDUCE * >> USING 'myreduce.py' >> AS >> (a INT, >> b INT, >> vals ARRAY> >> ) >> ; > FAILED: Error in semantic analysis: line 2:27 Cannot insert into > target table because column number/types are different SS: Cannot > convert column 2 from array> to > array>. > The same query worked fine after changing "userId" to "userid". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-1271: --- Attachment: HIVE-1271.patch > Case sensitiveness of type information specified when using custom reducer > causes type mismatch > --- > > Key: HIVE-1271 > URL: https://issues.apache.org/jira/browse/HIVE-1271 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: Dilip Joseph >Assignee: Arvind Prabhakar > Attachments: HIVE-1271.patch > > > Type information specified while using a custom reduce script is converted > to lower case, and causes type mismatch during query semantic analysis . The > following REDUCE query where field name = "userId" failed. > hive> CREATE TABLE SS ( >> a INT, >> b INT, >> vals ARRAY> >> ); > OK > hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s >> INSERT OVERWRITE TABLE SS >> REDUCE * >> USING 'myreduce.py' >> AS >> (a INT, >> b INT, >> vals ARRAY> >> ) >> ; > FAILED: Error in semantic analysis: line 2:27 Cannot insert into > target table because column number/types are different SS: Cannot > convert column 2 from array> to > array>. > The same query worked fine after changing "userId" to "userid". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.