[jira] [Created] (HIVE-5722) Skip generating vectorization code if possible
Navis created HIVE-5722: --- Summary: Skip generating vectorization code if possible Key: HIVE-5722 URL: https://issues.apache.org/jira/browse/HIVE-5722 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Minor Currently, ql module always generates new vectorization code, which might not be changed so frequently. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5722) Skip generating vectorization code if possible
[ https://issues.apache.org/jira/browse/HIVE-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5722: Description: NO PRECOMMIT TESTS Currently, ql module always generates new vectorization code, which might not be changed so frequently. was:Currently, ql module always generates new vectorization code, which might not be changed so frequently. Skip generating vectorization code if possible -- Key: HIVE-5722 URL: https://issues.apache.org/jira/browse/HIVE-5722 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Minor NO PRECOMMIT TESTS Currently, ql module always generates new vectorization code, which might not be changed so frequently. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5722) Skip generating vectorization code if possible
[ https://issues.apache.org/jira/browse/HIVE-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5722: Status: Patch Available (was: Open) Skip generating vectorization code if possible -- Key: HIVE-5722 URL: https://issues.apache.org/jira/browse/HIVE-5722 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-5722.1.patch.txt NO PRECOMMIT TESTS Currently, ql module always generates new vectorization code, which might not be changed so frequently. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5722) Skip generating vectorization code if possible
[ https://issues.apache.org/jira/browse/HIVE-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5722: Attachment: HIVE-5722.1.patch.txt Skip generating vectorization code if possible -- Key: HIVE-5722 URL: https://issues.apache.org/jira/browse/HIVE-5722 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-5722.1.patch.txt NO PRECOMMIT TESTS Currently, ql module always generates new vectorization code, which might not be changed so frequently. -- This message was sent by Atlassian JIRA (v6.1#6144)
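The build-side change Navis describes amounts to an up-to-date check before running the code generator: skip generation when the generated source is at least as new as its template. A minimal sketch of that check, with illustrative names (none of this is Hive's actual build code):

```java
// Illustrative up-to-date check in the spirit of HIVE-5722: compare
// modification times instead of unconditionally regenerating sources.
public class CodeGenCheck {
    /** Skip generation when the generated file exists and is not older than its template. */
    public static boolean isUpToDate(boolean generatedExists,
                                     long templateMtime, long generatedMtime) {
        return generatedExists && generatedMtime >= templateMtime;
    }

    public static void main(String[] args) {
        System.out.println(isUpToDate(true, 100L, 200L));  // generated newer -> skip
        System.out.println(isUpToDate(true, 200L, 100L));  // template newer  -> regenerate
        System.out.println(isUpToDate(false, 100L, 200L)); // missing output  -> regenerate
    }
}
```

In a real build this would compare File.lastModified() of the vectorization templates against the generated .java files.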
[jira] [Commented] (HIVE-4523) round() function with specified decimal places not consistent with mysql
[ https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811073#comment-13811073 ] Hive QA commented on HIVE-4523: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611459/HIVE-4523.5.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4550 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_round {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/97/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/97/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. round() function with specified decimal places not consistent with mysql - Key: HIVE-4523 URL: https://issues.apache.org/jira/browse/HIVE-4523 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.7.1 Reporter: Fred Desing Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.patch
// hive
hive> select round(150.000, 2) from temp limit 1;
150.0
hive> select round(150, 2) from temp limit 1;
150.0
// mysql
mysql> select round(150.000, 2) from DUAL limit 1;
round(150.000, 2)
150.00
mysql> select round(150, 2) from DUAL limit 1;
round(150, 2)
150
http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4523) round() function with specified decimal places not consistent with mysql
[ https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-4523: -- Attachment: HIVE-4523.6.patch Patch #6 fixed the reported test failure. round() function with specified decimal places not consistent with mysql - Key: HIVE-4523 URL: https://issues.apache.org/jira/browse/HIVE-4523 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.7.1 Reporter: Fred Desing Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.6.patch, HIVE-4523.patch
// hive
hive> select round(150.000, 2) from temp limit 1;
150.0
hive> select round(150, 2) from temp limit 1;
150.0
// mysql
mysql> select round(150.000, 2) from DUAL limit 1;
round(150.000, 2)
150.00
mysql> select round(150, 2) from DUAL limit 1;
round(150, 2)
150
http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round -- This message was sent by Atlassian JIRA (v6.1#6144)
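The inconsistency in this report comes down to rounding through double, which discards the declared decimal scale, while MySQL keeps the operand's scale. A small illustrative Java sketch of the two behaviors (this is not Hive's actual UDFRound implementation, just a demonstration of the difference):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative only: contrasts double-based rounding (scale lost, as in the
// reported Hive behavior) with BigDecimal-based rounding (scale kept, as in MySQL).
public class RoundScale {
    // Round via double arithmetic; the result prints with no fixed scale.
    public static String roundAsDouble(String value, int places) {
        double scale = Math.pow(10, places);
        return String.valueOf(Math.round(Double.parseDouble(value) * scale) / scale);
    }

    // Round via BigDecimal.setScale; the requested scale is preserved.
    public static String roundAsDecimal(String value, int places) {
        return new BigDecimal(value).setScale(places, RoundingMode.HALF_UP).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(roundAsDouble("150.000", 2));   // 150.0  (Hive-style)
        System.out.println(roundAsDecimal("150.000", 2));  // 150.00 (MySQL-style)
    }
}
```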
[jira] [Commented] (HIVE-5191) Add char data type
[ https://issues.apache.org/jira/browse/HIVE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811097#comment-13811097 ] Hive QA commented on HIVE-5191: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611486/HIVE-5191.3.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4572 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_union1 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/98/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/98/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. Add char data type -- Key: HIVE-5191 URL: https://issues.apache.org/jira/browse/HIVE-5191 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5191.1.patch, HIVE-5191.2.patch, HIVE-5191.3.patch Separate task for char type, since HIVE-4844 only adds varchar -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5707) Validate values for ConfVar
[ https://issues.apache.org/jira/browse/HIVE-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-5707: -- Attachment: D13821.2.patch navis updated the revision HIVE-5707 [jira] Validate values for ConfVar. Fixed orc_create (cannot reproduce the failure of bucket_num_reducers) Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D13821 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D13821?vs=42753&id=42861#toc AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java ql/src/test/queries/clientnegative/set_hiveconf_validation2.q ql/src/test/queries/clientpositive/orc_create.q ql/src/test/results/clientnegative/set_hiveconf_validation2.q.out To: JIRA, navis Validate values for ConfVar --- Key: HIVE-5707 URL: https://issues.apache.org/jira/browse/HIVE-5707 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13821.1.patch, D13821.2.patch With set hive.conf.validation=true, Hive validates that a new value can be converted to the variable's type, but it does not check the value itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
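The gap the issue describes — checking only that a value parses as the right type, not whether the value itself is acceptable — can be pictured with a small validator sketch. Names and structure here are illustrative, not HiveConf's actual API:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of value-level validation: besides type conversion,
// check membership in an allowed set and report a readable reason on failure.
public class ConfValidator {
    private final List<String> allowed;

    public ConfValidator(String... allowed) {
        this.allowed = Arrays.asList(allowed);
    }

    /** Returns null when valid, otherwise a human-readable rejection reason. */
    public String validate(String value) {
        return allowed.contains(value.toLowerCase())
                ? null
                : "Invalid value '" + value + "', expected one of " + allowed;
    }

    public static void main(String[] args) {
        // Hypothetical allowed values for an execution-engine style variable.
        ConfValidator execEngine = new ConfValidator("mr", "tez");
        System.out.println(execEngine.validate("mr") == null);      // true
        System.out.println(execEngine.validate("spark") == null);   // false
    }
}
```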
Re: Bug in map join optimization causing OutOfMemory error
Hey Folks, Could you please take a look at the problem below? We are hitting OutOfMemoryErrors while joining tables that are not managed by Hive. Would appreciate any feedback. Thanks Mehant On 10/7/13 12:04 PM, Mehant Baid wrote: Hey Folks, We are using hive-0.11 and are hitting java.lang.OutOfMemoryError. The problem seems to be in CommonJoinResolver.java (processCurrentTask()): in this function we try to convert a map-reduce join into a map join if 'n-1' of the tables involved in an 'n'-way join have a size below a certain threshold. If the tables are managed by Hive then we have accurate sizes for each table and can apply this optimization, but if the tables are created using storage handlers, HBaseStorageHandler in our case, then the size is set to zero. Because of this we assume that we can apply the optimization and convert the map-reduce join into a map join. We then build an in-memory hash table for all the keys; since our table created using the storage handler is large, it does not fit in memory and we hit the error. Should I open a JIRA for this? One way to fix this is to set the size of the table (created using the storage handler) equal to the map join threshold. This way that table would be selected as the big table, and we could proceed with the optimization if the other tables in the join have sizes below the threshold. If we had multiple big tables then the optimization would be turned off. Thanks Mehant
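The fix Mehant proposes can be sketched as a size-based big-table pick in which an unknown (zero) size from a storage handler is treated conservatively as over the threshold, so such tables never land on the in-memory hash side. This is illustrative logic only, not the actual CommonJoinResolver code:

```java
// Illustrative sketch of the map-join decision described in the thread:
// at most one table may exceed the size threshold; a size of 0 really
// means "unknown" (storage handler), so it is treated as over the threshold.
public class MapJoinDecision {
    public static final long UNKNOWN = 0L;

    /** Returns the index of the single big (streamed) table, or -1 when a map join is unsafe. */
    public static int pickBigTable(long[] sizes, long threshold) {
        int big = -1;
        for (int i = 0; i < sizes.length; i++) {
            // Unknown sizes are conservatively treated as exceeding the threshold.
            long effective = (sizes[i] == UNKNOWN) ? threshold + 1 : sizes[i];
            if (effective > threshold) {
                if (big >= 0) return -1;   // two big tables: keep the map-reduce join
                big = i;
            }
        }
        return big >= 0 ? big : 0;         // all small: any table may be streamed
    }

    public static void main(String[] args) {
        long threshold = 25_000_000L;
        // HBase-backed table (unknown size) joined with a small Hive table:
        System.out.println(pickBigTable(new long[]{0L, 1_000L}, threshold));      // 0
        // Unknown size plus another genuinely big table: no map join.
        System.out.println(pickBigTable(new long[]{0L, 30_000_000L}, threshold)); // -1
    }
}
```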
[jira] [Commented] (HIVE-5715) HS2 should not start a session for every command
[ https://issues.apache.org/jira/browse/HIVE-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811106#comment-13811106 ] Hive QA commented on HIVE-5715: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611493/HIVE-5715.2.patch {color:green}SUCCESS:{color} +1 4547 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/99/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/99/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. HS2 should not start a session for every command -- Key: HIVE-5715 URL: https://issues.apache.org/jira/browse/HIVE-5715 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-5715.1.patch, HIVE-5715.2.patch HS2 calls SessionState.start multiple times (acquire, operation.run) - where it really just cares that the session is set in the thread local store. There are some calls in start session method now that preload stuff that's used during the session. Instead of doing that over and over again, I think it'd be nicer for HS2 to start a session once and then just do the thread local magic as needed. -- This message was sent by Atlassian JIRA (v6.1#6144)
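The refactoring Gunther suggests amounts to separating the one-time, expensive session startup from the cheap per-command thread-local bookkeeping. A toy sketch of that split, with hypothetical names (this is not HiveServer2's actual SessionState API):

```java
// Illustrative sketch: initialize once, then only attach/detach a
// thread-local reference per command instead of re-running full startup.
public class SessionHolder {
    private static final ThreadLocal<SessionHolder> CURRENT = new ThreadLocal<>();
    private static int expensiveStarts = 0;   // counts full initializations

    private SessionHolder() {
        expensiveStarts++;                    // stand-in for preloading resources
    }

    public static SessionHolder open()    { return new SessionHolder(); }
    public void attach()                  { CURRENT.set(this); }        // per-command
    public static void detach()           { CURRENT.remove(); }
    public static SessionHolder current() { return CURRENT.get(); }
    public static int starts()            { return expensiveStarts; }

    public static void main(String[] args) {
        SessionHolder session = open();       // one expensive start
        for (int i = 0; i < 3; i++) {         // three commands, no restarts
            session.attach();
            // ... run the command against current() ...
            detach();
        }
        System.out.println(starts());         // 1
    }
}
```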
[jira] [Commented] (HIVE-5583) Implement support for IN (list-of-constants) filter in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1386#comment-1386 ] Hive QA commented on HIVE-5583: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611487/HIVE-5583.3.patch {color:green}SUCCESS:{color} +1 4556 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/100/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/100/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Implement support for IN (list-of-constants) filter in vectorized mode -- Key: HIVE-5583 URL: https://issues.apache.org/jira/browse/HIVE-5583 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-5583.1.patch.txt, HIVE-5583.3.patch Implement optimized, vectorized support for filters of this form: column IN (constant1, ... constantN) -- This message was sent by Atlassian JIRA (v6.1#6144)
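A filter of the form column IN (constant1, ... constantN) vectorizes naturally: the constant list is loaded into a hash set once, and each batch is then filtered in one tight loop that compacts the selection vector. A simplified sketch in that spirit (not the patch's actual generated expression class):

```java
import java.util.Arrays;
import java.util.HashSet;

// Illustrative vectorized IN filter: probe a prebuilt hash set per row and
// compact the surviving row indices into the front of the selection vector.
public class VectorInFilter {
    private final HashSet<Long> inSet = new HashSet<>();

    public VectorInFilter(long... constants) {
        for (long c : constants) inSet.add(c);
    }

    /** Keeps only selected rows whose value is in the set; returns the new selected size. */
    public int filter(long[] column, int[] selected, int size) {
        int newSize = 0;
        for (int i = 0; i < size; i++) {
            int row = selected[i];
            if (inSet.contains(column[row])) selected[newSize++] = row;
        }
        return newSize;
    }

    public static void main(String[] args) {
        VectorInFilter f = new VectorInFilter(2, 5);
        long[] column = {1, 2, 3, 5, 8};
        int[] selected = {0, 1, 2, 3, 4};
        int n = f.filter(column, selected, 5);
        System.out.println(n + " " + Arrays.toString(Arrays.copyOf(selected, n)));
        // 2 [1, 3]
    }
}
```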
[jira] [Created] (HIVE-5723) Add dependency plugin
Navis created HIVE-5723: --- Summary: Add dependency plugin Key: HIVE-5723 URL: https://issues.apache.org/jira/browse/HIVE-5723 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5723.1.patch.txt For easy gathering of required libraries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5723) Add dependency plugin
[ https://issues.apache.org/jira/browse/HIVE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5723: Description: NO PRECOMMIT TESTS For easy gathering of required libraries. was:For easy gathering of required libraries. Add dependency plugin - Key: HIVE-5723 URL: https://issues.apache.org/jira/browse/HIVE-5723 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5723.1.patch.txt NO PRECOMMIT TESTS For easy gathering of required libraries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5723) Add dependency plugin
[ https://issues.apache.org/jira/browse/HIVE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5723: Status: Patch Available (was: Open) Add dependency plugin - Key: HIVE-5723 URL: https://issues.apache.org/jira/browse/HIVE-5723 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5723.1.patch.txt NO PRECOMMIT TESTS For easy gathering of required libraries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5723) Add dependency plugin
[ https://issues.apache.org/jira/browse/HIVE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5723: Attachment: HIVE-5723.1.patch.txt Add dependency plugin - Key: HIVE-5723 URL: https://issues.apache.org/jira/browse/HIVE-5723 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5723.1.patch.txt NO PRECOMMIT TESTS For easy gathering of required libraries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5712) Hive Java error when using a view with a sub-query from inside another sub-query and including non-views in the FROM clause
[ https://issues.apache.org/jira/browse/HIVE-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811120#comment-13811120 ] Navis commented on HIVE-5712: - java.lang.ClassNotFoundException: com.microsoft.log4jappender.FilterLogAppender seems to have caused this failure. Hive Java error when using a view with a sub-query from inside another sub-query and including non-views in the FROM clause --- Key: HIVE-5712 URL: https://issues.apache.org/jira/browse/HIVE-5712 Project: Hive Issue Type: Bug Environment: Windows Azure HDInsight Reporter: Ian Beckett Priority: Minor Labels: sub-query, views Here is a gist: https://gist.github.com/ianbeckett/7254214 Hive throws java errors if you try to reference a view which includes sub-queries from within a sub-query of another query. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-3990) Provide input threshold for direct-fetcher (HIVE-2925)
[ https://issues.apache.org/jira/browse/HIVE-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811122#comment-13811122 ] Navis commented on HIVE-3990: - Sure. HIVE-5718 might also be useful. Before that, I need to get accustomed to the Maven environment. Provide input threshold for direct-fetcher (HIVE-2925) -- Key: HIVE-3990 URL: https://issues.apache.org/jira/browse/HIVE-3990 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3990.D8415.1.patch As a followup of HIVE-2925, add an input threshold for fetch task conversion. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.
[ https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5718: Status: Patch Available (was: Open) Support direct fetch for lateral views, sub queries, etc. - Key: HIVE-5718 URL: https://issues.apache.org/jira/browse/HIVE-5718 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13857.1.patch Extend HIVE-2925 with LV and SubQ. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.
[ https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-5718: -- Attachment: D13857.1.patch navis requested code review of HIVE-5718 [jira] Support direct fetch for lateral views, sub queries, etc.. Reviewers: JIRA HIVE-5718 Support direct fetch for lateral views, sub queries, etc. Extend HIVE-2925 with LV and SubQ. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D13857 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java ql/src/test/queries/clientpositive/nonmr_fetch.q ql/src/test/results/clientpositive/lateral_view_noalias.q.out ql/src/test/results/clientpositive/nonmr_fetch.q.out ql/src/test/results/clientpositive/udf_explode.q.out ql/src/test/results/clientpositive/udtf_explode.q.out WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/42153/ To: JIRA, navis Support direct fetch for lateral views, sub queries, etc. - Key: HIVE-5718 URL: https://issues.apache.org/jira/browse/HIVE-5718 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13857.1.patch Extend HIVE-2925 with LV and SubQ. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5715) HS2 should not start a session for every command
[ https://issues.apache.org/jira/browse/HIVE-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811126#comment-13811126 ] Hive QA commented on HIVE-5715: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611493/HIVE-5715.2.patch {color:green}SUCCESS:{color} +1 4547 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/101/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/101/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. HS2 should not start a session for every command -- Key: HIVE-5715 URL: https://issues.apache.org/jira/browse/HIVE-5715 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-5715.1.patch, HIVE-5715.2.patch HS2 calls SessionState.start multiple times (acquire, operation.run) - where it really just cares that the session is set in the thread local store. There are some calls in start session method now that preload stuff that's used during the session. Instead of doing that over and over again, I think it'd be nicer for HS2 to start a session once and then just do the thread local magic as needed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5581) Implement vectorized year/month/day... etc. for string arguments
[ https://issues.apache.org/jira/browse/HIVE-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811134#comment-13811134 ] Teddy Choi commented on HIVE-5581: -- Eric, I will separate classes, and also make end-to-end tests. It seems like there are many to create (VectorUDFYearString, VectorUDFMonthString, VectorUDFDayString, ...). Teddy Implement vectorized year/month/day... etc. for string arguments Key: HIVE-5581 URL: https://issues.apache.org/jira/browse/HIVE-5581 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-5581.1.patch.txt Functions year(), month(), day(), weekofyear(), hour(), minute(), second() need to be implemented for string arguments in vectorized mode. They already work for timestamp arguments. -- This message was sent by Atlassian JIRA (v6.1#6144)
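The string variants mentioned above (VectorUDFYearString and friends) essentially parse each string entry of a batch and emit the requested date field into a long output vector, marking unparsable entries as null. A simplified, illustrative sketch of the year case (not the actual Hive class):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Illustrative only: batch-oriented year() over a string column, in the
// spirit of the VectorUDFYearString class named in the issue.
public class YearStringVector {
    private static final DateTimeFormatter FMT = DateTimeFormatter.ISO_LOCAL_DATE;

    /** Fills output/isNull for the first 'size' entries of the input column. */
    public static void evaluate(String[] input, long[] output, boolean[] isNull, int size) {
        for (int i = 0; i < size; i++) {
            try {
                output[i] = LocalDate.parse(input[i], FMT).getYear();
                isNull[i] = false;
            } catch (Exception e) {      // unparsable or null strings become NULL
                isNull[i] = true;
            }
        }
    }

    public static void main(String[] args) {
        String[] in = {"2013-11-01", "not-a-date"};
        long[] out = new long[2];
        boolean[] nul = new boolean[2];
        evaluate(in, out, nul, 2);
        System.out.println(out[0] + " " + nul[1]);   // 2013 true
    }
}
```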
[jira] [Commented] (HIVE-3190) allow INTEGER as a type name in a column/cast expression (per ISO-SQL 2011)
[ https://issues.apache.org/jira/browse/HIVE-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811145#comment-13811145 ] Hive QA commented on HIVE-3190: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611495/HIVE-3190.3.patch {color:green}SUCCESS:{color} +1 4548 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/102/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/102/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. allow INTEGER as a type name in a column/cast expression (per ISO-SQL 2011) --- Key: HIVE-3190 URL: https://issues.apache.org/jira/browse/HIVE-3190 Project: Hive Issue Type: Improvement Components: SQL Affects Versions: 0.8.0 Reporter: N Campbell Assignee: Jason Dere Attachments: HIVE-3190.1.patch, HIVE-3190.2.patch, HIVE-3190.3.patch Just extend the parser to allow INTEGER instead of making folks use INT select cast('10' as integer) from cert.tversion tversion FAILED: Parse Error: line 1:20 cannot recognize input near 'integer' ')' 'from' in primitive type specification -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811147#comment-13811147 ] Hive QA commented on HIVE-5230: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611504/HIVE-5230.2.patch Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/103/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/103/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-103/source-prep.txt + [[ true == \t\r\u\e ]] + rm -rf ivy maven + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'ql/src/test/results/clientpositive/show_functions.q.out' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target hcatalog/server-extensions/target hcatalog/core/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/test/results/clientpositive/type_aliases.q.out ql/src/test/queries/clientpositive/type_aliases.q + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1537880. At revision 1537880. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
Better error reporting by async threads in HiveServer2 -- Key: HIVE-5230 URL: https://issues.apache.org/jira/browse/HIVE-5230 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.2.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread gets an error, currently the client can only poll for the operation state and also the error with its stacktrace is logged. However, it will be useful to provide a richer error response like thrift API does with TStatus (which is constructed while building a Thrift response object). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-5611: Attachment: HIVE-5611.patch Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5692) Make VectorGroupByOperator parameters configurable
[ https://issues.apache.org/jira/browse/HIVE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-5692: --- Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-4160) Make VectorGroupByOperator parameters configurable -- Key: HIVE-5692 URL: https://issues.apache.org/jira/browse/HIVE-5692 Project: Hive Issue Type: Bug Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Attachments: HIVE-5692.1.patch, HIVE-5692.2.patch, HIVE-5692.3.patch The FLUSH_CHECK_THRESHOLD and PERCENT_ENTRIES_TO_FLUSH should be configurable. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-3959) Update Partition Statistics in Metastore Layer
[ https://issues.apache.org/jira/browse/HIVE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811172#comment-13811172 ] Hive QA commented on HIVE-3959: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611523/HIVE-3959.6.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4547 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/104/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/104/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. Update Partition Statistics in Metastore Layer -- Key: HIVE-3959 URL: https://issues.apache.org/jira/browse/HIVE-3959 Project: Hive Issue Type: Improvement Components: Metastore, Statistics Reporter: Bhushan Mandhani Assignee: Ashutosh Chauhan Priority: Minor Attachments: HIVE-3959.1.patch, HIVE-3959.2.patch, HIVE-3959.3.patch, HIVE-3959.3.patch, HIVE-3959.4.patch, HIVE-3959.4.patch, HIVE-3959.5.patch, HIVE-3959.6.patch, HIVE-3959.patch.1, HIVE-3959.patch.11.txt, HIVE-3959.patch.12.txt, HIVE-3959.patch.2 When partitions are created using queries (insert overwrite and insert into) then the StatsTask updates all stats. However, when partitions are added directly through metadata-only operations (either via the CLI or direct calls to the Thrift Metastore) no stats are populated even if hive.stats.reliable is set to true. 
This puts us in a situation where we can't decide if stats are truly reliable or not. We propose that the fast stats (numFiles and totalSize) which don't require a scan of the data should always be populated and be completely reliable. For now we are still excluding rowCount and rawDataSize because that will make these operations very expensive. Currently they are quick metadata-only ops. -- This message was sent by Atlassian JIRA (v6.1#6144)
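The "fast stats" mentioned here, numFiles and totalSize, need only a directory listing rather than a scan of the data, which is why they can be kept reliable cheaply. A self-contained sketch using java.io.File (the metastore itself would go through the Hadoop FileSystem API, so treat this as an illustration of the idea only):

```java
import java.io.File;
import java.io.FileOutputStream;

// Illustrative computation of the two "fast stats": count the files in a
// partition directory and sum their lengths, with no read of file contents.
public class FastStats {
    /** Returns {numFiles, totalSize} for the given partition directory. */
    public static long[] collect(File partitionDir) {
        long numFiles = 0, totalSize = 0;
        File[] entries = partitionDir.listFiles();
        if (entries != null) {
            for (File f : entries) {
                if (f.isFile()) { numFiles++; totalSize += f.length(); }
            }
        }
        return new long[]{numFiles, totalSize};
    }

    public static void main(String[] args) throws Exception {
        File dir = new File(System.getProperty("java.io.tmpdir"),
                            "part-demo-" + System.nanoTime());
        dir.mkdirs();
        try (FileOutputStream out = new FileOutputStream(new File(dir, "000000_0"))) {
            out.write(new byte[128]);   // one 128-byte data file
        }
        long[] stats = collect(dir);
        System.out.println(stats[0] + " files, " + stats[1] + " bytes");
    }
}
```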
[jira] [Commented] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811176#comment-13811176 ] Szehon Ho commented on HIVE-5611: - I have a partial patch that supports basic Hive functionality. This patch converts all the Ant tar/bin steps into Maven assembly src/bin descriptors. All the non-jar files should now be in the Maven tarball, in exactly the same directory locations as in the Ant tarballs. But some jars are not yet complete and need further investigation. In particular, the HCatalog sub-project had many custom Ant steps in /hcatalog/build.xml for creating its unique directory structure; I think it will be a big effort to replicate the same in Maven. Issues (non-blockers) that I encountered between the Ant and Maven assemblies: 1. In the src assembly, the prefixes for the .java files are different. This is because Ant copied over the project name, while in Maven I am using the moduleSet, and module names differ slightly from folder names. I don't believe this will be a big issue. 2. The javadocs are not included in the src assembly, as they are not generated. This is already tracked in HIVE-5717. Issues in running Hive so far (very basic testing; there might be more): 3. The version number is not displayed correctly on starting Beeline, because the MANIFEST is not included in the jar built by the Beeline project. This is not related to assembly creation. 4. There is a background NPE on starting SQLCompletor, because the resource file sql-keyword.properties has been moved to src/main/resources from org/apache/hive/beeline. Again, this is not related to assembly creation. HCatalog (must fix): putting the HCatalog jars in the correct directory locations would take a big effort; I might not be able to complete this task myself, and I would appreciate it if someone could help take a look! Add assembly (i.e.)
tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
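For reference, converting an Ant tar step into a Maven assembly means writing a descriptor wired to the maven-assembly-plugin. The sketch below shows the general shape only; the id, format, and fileSet paths are illustrative assumptions, not taken from the attached HIVE-5611 patch:

```xml
<!-- Illustrative bin assembly descriptor; paths and ids are assumptions -->
<assembly>
  <id>bin</id>
  <formats>
    <format>tar.gz</format>
  </formats>
  <fileSets>
    <fileSet>
      <directory>${project.basedir}/conf</directory>
      <outputDirectory>conf</outputDirectory>
    </fileSet>
    <fileSet>
      <directory>${project.basedir}/bin</directory>
      <outputDirectory>bin</outputDirectory>
      <fileMode>0755</fileMode>
    </fileSet>
  </fileSets>
</assembly>
```

A src descriptor uses a moduleSet instead of fileSets, which is why the comment above notes that module names (rather than folder names) end up as the .java file prefixes.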
[jira] [Commented] (HIVE-3959) Update Partition Statistics in Metastore Layer
[ https://issues.apache.org/jira/browse/HIVE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811189#comment-13811189 ] Hive QA commented on HIVE-3959: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611523/HIVE-3959.6.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4547 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_smb_mapjoin_8 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/105/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/105/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. Update Partition Statistics in Metastore Layer -- Key: HIVE-3959 URL: https://issues.apache.org/jira/browse/HIVE-3959 Project: Hive Issue Type: Improvement Components: Metastore, Statistics Reporter: Bhushan Mandhani Assignee: Ashutosh Chauhan Priority: Minor Attachments: HIVE-3959.1.patch, HIVE-3959.2.patch, HIVE-3959.3.patch, HIVE-3959.3.patch, HIVE-3959.4.patch, HIVE-3959.4.patch, HIVE-3959.5.patch, HIVE-3959.6.patch, HIVE-3959.patch.1, HIVE-3959.patch.11.txt, HIVE-3959.patch.12.txt, HIVE-3959.patch.2 When partitions are created using queries (insert overwrite and insert into) then the StatsTask updates all stats. 
However, when partitions are added directly through metadata-only partitions (either CLI or direct calls to Thrift Metastore) no stats are populated even if hive.stats.reliable is set to true. This puts us in a situation where we can't decide if stats are truly reliable or not. We propose that the fast stats (numFiles and totalSize) which don't require a scan of the data should always be populated and be completely reliable. For now we are still excluding rowCount and rawDataSize because that will make these operations very expensive. Currently they are quick metadata-only ops. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4523) round() function with specified decimal places not consistent with mysql
[ https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811212#comment-13811212 ] Hive QA commented on HIVE-4523: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611562/HIVE-4523.6.patch {color:green}SUCCESS:{color} +1 4550 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/108/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/108/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. round() function with specified decimal places not consistent with mysql - Key: HIVE-4523 URL: https://issues.apache.org/jira/browse/HIVE-4523 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.7.1 Reporter: Fred Desing Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.6.patch, HIVE-4523.patch
// hive
hive> select round(150.000, 2) from temp limit 1;
150.0
hive> select round(150, 2) from temp limit 1;
150.0
// mysql
mysql> select round(150.000, 2) from DUAL limit 1;
round(150.000, 2) 150.00
mysql> select round(150, 2) from DUAL limit 1;
round(150, 2) 150
http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round -- This message was sent by Atlassian JIRA (v6.1#6144)
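The discrepancy shown in the issue is largely a precision/representation question: rounding in double space yields a value whose toString drops trailing zeros, while MySQL's ROUND on a DECIMAL argument preserves the requested scale. A self-contained illustration of the two behaviors in plain Java (not Hive's actual UDF code):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RoundDemo {
    public static void main(String[] args) {
        // Double-based rounding: trailing zeros are lost in the printed value.
        double d = Math.round(150.000 * 100) / 100.0;
        System.out.println(d);   // 150.0

        // Decimal-based rounding keeps the scale, like MySQL's ROUND on DECIMAL.
        BigDecimal b = new BigDecimal("150.000").setScale(2, RoundingMode.HALF_UP);
        System.out.println(b);   // 150.00
    }
}
```

This is why matching MySQL exactly needs a decimal-aware round(), not just a different double formula.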
[jira] [Created] (HIVE-5724) Java API always returning null
Kassem Tohme created HIVE-5724: -- Summary: Java API always returning null Key: HIVE-5724 URL: https://issues.apache.org/jira/browse/HIVE-5724 Project: Hive Issue Type: Bug Reporter: Kassem Tohme -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5724) Java API: comments are always null
[ https://issues.apache.org/jira/browse/HIVE-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kassem Tohme updated HIVE-5724: --- Description: Hey, I'm new to JIRA (just to let you know). Consider the following scenario:
{code}
Configuration config = new Configuration();
config.addResource(new Path("/etc/hive/conf.cloudera.hive1/hive-site.xml"));
try {
  HiveConf hiveConfig = HCatUtil.getHiveConf(config);
  HiveMetaStoreClient hmsClient = HCatUtil.getHiveClient(hiveConfig);
  Table table = hmsClient.getTable("testdb", "testtable");
  for (FieldSchema colFS : table.getSd().getCols()) {
    HCatFieldSchema col = HCatSchemaUtils.getHCatFieldSchema(colFS);
    System.out.println(col.getName() + " " + col.getComment());
  }
} catch (Exception e) {
  e.printStackTrace();
}
{code}
Output:
{noformat}
id null
value null
{noformat}
HCatSchemaUtils.getHCatFieldSchema(String, TypeInfo) seems to ignore the comment field:
{code}
public static HCatFieldSchema getHCatFieldSchema(FieldSchema fs) throws HCatException {
  String fieldName = fs.getName();
  TypeInfo baseTypeInfo = TypeInfoUtils.getTypeInfoFromTypeString(fs.getType());
  return getHCatFieldSchema(fieldName, baseTypeInfo);
}

private static HCatFieldSchema getHCatFieldSchema(String fieldName, TypeInfo fieldTypeInfo) throws HCatException {
  Category typeCategory = fieldTypeInfo.getCategory();
  HCatFieldSchema hCatFieldSchema;
  if (Category.PRIMITIVE == typeCategory) {
    hCatFieldSchema = new HCatFieldSchema(fieldName, getPrimitiveHType(fieldTypeInfo), null);
  } else if (Category.STRUCT == typeCategory) {
    HCatSchema subSchema = constructHCatSchema((StructTypeInfo) fieldTypeInfo);
    hCatFieldSchema = new HCatFieldSchema(fieldName, HCatFieldSchema.Type.STRUCT, subSchema, null);
  } else if (Category.LIST == typeCategory) {
    HCatSchema subSchema = getHCatSchema(((ListTypeInfo) fieldTypeInfo).getListElementTypeInfo());
    hCatFieldSchema = new HCatFieldSchema(fieldName, HCatFieldSchema.Type.ARRAY, subSchema, null);
  } else if (Category.MAP == typeCategory) {
    HCatFieldSchema.Type mapKeyType = getPrimitiveHType(((MapTypeInfo) fieldTypeInfo).getMapKeyTypeInfo());
    HCatSchema subSchema = getHCatSchema(((MapTypeInfo) fieldTypeInfo).getMapValueTypeInfo());
    hCatFieldSchema = new HCatFieldSchema(fieldName, HCatFieldSchema.Type.MAP, mapKeyType, subSchema, null);
  } else {
    throw new TypeNotPresentException(fieldTypeInfo.getTypeName(), null);
  }
  return hCatFieldSchema;
}
{code}
Adding a FieldSchema parameter to getHCatFieldSchema(String, TypeInfo) and replacing the null with fs.getComment() should fix it. This bug also impacts HCatalog's Java API. Summary: Java API: comments are always null (was: Java API always returning null) Java API: comments are always null -- Key: HIVE-5724 URL: https://issues.apache.org/jira/browse/HIVE-5724 Project: Hive Issue Type: Bug Reporter: Kassem Tohme -- This message was sent by Atlassian JIRA (v6.1#6144)
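The proposed fix — threading the source field's comment through to the HCatFieldSchema constructor instead of hard-coding null — can be illustrated with a self-contained analogue. The classes below are small stand-ins, not the real metastore/HCatalog types:

```java
// Self-contained analogue of the proposed HIVE-5724 fix: propagate the
// source field's comment instead of hard-coding null. Stand-in classes only.
public class CommentFixDemo {
    static class FieldSchema {          // stands in for the metastore FieldSchema
        final String name, type, comment;
        FieldSchema(String name, String type, String comment) {
            this.name = name; this.type = type; this.comment = comment;
        }
    }

    static class HCatFieldSchema {      // stands in for HCatFieldSchema
        final String name, type, comment;
        HCatFieldSchema(String name, String type, String comment) {
            this.name = name; this.type = type; this.comment = comment;
        }
    }

    // Before: the comment is silently dropped.
    static HCatFieldSchema convertBroken(FieldSchema fs) {
        return new HCatFieldSchema(fs.name, fs.type, null);
    }

    // After: the comment is carried over from the source schema.
    static HCatFieldSchema convertFixed(FieldSchema fs) {
        return new HCatFieldSchema(fs.name, fs.type, fs.comment);
    }

    public static void main(String[] args) {
        FieldSchema fs = new FieldSchema("id", "int", "primary key");
        System.out.println(convertBroken(fs).comment);  // null
        System.out.println(convertFixed(fs).comment);   // primary key
    }
}
```

The same pattern applies to each branch of the real getHCatFieldSchema: every constructor call that currently passes a null comment would receive fs.getComment() instead.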
[jira] [Commented] (HIVE-5707) Validate values for ConfVar
[ https://issues.apache.org/jira/browse/HIVE-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811239#comment-13811239 ] Hive QA commented on HIVE-5707: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611565/D13821.2.patch {color:green}SUCCESS:{color} +1 4548 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/109/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/109/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Validate values for ConfVar --- Key: HIVE-5707 URL: https://issues.apache.org/jira/browse/HIVE-5707 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13821.1.patch, D13821.2.patch With set hive.conf.validation=true, Hive validates that a new value can be converted to the variable's type, but it does not check the value itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
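The gap described — checking only that a value parses as the right type, not that the parsed value is acceptable — can be closed by attaching a validator to each config variable. The sketch below is an illustrative design, not Hive's actual ConfVars code:

```java
// Illustrative sketch: validate not only that a value parses as the right
// type, but that the parsed value itself falls in an allowed range.
public class ConfValidationDemo {
    interface Validator {
        String validate(String value);   // null means OK, otherwise an error message
    }

    static class RangeValidator implements Validator {
        final long min, max;
        RangeValidator(long min, long max) { this.min = min; this.max = max; }
        public String validate(String value) {
            long v;
            try {
                v = Long.parseLong(value);          // the type check Hive already does
            } catch (NumberFormatException e) {
                return value + " is not a number";
            }
            if (v < min || v > max) {               // the missing value check
                return v + " is out of range [" + min + ", " + max + "]";
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Validator v = new RangeValidator(1, 1024);
        System.out.println(v.validate("8"));        // null (accepted)
        System.out.println(v.validate("9999"));     // rejected with a message
    }
}
```

A set command would then run the variable's validator (when one is registered) before committing the new value.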
[jira] [Updated] (HIVE-3990) Provide input threshold for direct-fetcher (HIVE-2925)
[ https://issues.apache.org/jira/browse/HIVE-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3990: -- Attachment: D8415.2.patch navis updated the revision HIVE-3990 [jira] Provide input threshold for direct-fetcher (HIVE-2925). Rebased to trunk Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D8415 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D8415?vs=27291id=42879#toc AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java conf/hive-default.xml.template ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java ql/src/java/org/apache/hadoop/hive/ql/metadata/InputEstimator.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java ql/src/test/queries/clientpositive/nonmr_fetch_threshold.q ql/src/test/results/clientpositive/nonmr_fetch_threshold.q.out To: JIRA, navis Provide input threshold for direct-fetcher (HIVE-2925) -- Key: HIVE-3990 URL: https://issues.apache.org/jira/browse/HIVE-3990 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D8415.2.patch, HIVE-3990.D8415.1.patch As a followup of HIVE-2925, add input threshold for fetch task conversion. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3990) Provide input threshold for direct-fetcher (HIVE-2925)
[ https://issues.apache.org/jira/browse/HIVE-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3990: Status: Patch Available (was: Open) Provide input threshold for direct-fetcher (HIVE-2925) -- Key: HIVE-3990 URL: https://issues.apache.org/jira/browse/HIVE-3990 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D8415.2.patch, HIVE-3990.D8415.1.patch As a followup of HIVE-2925, add input threshold for fetch task conversion. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.
[ https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811265#comment-13811265 ] Hive QA commented on HIVE-5718: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611567/D13857.1.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 4547 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_inline org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_reflect2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_unix_timestamp {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/111/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/111/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. Support direct fetch for lateral views, sub queries, etc. - Key: HIVE-5718 URL: https://issues.apache.org/jira/browse/HIVE-5718 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13857.1.patch Extend HIVE-2925 with LV and SubQ. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4523) round() function with specified decimal places not consistent with mysql
[ https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-4523: -- Attachment: HIVE-4523.7.patch Patch #7 changed back to original double rounding mechanism. round() function with specified decimal places not consistent with mysql - Key: HIVE-4523 URL: https://issues.apache.org/jira/browse/HIVE-4523 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.7.1 Reporter: Fred Desing Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.6.patch, HIVE-4523.7.patch, HIVE-4523.patch // hive hive select round(150.000, 2) from temp limit 1; 150.0 hive select round(150, 2) from temp limit 1; 150.0 // mysql mysql select round(150.000, 2) from DUAL limit 1; round(150.000, 2) 150.00 mysql select round(150, 2) from DUAL limit 1; round(150, 2) 150 http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round -- This message was sent by Atlassian JIRA (v6.1#6144)
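The gap between the Hive and MySQL outputs quoted above comes down to the type the rounding is performed in. A sketch of the difference (in Python, purely to illustrate the numeric behavior — Hive's actual round() UDF is Java):

```python
from decimal import Decimal

# Rounding through a double loses the requested scale: a binary float
# cannot carry trailing zeros, so 150.00 prints as 150.0 (Hive 0.11).
via_double = round(150.000, 2)

# Rounding a true decimal value keeps the scale, matching MySQL's
# round(150.000, 2) = 150.00.
via_decimal = Decimal("150.000").quantize(Decimal("0.01"))

print(via_double)   # 150.0
print(via_decimal)  # 150.00
```

This is why a patch that "changed back to original double rounding mechanism" necessarily gives the 150.0-style output rather than MySQL's scale-preserving result.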
[jira] [Updated] (HIVE-3959) Update Partition Statistics in Metastore Layer
[ https://issues.apache.org/jira/browse/HIVE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3959: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Two test failures were .q.out updates. Thanks, Thejas for review! Update Partition Statistics in Metastore Layer -- Key: HIVE-3959 URL: https://issues.apache.org/jira/browse/HIVE-3959 Project: Hive Issue Type: Improvement Components: Metastore, Statistics Reporter: Bhushan Mandhani Assignee: Ashutosh Chauhan Priority: Minor Fix For: 0.13.0 Attachments: HIVE-3959.1.patch, HIVE-3959.2.patch, HIVE-3959.3.patch, HIVE-3959.3.patch, HIVE-3959.4.patch, HIVE-3959.4.patch, HIVE-3959.5.patch, HIVE-3959.6.patch, HIVE-3959.patch.1, HIVE-3959.patch.11.txt, HIVE-3959.patch.12.txt, HIVE-3959.patch.2 When partitions are created using queries (insert overwrite and insert into) then the StatsTask updates all stats. However, when partitions are added directly through metadata-only partitions (either CLI or direct calls to Thrift Metastore) no stats are populated even if hive.stats.reliable is set to true. This puts us in a situation where we can't decide if stats are truly reliable or not. We propose that the fast stats (numFiles and totalSize) which don't require a scan of the data should always be populated and be completely reliable. For now we are still excluding rowCount and rawDataSize because that will make these operations very expensive. Currently they are quick metadata-only ops. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811288#comment-13811288 ] Brock Noland commented on HIVE-5611: Great! I will take a look at this today. A couple notes below. bq. In src assembly, the prefixes for the .java files are different. This is because in ant, they copied over the project name. The ant based src tarball hive released are bad and should be completely ignored. The source assembly should exactly match a checkout of the source tree. bq. The java-docs are not included in src assembly, as they are not generated. I don't think javadocs should be included in the src assembly anyway, as I said above it should match a checkout of the source tree exactly. Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-3990) Provide input threshold for direct-fetcher (HIVE-2925)
[ https://issues.apache.org/jira/browse/HIVE-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811305#comment-13811305 ] Hive QA commented on HIVE-3990: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611585/D8415.2.patch {color:green}SUCCESS:{color} +1 4548 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/112/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/112/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Provide input threshold for direct-fetcher (HIVE-2925) -- Key: HIVE-3990 URL: https://issues.apache.org/jira/browse/HIVE-3990 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D8415.2.patch, HIVE-3990.D8415.1.patch As a followup of HIVE-2925, add input threshold for fetch task conversion. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5722) Skip generating vectorization code if possible
[ https://issues.apache.org/jira/browse/HIVE-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811313#comment-13811313 ] Brock Noland commented on HIVE-5722: +1, lgtm Skip generating vectorization code if possible -- Key: HIVE-5722 URL: https://issues.apache.org/jira/browse/HIVE-5722 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-5722.1.patch.txt NO PRECOMMIT TESTS Currently, ql module always generates new vectorization code, which might not be changed so frequently. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Bug in map join optimization causing OutOfMemory error
Hi, Thank you for the report! Can you open a JIRA for this issue? It sounds like a bug. Brock On Fri, Nov 1, 2013 at 2:23 AM, Mehant Baid baid.meh...@gmail.com wrote: Hey Folks, Could you please take a look at the problem below. We are hitting OutOfMemoryErrors while joining tables that are not managed by Hive. Would appreciate any feedback. Thanks Mehant On 10/7/13 12:04 PM, Mehant Baid wrote: Hey Folks, We are using hive-0.11 and are hitting java.lang.OutOfMemoryError. The problem seems to be in CommonJoinResolver.java (processCurrentTask()); in this function we try to convert a map-reduce join to a map join if 'n-1' of the tables involved in an 'n'-way join have a size below a certain threshold. If the tables are maintained by Hive then we have accurate sizes for each table and can apply this optimization, but if the tables are created using storage handlers, HBaseStorageHandler in our case, then the size is set to zero. Because of this we assume that we can apply the optimization and convert the map-reduce join to a map join. So we build an in-memory hash table for all the keys; since our table created using the storage handler is large, it does not fit in memory and we hit the error. Should I open a JIRA for this? One way to fix this is to set the size of the table (created using the storage handler) to be equal to the map join threshold. That way the table would be selected as the big table, and we can proceed with the optimization if the other tables in the join have sizes below the threshold. If we have multiple big tables then the optimization would be turned off. Thanks Mehant -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
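Mehant's proposed fix can be sketched as follows (illustrative Python with hypothetical names; the real logic lives in Hive's CommonJoinResolver Java code). Treating a zero — i.e. unknown — size as "at the threshold" means at most one such table can be picked as the streamed big table, and the optimization is only kept when everything else fits:

```python
def pick_map_join_big_table(table_sizes, threshold):
    """Return the table to stream as the 'big' side of a map join,
    or None if the map join must fall back to a map-reduce join.

    table_sizes: mapping of table name -> reported size in bytes,
    where 0 means a storage handler could not report a size.
    """
    # Unknown sizes are pessimistically treated as threshold-sized,
    # so an HBase-backed table cannot masquerade as a small table.
    effective = {t: (threshold if s == 0 else s) for t, s in table_sizes.items()}
    big = max(effective, key=effective.get)
    small_total = sum(s for t, s in effective.items() if t != big)
    if small_total >= threshold:
        return None  # more than one big table: optimization turned off
    return big
```

For example, a join of a size-unreported HBase table with a small dimension table would stream the HBase table, while a join of two size-unreported tables would disable the conversion.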
[jira] [Commented] (HIVE-5721) Incremental build is disabled by MCOMPILER-209
[ https://issues.apache.org/jira/browse/HIVE-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811320#comment-13811320 ] Brock Noland commented on HIVE-5721: Nice find! I was wondering about this myself. +1 For future pom changes I think we should run precommit tests, just to verify it works in the test environment. Incremental build is disabled by MCOMPILER-209 -- Key: HIVE-5721 URL: https://issues.apache.org/jira/browse/HIVE-5721 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Attachments: HIVE-5721.1.patch.txt NO PRECOMMIT TESTS maven-compiler-plugin-3.1 has bug on incremental build(http://jira.codehaus.org/browse/MCOMPILER-209) -- This message was sent by Atlassian JIRA (v6.1#6144)
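For reference, the workaround commonly cited for MCOMPILER-209 is to disable the plugin's own (broken) staleness detection — counter-intuitively, this is what restores real incremental builds. A sketch of the pom change; the attached patch may do this differently:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <!-- With the buggy detection in 3.1, one changed file triggers a
         full recompile; setting this to false makes the plugin compile
         only stale sources. See MCOMPILER-209. -->
    <useIncrementalCompilation>false</useIncrementalCompilation>
  </configuration>
</plugin>
```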
[jira] [Commented] (HIVE-5707) Validate values for ConfVar
[ https://issues.apache.org/jira/browse/HIVE-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811336#comment-13811336 ] Brock Noland commented on HIVE-5707: +1 Validate values for ConfVar --- Key: HIVE-5707 URL: https://issues.apache.org/jira/browse/HIVE-5707 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13821.1.patch, D13821.2.patch With hive.conf.validation=true, Hive validates that a new value can be converted to the variable's type, but it does not check the value itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
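The two levels of checking — "converts to the type" versus "the value itself is acceptable" — can be sketched like this (illustrative Python; the variable name and validator table are hypothetical, and the real change is in Hive's ConfVars Java enum):

```python
def validate(name, value, validators):
    """Return True if `value` is acceptable for config variable `name`."""
    # Existing check: the string converts to the expected type (int here).
    try:
        parsed = int(value)
    except ValueError:
        return False
    # New check: the parsed value itself passes a per-variable validator.
    check = validators.get(name)
    return check(parsed) if check else True

# Hypothetical variable: type conversion alone would accept "-1",
# but a range validator rejects it.
validators = {"hive.exec.max.threads": lambda v: v > 0}
```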
[jira] [Updated] (HIVE-5708) PTest2 should trim long logs when posting to jira
[ https://issues.apache.org/jira/browse/HIVE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5708: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk, thanks!! PTest2 should trim long logs when posting to jira - Key: HIVE-5708 URL: https://issues.apache.org/jira/browse/HIVE-5708 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5708.patch When a build fails we post the build log to JIRA. The issue is that sometimes this can be a couple hundred KB. Since this log is available in the link also mentioned in the JIRA we should trim the message size down to say *last* 200 lines and then add a warning when we need to. -- This message was sent by Atlassian JIRA (v6.1#6144)
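The trimming described above can be sketched as follows (illustrative Python; the real PTest2 code is Java and the warning text here is hypothetical):

```python
def trim_log(text, max_lines=200):
    """Keep only the *last* max_lines lines of a build log, prepending
    a warning line whenever anything was dropped."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    warning = "[... log trimmed to last %d lines; see linked console output ...]" % max_lines
    return "\n".join([warning] + lines[-max_lines:])
```

Keeping the tail (rather than the head) matters because the failure summary is at the end of a build log.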
[jira] [Commented] (HIVE-5723) Add dependency plugin
[ https://issues.apache.org/jira/browse/HIVE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811366#comment-13811366 ] Brock Noland commented on HIVE-5723: Thank you for doing this! Two questions below: 1) Don't we have to include transitive dependencies? For example if we depend on jar x and jar x depends on y then jar y should be required at runtime, no? 2) I haven't seen this plugin configured like this so please bear with me... I thought we wanted to use the copy-dependencies goal as the top example: http://maven.apache.org/plugins/maven-dependency-plugin/examples/copying-project-dependencies.html maybe not? 3) Should we add the lib directory to the clean goal? https://github.com/apache/hive/blob/trunk/pom.xml#L311 Add dependency plugin - Key: HIVE-5723 URL: https://issues.apache.org/jira/browse/HIVE-5723 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5723.1.patch.txt NO PRECOMMIT TESTS For easy gathering of required libraries. -- This message was sent by Atlassian JIRA (v6.1#6144)
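The copy-dependencies goal Brock links to is typically wired up along these lines; note that `includeScope` of `runtime` pulls in transitive dependencies as well (question 1), and a `lib` fileset under the clean plugin would cover question 3. A sketch only — the attached patch may configure the plugin differently:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>copy-dependencies</id>
      <phase>package</phase>
      <goals>
        <goal>copy-dependencies</goal>
      </goals>
      <configuration>
        <!-- Gathers direct and transitive runtime jars into one place. -->
        <outputDirectory>${project.build.directory}/lib</outputDirectory>
        <includeScope>runtime</includeScope>
      </configuration>
    </execution>
  </executions>
</plugin>
```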
[jira] [Commented] (HIVE-5723) Add dependency plugin
[ https://issues.apache.org/jira/browse/HIVE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811367#comment-13811367 ] Brock Noland commented on HIVE-5723: bq. Two questions below: I guess I should have said three :) Add dependency plugin - Key: HIVE-5723 URL: https://issues.apache.org/jira/browse/HIVE-5723 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5723.1.patch.txt NO PRECOMMIT TESTS For easy gathering of required libraries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4523) round() function with specified decimal places not consistent with mysql
[ https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811371#comment-13811371 ] Hive QA commented on HIVE-4523: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611603/HIVE-4523.7.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4550 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_math_funcs {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/113/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/113/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. round() function with specified decimal places not consistent with mysql - Key: HIVE-4523 URL: https://issues.apache.org/jira/browse/HIVE-4523 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.7.1 Reporter: Fred Desing Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.6.patch, HIVE-4523.7.patch, HIVE-4523.patch // hive hive select round(150.000, 2) from temp limit 1; 150.0 hive select round(150, 2) from temp limit 1; 150.0 // mysql mysql select round(150.000, 2) from DUAL limit 1; round(150.000, 2) 150.00 mysql select round(150, 2) from DUAL limit 1; round(150, 2) 150 http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5581) Implement vectorized year/month/day... etc. for string arguments
[ https://issues.apache.org/jira/browse/HIVE-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811381#comment-13811381 ] Eric Hanson commented on HIVE-5581: --- Okay, thanks Teddy! Implement vectorized year/month/day... etc. for string arguments Key: HIVE-5581 URL: https://issues.apache.org/jira/browse/HIVE-5581 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-5581.1.patch.txt Functions year(), month(), day(), weekofyear(), hour(), minute(), second() need to be implemented for string arguments in vectorized mode. They already work for timestamp arguments. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5695) PTest2 fix shutdown, duplicate runs, and add client retry
[ https://issues.apache.org/jira/browse/HIVE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811411#comment-13811411 ] Hive QA commented on HIVE-5695: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611093/HIVE-5695.patch {color:green}SUCCESS:{color} +1 4547 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/114/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/114/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. PTest2 fix shutdown, duplicate runs, and add client retry - Key: HIVE-5695 URL: https://issues.apache.org/jira/browse/HIVE-5695 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5695.patch There are three issues with the PTest2 framework at present: 1) When a test crashes it doesn't shut down the underlying executors quickly 2) If a jira has two patches uploaded, say 20 minutes apart, that jira will have two pre-commit runs executed 3) The client doesn't aggressively retry idempotent operations, leading to the client (i.e. jenkins) finishing before the test actually does -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5695) PTest2 fix shutdown, duplicate runs, and add client retry
[ https://issues.apache.org/jira/browse/HIVE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5695: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk! Thanks! PTest2 fix shutdown, duplicate runs, and add client retry - Key: HIVE-5695 URL: https://issues.apache.org/jira/browse/HIVE-5695 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-5695.patch There are three issues with the PTest2 framework at present: 1) When a test crashes it doesn't shut down the underlying executors quickly 2) If a jira has two patches uploaded, say 20 minutes apart, that jira will have two pre-commit runs executed 3) The client doesn't aggressively retry idempotent operations, leading to the client (i.e. jenkins) finishing before the test actually does -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4523) round() function with specified decimal places not consistent with mysql
[ https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-4523: -- Attachment: HIVE-4523.8.patch Patch #8 fixed the failed test case. Hoping it's all good by now. round() function with specified decimal places not consistent with mysql - Key: HIVE-4523 URL: https://issues.apache.org/jira/browse/HIVE-4523 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.7.1 Reporter: Fred Desing Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.6.patch, HIVE-4523.7.patch, HIVE-4523.8.patch, HIVE-4523.patch // hive hive select round(150.000, 2) from temp limit 1; 150.0 hive select round(150, 2) from temp limit 1; 150.0 // mysql mysql select round(150.000, 2) from DUAL limit 1; round(150.000, 2) 150.00 mysql select round(150, 2) from DUAL limit 1; round(150, 2) 150 http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5725) Separate out ql code from exec jar
Owen O'Malley created HIVE-5725: --- Summary: Separate out ql code from exec jar Key: HIVE-5725 URL: https://issues.apache.org/jira/browse/HIVE-5725 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Owen O'Malley Assignee: Owen O'Malley We should publish our code independently from our dependencies. Since the exec jar has to include the runtime dependencies, I'd propose that we make two jars, a ql jar and an exec jar. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811441#comment-13811441 ] Brock Noland commented on HIVE-4388: This will need to be rebased due to mavenization... also I noticed with 0.96 we'll need to modify a test: {noformat} TestHBaseNegativeCliDriver.cascade_dbdrop Change dfs -ls ../build/ql/tmp/hbase/hbase_table_0; to dfs -ls ../build/ql/tmp/hbase/data/default/hbase_table_0; {noformat} HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5726) The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant
[ https://issues.apache.org/jira/browse/HIVE-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5726: -- Description: Currently Hive uses a default decimal type info instance to associate with a decimal constant in the expression tree. To precisely determine the precision/scale of the expression result requires more accurate precision/scale of the type of the decimal constant. Thus, Hive uses a precision/scale of the constant for the type info instance. As an example, the following is not desirable: {code} hive create table mytable as select 3.14BD as t from person_age limit 1; hive desc mytable; OK t decimal(65,30) None Time taken: 0.08 seconds, Fetched: 1 row(s) {code} instead, the precision/scale for t above should be (3, 2). was: Currently Hive uses a default decimal type info instance to associate with a decimal constant in the expression tree. To precisely determine the precision/scale of the expression result requires more accurate precision/scale of the type of the decimal constant. Thus, Hive uses a precision/scale of the constant for the type info instance. As an example, the following is not desirable: {code} hive create table mytable as select 3.14BD as t from person_age limit 1; hive desc mytable; OK t decimal(65,30) None Time taken: 0.08 seconds, Fetched: 1 row(s) {code} instead, the precision/scale for t above should be (3.2). The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant --- Key: HIVE-5726 URL: https://issues.apache.org/jira/browse/HIVE-5726 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Currently Hive uses a default decimal type info instance to associate with a decimal constant in the expression tree. To precisely determine the precision/scale of the expression result requires more accurate precision/scale of the type of the decimal constant. 
Thus, Hive uses a precision/scale of the constant for the type info instance. As an example, the following is not desirable: {code} hive create table mytable as select 3.14BD as t from person_age limit 1; hive desc mytable; OK t decimal(65,30) None Time taken: 0.08 seconds, Fetched: 1 row(s) {code} instead, the precision/scale for t above should be (3, 2). -- This message was sent by Atlassian JIRA (v6.1#6144)
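The precision/scale the issue wants for a literal like 3.14BD can be derived directly from the literal's text; a sketch (illustrative Python, not Hive's DecimalTypeInfo code):

```python
def literal_precision_scale(text):
    """Derive (precision, scale) from a decimal literal string.

    Precision = number of significant digits, scale = digits after the
    decimal point. For "3.14" this yields (3, 2) rather than the default
    decimal(65,30) shown in the issue description.
    """
    digits = text.lstrip("+-").replace(".", "")
    scale = len(text.split(".")[1]) if "." in text else 0
    precision = len(digits.lstrip("0") or "0")
    return precision, scale
```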
[jira] [Created] (HIVE-5726) The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant
Xuefu Zhang created HIVE-5726: - Summary: The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant Key: HIVE-5726 URL: https://issues.apache.org/jira/browse/HIVE-5726 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Currently Hive uses a default decimal type info instance to associate with a decimal constant in the expression tree. To precisely determine the precision/scale of the expression result requires more accurate precision/scale of the type of the decimal constant. Thus, Hive uses a precision/scale of the constant for the type info instance. As an example, the following is not desirable: {code} hive create table mytable as select 3.14BD as t from person_age limit 1; hive desc mytable; OK t decimal(65,30) None Time taken: 0.08 seconds, Fetched: 1 row(s) {code} instead, the precision/scale for t above should be (3, 2). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5635) WebHCatJTShim23 ignores security/user context
[ https://issues.apache.org/jira/browse/HIVE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5635: - Status: Patch Available (was: Open) WebHCatJTShim23 ignores security/user context - Key: HIVE-5635 URL: https://issues.apache.org/jira/browse/HIVE-5635 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5635.patch WebHCatJTShim23 takes a UserGroupInformation object as an argument (which represents the user making the call to WebHCat, or the doAs user) but ignores it, while WebHCatJTShim20S uses the UserGroupInformation. This is inconsistent and may be a security hole, because with Hadoop 2 the methods on WebHCatJTShim are likely running with 'hcat' as the user context. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811472#comment-13811472 ] Szehon Ho commented on HIVE-5611: - Sure, I'll see if I can dumb down the src assembly descriptor to copy over the source tree; I guess those are non-issues then. Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811478#comment-13811478 ] Brock Noland commented on HIVE-5611: [~thejas] The ant binary tarball had an hcatalog directory and all hcatalog stuff was located there. I believe part of the reason was that with the ant build the hive and hcatalog builds were somewhat separate. The two builds are very much integrated at this point, therefore I'd like to eliminate this hcatalog directory and include the stuff that was under bin/ under the hive bin/ etc. Do you see an issue with this or are you opposed to this? Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5584) Write initial user documentation for vectorized query on Hive Wiki
[ https://issues.apache.org/jira/browse/HIVE-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811495#comment-13811495 ] Eric Hanson commented on HIVE-5584: --- Completed: https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution Please comment on the document or this JIRA with any feedback. Write initial user documentation for vectorized query on Hive Wiki -- Key: HIVE-5584 URL: https://issues.apache.org/jira/browse/HIVE-5584 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Write a guide about how to use vectorization in Hive, including a description of its benefits and where it is applicable. See Lefty's comment on HIVE-4160 about where to put it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5584) Write initial user documentation for vectorized query on Hive Wiki
[ https://issues.apache.org/jira/browse/HIVE-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson resolved HIVE-5584. --- Resolution: Fixed Write initial user documentation for vectorized query on Hive Wiki -- Key: HIVE-5584 URL: https://issues.apache.org/jira/browse/HIVE-5584 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Write a guide about how to use vectorization in Hive, including a description of its benefits and where it is applicable. See Lefty's comment on HIVE-4160 about where to put it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4160) Vectorized Query Execution in Hive
[ https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811498#comment-13811498 ] Eric Hanson commented on HIVE-4160: --- I put initial documentation at: https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution Vectorized Query Execution in Hive -- Key: HIVE-4160 URL: https://issues.apache.org/jira/browse/HIVE-4160 Project: Hive Issue Type: New Feature Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: Hive-Vectorized-Query-Execution-Design-rev10.docx, Hive-Vectorized-Query-Execution-Design-rev10.docx, Hive-Vectorized-Query-Execution-Design-rev10.pdf, Hive-Vectorized-Query-Execution-Design-rev11.docx, Hive-Vectorized-Query-Execution-Design-rev11.pdf, Hive-Vectorized-Query-Execution-Design-rev2.docx, Hive-Vectorized-Query-Execution-Design-rev3.docx, Hive-Vectorized-Query-Execution-Design-rev3.docx, Hive-Vectorized-Query-Execution-Design-rev3.pdf, Hive-Vectorized-Query-Execution-Design-rev4.docx, Hive-Vectorized-Query-Execution-Design-rev4.pdf, Hive-Vectorized-Query-Execution-Design-rev5.docx, Hive-Vectorized-Query-Execution-Design-rev5.pdf, Hive-Vectorized-Query-Execution-Design-rev6.docx, Hive-Vectorized-Query-Execution-Design-rev6.pdf, Hive-Vectorized-Query-Execution-Design-rev7.docx, Hive-Vectorized-Query-Execution-Design-rev8.docx, Hive-Vectorized-Query-Execution-Design-rev8.pdf, Hive-Vectorized-Query-Execution-Design-rev9.docx, Hive-Vectorized-Query-Execution-Design-rev9.pdf, Hive-Vectorized-Query-Execution-Design.docx The Hive query execution engine currently processes one row at a time. A single row of data goes through all the operators before the next row can be processed. This mode of processing is very inefficient in terms of CPU usage. Research has demonstrated that this yields very low instructions per cycle [MonetDB X100]. 
Also currently Hive heavily relies on lazy deserialization and data columns go through a layer of object inspectors that identify column type, deserialize data and determine appropriate expression routines in the inner loop. These layers of virtual method calls further slow down the processing. This work will add support for vectorized query execution to Hive, where, instead of individual rows, batches of about a thousand rows at a time are processed. Each column in the batch is represented as a vector of a primitive data type. The inner loop of execution scans these vectors very fast, avoiding method calls, deserialization, unnecessary if-then-else, etc. This substantially reduces CPU time used, and gives excellent instructions per cycle (i.e. improved processor pipeline utilization). See the attached design specification for more details. -- This message was sent by Atlassian JIRA (v6.1#6144)
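The contrast between row-at-a-time and batched execution can be sketched as follows (illustrative Python over a flat primitive buffer; Hive's real VectorizedRowBatch code is Java):

```python
from array import array

BATCH_SIZE = 1024  # Hive processes batches of roughly a thousand rows

def add_constant_vectorized(col, constant):
    """Tight inner loop over one column vector: no per-row object
    inspection, no virtual dispatch, just arithmetic on a flat buffer
    of a primitive type."""
    out = array("l", [0] * len(col))
    for i in range(len(col)):
        out[i] = col[i] + constant
    return out

# One batch: a column of BATCH_SIZE long values.
batch = array("l", range(BATCH_SIZE))
result = add_constant_vectorized(batch, 10)
```

The per-row design instead re-resolves types and expression routines inside the loop, which is exactly the virtual-call overhead the design document targets.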
[jira] [Commented] (HIVE-5705) TopN might use better heuristic for disable
[ https://issues.apache.org/jira/browse/HIVE-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811501#comment-13811501 ] Sergey Shelukhin commented on HIVE-5705: [~hagleitn] fyi this is the jira we were talking about yesterday TopN might use better heuristic for disable --- Key: HIVE-5705 URL: https://issues.apache.org/jira/browse/HIVE-5705 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Priority: Minor Right now, if TopN overruns the memory threshold, it disables itself if it couldn't directly exclude rows as they are sent; it doesn't count evictions of rows that were initially put in the heap and then superseded for this purpose. That is reasonable in most cases, but if N is relatively small and the map output is large, the cost could still be worth it even if rows don't get excluded immediately and are only evicted after being stored for some time. We'd pay for some memory copies but emit far fewer rows. -- This message was sent by Atlassian JIRA (v6.1#6144)
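The distinction the issue draws — rows rejected immediately versus rows that entered the heap and were later superseded — can be sketched with a bounded min-heap (illustrative Python; Hive's operator works on serialized keys, and the counters are hypothetical names):

```python
import heapq

def top_n_with_stats(rows, n):
    """Keep the n largest values, counting the two kinds of savings:
    `excluded` = rows rejected without ever being stored;
    `evicted`  = rows that were stored and later pushed out
                 (the events the current heuristic ignores)."""
    heap, excluded, evicted = [], 0, 0
    for r in rows:
        if len(heap) < n:
            heapq.heappush(heap, r)
        elif r <= heap[0]:
            excluded += 1          # never stored: counted today
        else:
            heapq.heappushpop(heap, r)
            evicted += 1           # stored, then superseded: not counted
    return sorted(heap, reverse=True), excluded, evicted
```

A better heuristic would weigh `evicted` as well, since each eviction also means one less row emitted downstream.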
[jira] [Updated] (HIVE-5191) Add char data type
[ https://issues.apache.org/jira/browse/HIVE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5191: - Attachment: HIVE-5191.4.patch patch v4: Xuefu feedback, update qfiles due to mavenization Add char data type -- Key: HIVE-5191 URL: https://issues.apache.org/jira/browse/HIVE-5191 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5191.1.patch, HIVE-5191.2.patch, HIVE-5191.3.patch, HIVE-5191.4.patch Separate task for char type, since HIVE-4844 only adds varchar -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5727) Username for scratch directory ignores kerberos authentication
Gwen Shapira created HIVE-5727: -- Summary: Username for scratch directory ignores kerberos authentication Key: HIVE-5727 URL: https://issues.apache.org/jira/browse/HIVE-5727 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Gwen Shapira I'm logged in to Linux as user gwen.shapira. I use kinit app_etl to authenticate in Hadoop as app_etl user. The scratch directory hive will attempt to use is still /tmp/hive-gwen.shapira Note that in a properly configured system, app_etl user will not have write permissions in /tmp/hive-gwen.shapira and the query will fail. Correct behavior: When Kerberos authentication is used, the user name should be taken from the kerberos ticket. -- This message was sent by Atlassian JIRA (v6.1#6144)
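The fix being requested amounts to preferring the authenticated principal over the OS login when naming the scratch directory. A minimal sketch of that selection logic (scratchUserFor is a hypothetical helper, not Hive's actual code, which would go through Hadoop's UserGroupInformation):

```java
// Sketch: derive the scratch-directory user from the Kerberos principal when
// one is present, falling back to the OS login otherwise.
public class ScratchDirSketch {
    static String scratchUserFor(String osUser, String kerberosPrincipal) {
        if (kerberosPrincipal != null && !kerberosPrincipal.isEmpty()) {
            // strip any realm/host part: app_etl@EXAMPLE.COM -> app_etl
            return kerberosPrincipal.split("[/@]")[0];
        }
        return osUser; // unauthenticated: OS login is all we have
    }

    public static void main(String[] args) {
        System.out.println("/tmp/hive-" + scratchUserFor("gwen.shapira", "app_etl@EXAMPLE.COM"));
        System.out.println("/tmp/hive-" + scratchUserFor("gwen.shapira", null));
    }
}
```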
Review Request 15184: Store state of stats
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15184/ --- Review request for hive. Bugs: HIVE-3777 https://issues.apache.org/jira/browse/HIVE-3777 Repository: hive Description --- Store state of stats. Diffs - trunk/common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 1537954 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 1537954 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1537954 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1537954 trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 1537954 Diff: https://reviews.apache.org/r/15184/diff/ Testing --- Existing tests suffice. Thanks, Ashutosh Chauhan
[jira] [Updated] (HIVE-3777) add a property in the partition to figure out if stats are accurate
[ https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3777: --- Attachment: HIVE-3777.2.patch Review request up at: https://reviews.apache.org/r/15184/ .q.out needs to be updated, so some failures in Hive QA is expected. add a property in the partition to figure out if stats are accurate --- Key: HIVE-3777 URL: https://issues.apache.org/jira/browse/HIVE-3777 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Ashutosh Chauhan Attachments: HIVE-3777.2.patch, HIVE-3777.patch Currently, stats task tries to update the statistics in the table/partition being updated after the table/partition is loaded. In case of a failure to update these stats (due to the any reason), the operation either succeeds (writing inaccurate stats) or fails depending on whether hive.stats.reliable is set to true. This can be bad for applications who do not always care about reliable stats, since the query may have taken a long time to execute and then fail eventually. Another property should be added to the partition: areStatsAccurate. If hive.stats.reliable is set to false, and stats could not be computed correctly, the operation would still succeed, update the stats, but set areStatsAccurate to false. If the application cares about accurate stats, it can be obtained in the background. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3777) add a property in the partition to figure out if stats are accurate
[ https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3777: --- Affects Version/s: 0.13.0 Status: Patch Available (was: In Progress) add a property in the partition to figure out if stats are accurate --- Key: HIVE-3777 URL: https://issues.apache.org/jira/browse/HIVE-3777 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.13.0 Reporter: Namit Jain Assignee: Ashutosh Chauhan Attachments: HIVE-3777.2.patch, HIVE-3777.patch Currently, stats task tries to update the statistics in the table/partition being updated after the table/partition is loaded. In case of a failure to update these stats (due to the any reason), the operation either succeeds (writing inaccurate stats) or fails depending on whether hive.stats.reliable is set to true. This can be bad for applications who do not always care about reliable stats, since the query may have taken a long time to execute and then fail eventually. Another property should be added to the partition: areStatsAccurate. If hive.stats.reliable is set to false, and stats could not be computed correctly, the operation would still succeed, update the stats, but set areStatsAccurate to false. If the application cares about accurate stats, it can be obtained in the background. -- This message was sent by Atlassian JIRA (v6.1#6144)
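The behavior HIVE-3777 proposes can be sketched as a small decision function over the partition's parameter map (illustrative only; the key name follows the JIRA description, the rest is an assumption about how such a flag would be set):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: on a stats-update failure, either fail the operation (reliable mode)
// or succeed and flag the partition's stats as inaccurate.
public class StatsFlagSketch {
    static boolean finishLoad(Map<String, String> partParams,
                              boolean statsUpdateSucceeded, boolean statsReliable) {
        if (statsUpdateSucceeded) {
            partParams.put("areStatsAccurate", "true");
            return true;
        }
        if (statsReliable) {
            return false;                                // hive.stats.reliable=true: fail
        }
        partParams.put("areStatsAccurate", "false");     // succeed, but mark stats stale
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> p = new HashMap<>();
        boolean ok = finishLoad(p, false, false);        // stats failed, reliability off
        System.out.println(ok + " " + p.get("areStatsAccurate"));
    }
}
```

The query succeeds, and a consumer that cares about accuracy can check the flag and recompute stats in the background.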
[jira] [Updated] (HIVE-5503) TopN optimization in VectorReduceSink
[ https://issues.apache.org/jira/browse/HIVE-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5503: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Sergey! TopN optimization in VectorReduceSink - Key: HIVE-5503 URL: https://issues.apache.org/jira/browse/HIVE-5503 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Sergey Shelukhin Fix For: 0.13.0 Attachments: HIVE-5503.02.patch, HIVE-5503.03.patch, HIVE-5503.patch We need to add TopN optimization to VectorReduceSink as well, it would be great if ReduceSink and VectorReduceSink share this code. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 15184: Store state of stats
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15184/ --- (Updated Nov. 1, 2013, 6:23 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-3777 https://issues.apache.org/jira/browse/HIVE-3777 Repository: hive Description --- Store state of stats. Diffs - trunk/common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 1537954 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 1537954 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1537954 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1537954 trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 1537954 Diff: https://reviews.apache.org/r/15184/diff/ Testing --- Existing tests suffice. Thanks, Ashutosh Chauhan
[jira] [Created] (HIVE-5728) Make ORC InputFormat/OutputFormat available outside Hive
Daniel Dai created HIVE-5728: Summary: Make ORC InputFormat/OutputFormat available outside Hive Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues to solve: 1. Several classes are not public, e.g. OrcStruct 2. There is no InputFormat/OutputFormat for the new api (some tools such as Pig need the new api) 3. There is no way to push WriteOptions to the OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat available outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Attachment: HIVE-5728-1.patch Attach HIVE-5728-1.patch. Summary of changes: 1. Create OrcNewInputFormat/OrcNewOutputFormat in the new api 2. Extract common pieces of OrcNewInputFormat/OrcInputFormat into OrcInputFormatUtils 3. Make several classes/methods public 4. Make WriteOptions configurable through Configuration 5. Add unit tests for newly added InputFormat/OutputFormat Make ORC InputFormat/OutputFormat available outside Hive Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5728-1.patch ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues to solve: 1. Several classes are not public, e.g. OrcStruct 2. There is no InputFormat/OutputFormat for the new api (some tools such as Pig need the new api) 3. There is no way to push WriteOptions to the OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Summary: Make ORC InputFormat/OutputFormat usable outside Hive (was: Make ORC InputFormat/OutputFormat available outside Hive) Make ORC InputFormat/OutputFormat usable outside Hive - Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5728-1.patch ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues need to solve: 1. Several class is not public, eg: OrcStruct 2. There is no InputFormat/OutputFormat for new api (Some tools such as Pig need new api) 3. Has no way to push WriteOption to OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811551#comment-13811551 ] Brock Noland commented on HIVE-5728: {noformat} )!=null) { {noformat} Nice work!! There should be spaces on both sides of the != sign. Make ORC InputFormat/OutputFormat usable outside Hive - Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5728-1.patch ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues need to solve: 1. Several class is not public, eg: OrcStruct 2. There is no InputFormat/OutputFormat for new api (Some tools such as Pig need new api) 3. Has no way to push WriteOption to OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5729) Beeline displays version as ???? after mavenization
Szehon Ho created HIVE-5729: --- Summary: Beeline displays version as ???? after mavenization Key: HIVE-5729 URL: https://issues.apache.org/jira/browse/HIVE-5729 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.0 Reporter: Szehon Ho In Beeline.java, the method getApplicationTitle() looks to the Beeline class's package to find version information. However, MANIFESTs are not included in the Beeline jar after mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5730) Beeline throws non-terminal NPE upon starting, after mavenization
Szehon Ho created HIVE-5730: --- Summary: Beeline throws non-terminal NPE upon starting, after mavenization Key: HIVE-5730 URL: https://issues.apache.org/jira/browse/HIVE-5730 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.0 Reporter: Szehon Ho In the beeline project's SQLCompletor.java, the local class is used in an attempt to load sql-keywords.properties. Java's resource-loading behavior prepends this file name with the class's package name. This results in an NPE during Beeline initialization. Before mavenization, sql-keywords.properties lived in the same package as SQLCompletor.java, but now it has moved to src/main/resources. -- This message was sent by Atlassian JIRA (v6.1#6144)
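The behavior behind this NPE is how Class#getResourceAsStream resolves names: a relative name is prefixed with the class's package path, while a name starting with '/' is looked up from the classpath root. The sketch below mirrors the JDK's name-resolution logic for illustration (the package name shown for SQLCompletor is an assumption):

```java
// Sketch: how Class#getResourceAsStream turns a resource name into a
// classpath lookup path. resolveName imitates the JDK's private logic.
public class ResourceNameSketch {
    static String resolveName(String className, String resource) {
        if (resource.startsWith("/")) {
            return resource.substring(1);            // absolute: classpath root
        }
        int dot = className.lastIndexOf('.');
        String pkgPath = dot < 0 ? "" : className.substring(0, dot).replace('.', '/') + "/";
        return pkgPath + resource;                   // relative: package-prefixed
    }

    public static void main(String[] args) {
        // After mavenization the file sits at the classpath root, so the
        // package-relative lookup misses it and the stream comes back null.
        System.out.println(resolveName("org.apache.hive.beeline.SQLCompletor", "sql-keywords.properties"));
        System.out.println(resolveName("org.apache.hive.beeline.SQLCompletor", "/sql-keywords.properties"));
    }
}
```

Switching the lookup to a leading-slash (root-relative) name, or moving the file back under the package directory in src/main/resources, would both resolve it.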
[jira] [Updated] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-5611: Attachment: HIVE-5611.1.patch Fixed up the mvn src assembly descriptor to copy the source directory structure. Did not end up using moduleSet because the directory structure those put in is different than the source tree (maven gives option to either prepends the project's module-name, or prepends nothing) Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.1.patch, HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5557) Push down qualifying Where clause predicates as join conditions
[ https://issues.apache.org/jira/browse/HIVE-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5557: Status: Patch Available (was: Open) Push down qualifying Where clause predicates as join conditions --- Key: HIVE-5557 URL: https://issues.apache.org/jira/browse/HIVE-5557 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5557.1.patch, HIVE-5557.2.patch See details in HIVE- -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5557) Push down qualifying Where clause predicates as join conditions
[ https://issues.apache.org/jira/browse/HIVE-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5557: Attachment: HIVE-5557.2.patch Push down qualifying Where clause predicates as join conditions --- Key: HIVE-5557 URL: https://issues.apache.org/jira/browse/HIVE-5557 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5557.1.patch, HIVE-5557.2.patch See details in HIVE- -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811595#comment-13811595 ] Brock Noland commented on HIVE-5611: Can you put up a RB item for this patch? https://reviews.apache.org Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.1.patch, HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811604#comment-13811604 ] Prasanth J commented on HIVE-5632: -- [~ehans] Sorry about my comment about in-memory skips above. For every 10,000 rows, or a configured number of rows (orc.row.index.stride), ORC creates disk ranges (byte ranges) that are required to be read. Only the disk ranges that satisfy the min/max conditions will be read. Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, HIVE-5632.3.patch.txt, orc_split_elim.orc HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (and thereby splits) that don't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811605#comment-13811605 ] Prasanth J commented on HIVE-5632: -- Checked with Owen, my comment about in-memory skips was wrong. Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, HIVE-5632.3.patch.txt, orc_split_elim.orc HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (thereby splits) that doesn't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
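The min/max elimination discussed in this thread boils down to a range check: a stripe (or row-index stride) must be read only if the predicate value can fall inside its [min, max] statistics. A simplified sketch, with illustrative structures rather than ORC's actual stripe-statistics classes:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: given per-stripe min/max stats, keep only stripes that an
// equality predicate could possibly match.
public class StripeElimSketch {
    static class Range {
        final long min, max;
        Range(long min, long max) { this.min = min; this.max = max; }
    }

    static List<Integer> stripesToRead(List<Range> stats, long equalsValue) {
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < stats.size(); i++) {
            Range r = stats.get(i);
            if (equalsValue >= r.min && equalsValue <= r.max) {
                keep.add(i); // cannot be ruled out; this stripe must be read
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        List<Range> stats = List.of(new Range(0, 99), new Range(100, 199), new Range(200, 299));
        System.out.println(stripesToRead(stats, 150)); // only the middle stripe survives
    }
}
```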
[jira] [Updated] (HIVE-5726) The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant
[ https://issues.apache.org/jira/browse/HIVE-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5726: -- Attachment: HIVE-5726.patch The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant --- Key: HIVE-5726 URL: https://issues.apache.org/jira/browse/HIVE-5726 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5726.patch Currently Hive uses a default decimal type info instance to associate with a decimal constant in the expression tree. To precisely determine the precision/scale of the expression result requires more accurate precision/scale of the type of the decimal constant. Thus, Hive uses a precision/scale of the constant for the type info instance. As an example, the following is not desirable: {code} hive create table mytable as select 3.14BD as t from person_age limit 1; hive desc mytable; OK t decimal(65,30) None Time taken: 0.08 seconds, Fetched: 1 row(s) {code} instead, the precision/scale for t above should be (3, 2). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5726) The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant
[ https://issues.apache.org/jira/browse/HIVE-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5726: -- Status: Patch Available (was: Open) The DecimalTypeInfo instance associated with a decimal constant is not in line with the precision/scale of the constant --- Key: HIVE-5726 URL: https://issues.apache.org/jira/browse/HIVE-5726 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5726.patch Currently Hive uses a default decimal type info instance to associate with a decimal constant in the expression tree. To precisely determine the precision/scale of the expression result requires more accurate precision/scale of the type of the decimal constant. Thus, Hive uses a precision/scale of the constant for the type info instance. As an example, the following is not desirable: {code} hive create table mytable as select 3.14BD as t from person_age limit 1; hive desc mytable; OK t decimal(65,30) None Time taken: 0.08 seconds, Fetched: 1 row(s) {code} instead, the precision/scale for t above should be (3, 2). -- This message was sent by Atlassian JIRA (v6.1#6144)
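What the JIRA asks for can be seen directly with java.math.BigDecimal, which already tracks per-constant precision and scale: the literal 3.14 carries precision 3 and scale 2, so its type should be decimal(3,2) rather than a blanket default like decimal(65,30).

```java
import java.math.BigDecimal;

// The precision/scale a decimal constant actually carries, per BigDecimal.
public class DecimalTypeSketch {
    public static void main(String[] args) {
        BigDecimal c = new BigDecimal("3.14");
        // precision = total significant digits, scale = digits after the point
        System.out.println("decimal(" + c.precision() + "," + c.scale() + ")");
    }
}
```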
RE: getting testreport with maven build, and getting lib folder to run end-to-end tests
My workaround for looking at the output of one test is to eyeball the TEST*.xml file under ql\target\surefire-reports For people who are just changing code in ql, you can copy ql\target\hive-exec-0.13.0-SNAPSHOT.jar over the top of the old jar in the lib directory in your Hive installation, and you should be able to run end-to-end ad hoc tests from the Hive CLI. Eric -Original Message- From: Brock Noland [mailto:br...@cloudera.com] Sent: Thursday, October 31, 2013 6:21 PM To: dev@hive.apache.org Subject: Re: getting testreport with maven build, and getting lib folder to run end-to-end tests It looks like we could use this plugin http://maven.apache.org/surefire/maven-surefire-report-plugin/ to generate a report. I just created https://issues.apache.org/jira/browse/HIVE-5720 for that. The later question depends on https://issues.apache.org/jira/browse/HIVE-5611. On Thu, Oct 31, 2013 at 8:03 PM, Eric Hanson (SQL SERVER) eric.n.han...@microsoft.com wrote: In the new Hive maven build, after you run a test case, how do you do the equivalent of ant testreport and view the test output html file? Also, I used to copy build\dist\lib to the lib folder of my one-box hive installation and run it to do end-to-end ad hoc testing of the newly compiled hive bits. Now the build\dist\lib folder is not there. How do I do the equivalent? Can somebody put the answers into https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ too? Thanks, Eric -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Updated] (HIVE-5715) HS2 should not start a session for every command
[ https://issues.apache.org/jira/browse/HIVE-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5715: Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks Gunther! HS2 should not start a session for every command -- Key: HIVE-5715 URL: https://issues.apache.org/jira/browse/HIVE-5715 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.13.0 Attachments: HIVE-5715.1.patch, HIVE-5715.2.patch HS2 calls SessionState.start multiple times (acquire, operation.run) - where it really just cares that the session is set in the thread local store. There are some calls in start session method now that preload stuff that's used during the session. Instead of doing that over and over again, I think it'd be nicer for HS2 to start a session once and then just do the thread local magic as needed. -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 15187: HIVE-5611 Add assembly (i.e.) tar creation to pom
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15187/ --- Review request for hive. Repository: hive-git Description --- Add src and bin descriptors to maven packaging project. Src.tar has the entire Hive source tree as is, following Apache src.tar format. Decided not to use moduleSet as maven only gives the option to prepend the module name, which is different from the directory name in this case. Bin.tar still does not include hcatalog stuff. It uses mvn assembly fileset to do what the ant package tasks of hive/build.xml used to do. It also uses maven's dependencySet to pull in dependency jars. But hive/hcatalog had a separate ant build, and there is more effort needed to include that into this mvn bin assembly in the correct directory structure. Diffs - packaging/pom.xml 973b351 packaging/src/main/assembly/bin.xml PRE-CREATION packaging/src/main/assembly/src.xml PRE-CREATION Diff: https://reviews.apache.org/r/15187/diff/ Testing --- Thanks, Szehon Ho
[jira] [Commented] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811633#comment-13811633 ] Szehon Ho commented on HIVE-5611: - Done: https://reviews.apache.org/r/15187/ Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.1.patch, HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-5611: Status: Patch Available (was: Open) Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.1.patch, HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5611) Add assembly (i.e.) tar creation to pom
[ https://issues.apache.org/jira/browse/HIVE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811639#comment-13811639 ] Szehon Ho commented on HIVE-5611: - Also I have created HIVE-5729 and HIVE-5730 , to track the beeline bundling issues found during manually testing the bin. Add assembly (i.e.) tar creation to pom --- Key: HIVE-5611 URL: https://issues.apache.org/jira/browse/HIVE-5611 Project: Hive Issue Type: Sub-task Components: Build Infrastructure Reporter: Brock Noland Assignee: Szehon Ho Labels: Maven Attachments: HIVE-5611.1.patch, HIVE-5611.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 15187: HIVE-5611 Add assembly (i.e.) tar creation to pom
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15187/#review28036 --- Great work!! This is a great start. I think we'll figure out the hcatalog stuff in a follow-on jira. I just have a few issues below mostly related to indenting. packaging/pom.xml https://reviews.apache.org/r/15187/#comment54531 Let's put this in a profile called dist. Here is an example of profiles: https://github.com/apache/hive/blob/trunk/odbc/pom.xml#L56 packaging/pom.xml https://reviews.apache.org/r/15187/#comment54529 Let's un-comment these for now packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54521 The key=value's should be indented packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54522 id, formats, baseDir are all indented incorrectly packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54520 If you extract the binary tar it has a weird second directory. Therefore I think this should just be ./ packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54524 this includes indenting is wrong packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54517 trailing ws packaging/src/main/assembly/src.xml https://reviews.apache.org/r/15187/#comment54527 same indenting issues as above packaging/src/main/assembly/src.xml https://reviews.apache.org/r/15187/#comment54525 trailing ws packaging/src/main/assembly/src.xml https://reviews.apache.org/r/15187/#comment54526 indenting is off here packaging/src/main/assembly/src.xml https://reviews.apache.org/r/15187/#comment54528 We cannot just exclude the temp stores as opposed to including src? - Brock Noland On Nov. 1, 2013, 8:38 p.m., Szehon Ho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15187/ --- (Updated Nov. 1, 2013, 8:38 p.m.) Review request for hive. 
Repository: hive-git Description --- Add src and bin descriptors to maven packaging project. Src.tar has the entire Hive source tree as is, following Apache src.tar format. Decided not to use moduleSet as maven only gives the option to prepend the module name, which is different from the directory name in this case. Bin.tar still does not include hcatalog stuff. It uses mvn assembly fileset to do what the ant package tasks of hive/build.xml used to do. It also uses maven's dependencySet to pull in dependency jars. But hive/hcatalog had a separate ant build, and there is more effort needed to include that into this mvn bin assembly in the correct directory structure. Diffs - packaging/pom.xml 973b351 packaging/src/main/assembly/bin.xml PRE-CREATION packaging/src/main/assembly/src.xml PRE-CREATION Diff: https://reviews.apache.org/r/15187/diff/ Testing --- Thanks, Szehon Ho
To rebuild protobuf...
With the new mavenization, to rebuild the protobuf, you need to: % mvn -Pprotobuf,hadoop-1 package I just figured I'd save anyone else from having to figure it out until we have the wiki updated. -- Owen
Re: To rebuild protobuf...
Thanks Owen. Note that has been on the wiki already for a couple days under How to generate protobuf code? https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ On Fri, Nov 1, 2013 at 3:57 PM, Owen O'Malley omal...@apache.org wrote: With the new mavenization, to rebuild the protobuf, you need to: % mvn -Pprotobuf,hadoop-1 package I just figured I'd save anyone else from having to figure it out until we have the wiki updated. -- Owen -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Updated] (HIVE-5547) webhcat pig job submission should ship hive tar if -usehcatalog is specified
[ https://issues.apache.org/jira/browse/HIVE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5547: Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks for the contribution Eugene! webhcat pig job submission should ship hive tar if -usehcatalog is specified Key: HIVE-5547 URL: https://issues.apache.org/jira/browse/HIVE-5547 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5547.2.patch, HIVE-5547.3.patch, HIVE-5547.patch Currently, when a Pig job is submitted through WebHCat and the Pig script uses HCatalog, Hive must be installed on the cluster node that ends up executing the job. For large clusters this is a manageability issue, so we should use DistributedCache to ship the Hive tar file to the target node as part of job submission. TestPig_11 in hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf has the test case for this -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4523) round() function with specified decimal places not consistent with mysql
[ https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811717#comment-13811717 ] Hive QA commented on HIVE-4523: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12611626/HIVE-4523.8.patch {color:green}SUCCESS:{color} +1 4548 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/115/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/115/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12611626 round() function with specified decimal places not consistent with mysql - Key: HIVE-4523 URL: https://issues.apache.org/jira/browse/HIVE-4523 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.7.1 Reporter: Fred Desing Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.6.patch, HIVE-4523.7.patch, HIVE-4523.8.patch, HIVE-4523.patch // hive hive select round(150.000, 2) from temp limit 1; 150.0 hive select round(150, 2) from temp limit 1; 150.0 // mysql mysql select round(150.000, 2) from DUAL limit 1; round(150.000, 2) 150.00 mysql select round(150, 2) from DUAL limit 1; round(150, 2) 150 http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round -- This message was sent by Atlassian JIRA (v6.1#6144)
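The MySQL behavior this JIRA targets is reproducible with BigDecimal: rounding a decimal value to 2 places keeps the trailing zeros (150.00), while a double-based round, which is what Hive's old UDFRound effectively produced, collapses them to 150.0.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Decimal-typed vs. double-typed rounding of round(150.000, 2).
public class RoundSketch {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("150.000").setScale(2, RoundingMode.HALF_UP);
        System.out.println(d);                                   // scale preserved
        System.out.println(Math.round(150.000 * 100) / 100.0);   // double loses the scale
    }
}
```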
[jira] [Commented] (HIVE-5635) WebHCatJTShim23 ignores security/user context
[ https://issues.apache.org/jira/browse/HIVE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811731#comment-13811731 ] Hive QA commented on HIVE-5635: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12610389/HIVE-5635.patch Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/116/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/116/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-116/source-prep.txt + [[ true == \t\r\u\e ]] + rm -rf ivy maven + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'ql/src/test/results/clientpositive/decimal_udf.q.out' Reverted 'ql/src/test/results/clientpositive/udf_round.q.out' Reverted 'ql/src/test/results/compiler/plan/udf4.q.xml' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFRound.java' Reverted 'ql/src/test/queries/clientpositive/udf_round.q' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/RoundWithNumDigitsDoubleToDouble.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRound.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target hcatalog/server-extensions/target hcatalog/core/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFRound.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/RoundUtils.java + svn update Uhcatalog/webhcat/svr/src/main/config/webhcat-default.xml U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/HiveDelegator.java U 
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/LauncherDelegator.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/LaunchMapper.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonControllerJob.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/TempletonDelegator.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/AppConfig.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Server.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JarDelegator.java U hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/PigDelegator.java Fetching external item into 'hcatalog/src/test/e2e/harness' Updated external to revision 1538078. Updated to revision 1538078. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh
Re: Review Request 15187: HIVE-5611 Add assembly (i.e.) tar creation to pom
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15187/#review28044 --- packaging/pom.xml https://reviews.apache.org/r/15187/#comment54552 Done, I added the profile to control this execution, inside this packaging pom. Let me know if that's not what you had in mind. I suppose we need to change the wiki for this. Previously, with ant, running ant package would produce a build/ folder from which users could run the executables. In mvn, the user needs to enable -Pdist to get a dist under packaging/target to use. packaging/pom.xml https://reviews.apache.org/r/15187/#comment54567 I removed these. If I uncomment them, these projects' dependencies get pulled into the bin. They cause a lot of classpath issues, for example the same jar at different versions, and pull in a xerces XML parser that is incompatible with hive startup. packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54551 Done packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54565 Done packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54566 Done packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54553 Done, used includeBaseDirectory = false on both src and bin descriptors, to get rid of the second directory. packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54554 Done. packaging/src/main/assembly/bin.xml https://reviews.apache.org/r/15187/#comment54557 Done. packaging/src/main/assembly/src.xml https://reviews.apache.org/r/15187/#comment54559 Done packaging/src/main/assembly/src.xml https://reviews.apache.org/r/15187/#comment54562 Done packaging/src/main/assembly/src.xml https://reviews.apache.org/r/15187/#comment54563 Done packaging/src/main/assembly/src.xml https://reviews.apache.org/r/15187/#comment54564 Changed. - Szehon Ho On Nov. 1, 2013, 8:38 p.m., Szehon Ho wrote: --- This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/15187/ --- (Updated Nov. 1, 2013, 8:38 p.m.) Review request for hive. Repository: hive-git Description --- Add src and bin descriptors to maven packaging project. Src.tar has the entire Hive source tree as is, following the Apache src.tar format. Decided not to use moduleSet, as maven only gives the option to prepend the module name, which is different from the directory name in this case. Bin.tar still does not include hcatalog stuff. It uses an mvn assembly fileset to do what the ant package tasks of hive/build.xml used to do. It also uses maven's dependencySet to pull in dependency jars. But hive/hcatalog had a separate ant build, and there is more effort needed to include that into this mvn bin assembly in the correct directory structure. Diffs - packaging/pom.xml 973b351 packaging/src/main/assembly/bin.xml PRE-CREATION packaging/src/main/assembly/src.xml PRE-CREATION Diff: https://reviews.apache.org/r/15187/diff/ Testing --- Thanks, Szehon Ho
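For reference, gating the assembly behind a -Pdist profile as discussed in the review might look roughly like the sketch below inside packaging/pom.xml. The execution id is an assumption and the descriptor paths are taken from the diff listing; this is not the actual patch contents:

```xml
<!-- Illustrative sketch: binding the assembly plugin inside a "dist"
     profile so that `mvn package -Pdist` produces the src/bin tarballs,
     while a plain `mvn package` skips assembly entirely. -->
<profiles>
  <profile>
    <id>dist</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-assembly-plugin</artifactId>
          <executions>
            <execution>
              <id>assemble</id>
              <phase>package</phase>
              <goals>
                <goal>single</goal>
              </goals>
              <configuration>
                <descriptors>
                  <descriptor>src/main/assembly/bin.xml</descriptor>
                  <descriptor>src/main/assembly/src.xml</descriptor>
                </descriptors>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
```

With this shape, the includeBaseDirectory = false setting mentioned in the review would live inside each descriptor file (bin.xml, src.xml), not in the pom itself.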