[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage
[ https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817106#comment-13817106 ] Hive QA commented on HIVE-5700: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612776/HIVE-5700.01.patch {color:green}SUCCESS:{color} +1 4594 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/205/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/205/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12612776 enforce single date format for partition column storage --- Key: HIVE-5700 URL: https://issues.apache.org/jira/browse/HIVE-5700 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5700.01.patch, HIVE-5700.patch inspired by HIVE-5286. Partition column for dates should be stored either as an integer or as a fixed representation, e.g. yyyy-mm-dd. External representation can remain varied as is. -- This message was sent by Atlassian JIRA (v6.1#6144)
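To make the proposal concrete, here is a minimal sketch (hypothetical class and method names, not Hive's actual code) of accepting varied external date representations while always writing one canonical yyyy-MM-dd form into the partition value:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class PartitionDateNormalizer {
    // Canonical storage form proposed in the issue: yyyy-MM-dd (ISO local date).
    private static final DateTimeFormatter CANONICAL = DateTimeFormatter.ISO_LOCAL_DATE;

    // A couple of illustrative external representations; the external side can stay varied.
    private static final DateTimeFormatter[] EXTERNAL = {
        DateTimeFormatter.ISO_LOCAL_DATE,           // 2013-11-08
        DateTimeFormatter.ofPattern("MM/dd/yyyy"),  // 11/08/2013
    };

    // Parse a user-supplied date literal with any accepted external format,
    // then re-emit it in the single canonical storage form.
    public static String toStorageForm(String external) {
        for (DateTimeFormatter f : EXTERNAL) {
            try {
                return LocalDate.parse(external, f).format(CANONICAL);
            } catch (DateTimeParseException ignore) {
                // not this format; try the next one
            }
        }
        throw new IllegalArgumentException("unparseable date: " + external);
    }
}
```

Storing one normalized form keeps partition pruning and comparisons simple regardless of how the user wrote the literal.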
[jira] [Commented] (HIVE-5581) Implement vectorized year/month/day... etc. for string arguments
[ https://issues.apache.org/jira/browse/HIVE-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817130#comment-13817130 ] Hive QA commented on HIVE-5581: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612780/HIVE-5581.3.patch {color:green}SUCCESS:{color} +1 4603 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/206/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/206/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12612780 Implement vectorized year/month/day... etc. for string arguments Key: HIVE-5581 URL: https://issues.apache.org/jira/browse/HIVE-5581 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-5581.1.patch.txt, HIVE-5581.2.patch, HIVE-5581.3.patch Functions year(), month(), day(), weekofyear(), hour(), minute(), second() need to be implemented for string arguments in vectorized mode. They already work for timestamp arguments. -- This message was sent by Atlassian JIRA (v6.1#6144)
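For reference, the scalar semantics the vectorized string variants need to reproduce can be sketched as follows (hypothetical names; Hive's real vectorized expressions operate on column vectors inside a batch, which this simplifies to plain arrays). A row whose string fails to parse should yield NULL rather than a garbage value:

```java
import java.sql.Timestamp;

// Simplified reference for a vectorized year()-over-string expression:
// parse each row's string as a timestamp; unparseable rows become SQL NULL.
class VectorYearFromString {
    public static void evaluate(String[] col, long[] out, boolean[] isNull) {
        for (int i = 0; i < col.length; i++) {
            try {
                out[i] = Timestamp.valueOf(col[i]).toLocalDateTime().getYear();
                isNull[i] = false;
            } catch (RuntimeException e) {
                // malformed or null input -> NULL, never a bogus year
                isNull[i] = true;
            }
        }
    }
}
```

The same loop shape applies to month(), day(), weekofyear(), hour(), minute(), and second(); only the extracted field changes.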
[jira] [Commented] (HIVE-3777) add a property in the partition to figure out if stats are accurate
[ https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817188#comment-13817188 ] Hive QA commented on HIVE-3777: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612797/HIVE-3777.5.patch {color:green}SUCCESS:{color} +1 4595 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/208/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/208/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12612797 add a property in the partition to figure out if stats are accurate --- Key: HIVE-3777 URL: https://issues.apache.org/jira/browse/HIVE-3777 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.13.0 Reporter: Namit Jain Assignee: Ashutosh Chauhan Attachments: HIVE-3777.2.patch, HIVE-3777.2.patch, HIVE-3777.3.patch, HIVE-3777.4.patch, HIVE-3777.5.patch, HIVE-3777.patch Currently, the stats task tries to update the statistics in the table/partition being updated after the table/partition is loaded. In case of a failure to update these stats (for any reason), the operation either succeeds (writing inaccurate stats) or fails depending on whether hive.stats.reliable is set to true. This can be bad for applications that do not always care about reliable stats, since the query may have taken a long time to execute and then fail eventually. Another property should be added to the partition: areStatsAccurate. If hive.stats.reliable is set to false, and stats could not be computed correctly, the operation would still succeed, update the stats, but set areStatsAccurate to false. 
If the application cares about accurate stats, it can be obtained in the background. -- This message was sent by Atlassian JIRA (v6.1#6144)
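A minimal sketch of how a client could consume the proposed flag, assuming it lands as a plain partition parameter (the key name and helper here are hypothetical illustrations, not the committed API):

```java
import java.util.Map;

class PartitionStatsCheck {
    // Hypothetical parameter key; the issue proposes a per-partition
    // flag named areStatsAccurate, defaulting to "not accurate".
    static final String ARE_STATS_ACCURATE = "areStatsAccurate";

    // An application that cares about accuracy consults the flag before
    // trusting row counts, and recomputes stats in the background otherwise.
    public static boolean statsAccurate(Map<String, String> partitionParams) {
        return Boolean.parseBoolean(
            partitionParams.getOrDefault(ARE_STATS_ACCURATE, "false"));
    }
}
```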
[jira] [Updated] (HIVE-5565) Limit Hive decimal type maximum precision and scale to 38
[ https://issues.apache.org/jira/browse/HIVE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5565: -- Fix Version/s: (was: 0.13.0) Status: Open (was: Patch Available) Limit Hive decimal type maximum precision and scale to 38 - Key: HIVE-5565 URL: https://issues.apache.org/jira/browse/HIVE-5565 Project: Hive Issue Type: Task Components: Types Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5565.patch With HIVE-3976, the maximum precision is set to 65, and the maximum scale to 30. After discussing with several folks in the community, it's determined that 38 as a maximum for both precision and scale is probably sufficient, in addition to the potential performance boost that might become possible for some implementations. This task is to make such a change. The change is expected to be trivial, but it may impact many test cases. The reason for a separate JIRA is that the patch in HIVE-3976 is already in good shape. Rather than destabilizing a bigger patch, a dedicated patch will facilitate both reviews. The wiki document will be updated shortly. -- This message was sent by Atlassian JIRA (v6.1#6144)
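The new limit is easy to state as a predicate. Below is a sketch (hypothetical helper, not Hive's actual code) of checking a value against the proposed 38/38 cap; a 38-digit maximum is also convenient because any 38-digit unscaled value fits in a signed 128-bit integer, which is where the potential performance boost comes from:

```java
import java.math.BigDecimal;

class DecimalLimits {
    // Proposed maximums from the issue: both precision and scale capped at 38.
    static final int MAX_PRECISION = 38;
    static final int MAX_SCALE = 38;

    // True when the value can be represented under the proposed limits.
    public static boolean withinLimits(BigDecimal d) {
        return d.precision() <= MAX_PRECISION
            && d.scale() >= 0
            && d.scale() <= MAX_SCALE;
    }
}
```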
[jira] [Commented] (HIVE-3819) Creating a table on Hive without Hadoop daemons running returns a misleading error
[ https://issues.apache.org/jira/browse/HIVE-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817337#comment-13817337 ] Xuefu Zhang commented on HIVE-3819: --- [~mgrover] were you able to reproduce? If not, I think we can close this one. Thanks. Creating a table on Hive without Hadoop daemons running returns a misleading error -- Key: HIVE-3819 URL: https://issues.apache.org/jira/browse/HIVE-3819 Project: Hive Issue Type: Bug Components: CLI, Metastore Reporter: Mark Grover Assignee: Xuefu Zhang I was running Hive without the underlying Hadoop daemons running. Hadoop was configured to run in pseudo-distributed mode. However, when I tried to create a Hive table, I got this rather misleading error: {code} FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask {code} We should look into making this error message less misleading (more about Hadoop daemons not running instead of the metastore client not being instantiable). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-3819) Creating a table on Hive without Hadoop daemons running returns a misleading error
[ https://issues.apache.org/jira/browse/HIVE-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817348#comment-13817348 ] Mark Grover commented on HIVE-3819: --- Sorry, Xuefu, I haven't had the time. If you can't reproduce this, please go ahead and mark this as Can't reproduce. Thanks for checking! Creating a table on Hive without Hadoop daemons running returns a misleading error -- Key: HIVE-3819 URL: https://issues.apache.org/jira/browse/HIVE-3819 Project: Hive Issue Type: Bug Components: CLI, Metastore Reporter: Mark Grover Assignee: Xuefu Zhang I was running Hive without the underlying Hadoop daemons running. Hadoop was configured to run in pseudo-distributed mode. However, when I tried to create a Hive table, I got this rather misleading error: {code} FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask {code} We should look into making this error message less misleading (more about Hadoop daemons not running instead of the metastore client not being instantiable). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3777) add a property in the partition to figure out if stats are accurate
[ https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3777: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Thejas for review! add a property in the partition to figure out if stats are accurate --- Key: HIVE-3777 URL: https://issues.apache.org/jira/browse/HIVE-3777 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.13.0 Reporter: Namit Jain Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-3777.2.patch, HIVE-3777.2.patch, HIVE-3777.3.patch, HIVE-3777.4.patch, HIVE-3777.5.patch, HIVE-3777.patch Currently, the stats task tries to update the statistics in the table/partition being updated after the table/partition is loaded. In case of a failure to update these stats (for any reason), the operation either succeeds (writing inaccurate stats) or fails depending on whether hive.stats.reliable is set to true. This can be bad for applications that do not always care about reliable stats, since the query may have taken a long time to execute and then fail eventually. Another property should be added to the partition: areStatsAccurate. If hive.stats.reliable is set to false, and stats could not be computed correctly, the operation would still succeed, update the stats, but set areStatsAccurate to false. If the application cares about accurate stats, it can be obtained in the background. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5642) Exception in UDFs with large number of arguments.
[ https://issues.apache.org/jira/browse/HIVE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-5642. Resolution: Fixed Fix Version/s: 0.13.0 Fixed via HIVE-5604 Exception in UDFs with large number of arguments. - Key: HIVE-5642 URL: https://issues.apache.org/jira/browse/HIVE-5642 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Such UDFs will mostly be custom UDFs, but if they are not supported in vector mode, we should fall back to non-vector mode. {code} Caused by: java.lang.ArrayIndexOutOfBoundsException: 3 at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor$Builder.setArgumentType(VectorExpressionDescriptor.java:147) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:431) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:545) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:248) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:460) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5779) Subquery in where clause with distinct fails with mapjoin turned on with serialization error.
[ https://issues.apache.org/jira/browse/HIVE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5779: --- Status: Open (was: Patch Available) Subquery in where clause with distinct fails with mapjoin turned on with serialization error. - Key: HIVE-5779 URL: https://issues.apache.org/jira/browse/HIVE-5779 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-5779.2.patch, HIVE-5779.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5779) Subquery in where clause with distinct fails with mapjoin turned on with serialization error.
[ https://issues.apache.org/jira/browse/HIVE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5779: --- Status: Patch Available (was: Open) Subquery in where clause with distinct fails with mapjoin turned on with serialization error. - Key: HIVE-5779 URL: https://issues.apache.org/jira/browse/HIVE-5779 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-5779.2.patch, HIVE-5779.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5779) Subquery in where clause with distinct fails with mapjoin turned on with serialization error.
[ https://issues.apache.org/jira/browse/HIVE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5779: --- Attachment: HIVE-5779.2.patch Re-upload for Hive QA to pick up. Subquery in where clause with distinct fails with mapjoin turned on with serialization error. - Key: HIVE-5779 URL: https://issues.apache.org/jira/browse/HIVE-5779 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-5779.2.patch, HIVE-5779.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5691) Intermediate columns are incorrectly initialized for partitioned tables.
[ https://issues.apache.org/jira/browse/HIVE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5691: --- Status: Open (was: Patch Available) Intermediate columns are incorrectly initialized for partitioned tables. Key: HIVE-5691 URL: https://issues.apache.org/jira/browse/HIVE-5691 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5691.1.patch, HIVE-5691.2.patch, HIVE-5691.3.patch Intermediate columns are incorrectly initialized for partitioned tables. Same tablescan operator can be used for multiple partitions. The vectorizer doesn't initialize for all partition paths. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5691) Intermediate columns are incorrectly initialized for partitioned tables.
[ https://issues.apache.org/jira/browse/HIVE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5691: --- Attachment: HIVE-5691.4.patch Another attempt for Hive QA. Intermediate columns are incorrectly initialized for partitioned tables. Key: HIVE-5691 URL: https://issues.apache.org/jira/browse/HIVE-5691 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5691.1.patch, HIVE-5691.2.patch, HIVE-5691.3.patch, HIVE-5691.4.patch Intermediate columns are incorrectly initialized for partitioned tables. Same tablescan operator can be used for multiple partitions. The vectorizer doesn't initialize for all partition paths. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5691) Intermediate columns are incorrectly initialized for partitioned tables.
[ https://issues.apache.org/jira/browse/HIVE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5691: --- Status: Patch Available (was: Open) Intermediate columns are incorrectly initialized for partitioned tables. Key: HIVE-5691 URL: https://issues.apache.org/jira/browse/HIVE-5691 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5691.1.patch, HIVE-5691.2.patch, HIVE-5691.3.patch, HIVE-5691.4.patch Intermediate columns are incorrectly initialized for partitioned tables. Same tablescan operator can be used for multiple partitions. The vectorizer doesn't initialize for all partition paths. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5657) TopN produces incorrect results with count(distinct)
[ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5657: --- Status: Open (was: Patch Available) TopN produces incorrect results with count(distinct) Key: HIVE-5657 URL: https://issues.apache.org/jira/browse/HIVE-5657 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch, HIVE-5657.1.patch.txt, example.patch Attached patch illustrates the problem. limit_pushdown test has various other cases of aggregations and distincts, incl. count-distinct, that work correctly (that said, src dataset is bad for testing these things because every count, for example, produces one record only), so something must be special about this. I am not very familiar with distinct- code and these nuances; if someone knows a quick fix feel free to take this, otherwise I will probably start looking next week. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5657) TopN produces incorrect results with count(distinct)
[ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817450#comment-13817450 ] Ashutosh Chauhan commented on HIVE-5657: +1 Can you create a follow-up jira for removing unnecessary if(firstRow) from processOp(), seems like work in that if block can be done in initializeOp() ? Also, you need to reupload your patch since seems like Hive QA hasn't picked it up yet. TopN produces incorrect results with count(distinct) Key: HIVE-5657 URL: https://issues.apache.org/jira/browse/HIVE-5657 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch, HIVE-5657.1.patch.txt, example.patch Attached patch illustrates the problem. limit_pushdown test has various other cases of aggregations and distincts, incl. count-distinct, that work correctly (that said, src dataset is bad for testing these things because every count, for example, produces one record only), so something must be special about this. I am not very familiar with distinct- code and these nuances; if someone knows a quick fix feel free to take this, otherwise I will probably start looking next week. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4388: --- Attachment: HIVE-4388.16.patch v16 should fix those failures due to missing deps. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, HIVE-4388.16.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5767) in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT
[ https://issues.apache.org/jira/browse/HIVE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5767: --- Status: Open (was: Patch Available) +1 Can you reupload the patch so that Hive QA gets to run on it? in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT --- Key: HIVE-5767 URL: https://issues.apache.org/jira/browse/HIVE-5767 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Attachments: HIVE-5767.patch I don't think it's intended. INSERT path consists of a big if statement which prevents most of the code from executing for union case. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5686) partition column type validation doesn't quite work for dates
[ https://issues.apache.org/jira/browse/HIVE-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5686: --- Status: Open (was: Patch Available) You need to re-upload the patch for Hive QA to kick in. partition column type validation doesn't quite work for dates - Key: HIVE-5686 URL: https://issues.apache.org/jira/browse/HIVE-5686 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5686.patch Another interesting issue... {noformat} hive> create table z(c string) partitioned by (i date,j date); OK Time taken: 0.099 seconds hive> alter table z add partition (i='2012-01-01', j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type date hive> alter table z add partition (i='2012-01-01', j=date 'foo'); OK Time taken: 0.119 seconds {noformat} The fake date is caught in normal queries: {noformat} hive> select * from z where j == date 'foo'; FAILED: SemanticException Unable to convert date literal string to date value. {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
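The fix presumably needs partition-value validation to actually attempt a date parse instead of accepting any typed literal. A minimal sketch of such a check (hypothetical helper, not the patch's actual code); java.sql.Date.valueOf throws on anything that is not well-formed yyyy-[m]m-[d]d:

```java
// Reject a partition value declared for a date column when it cannot
// actually be parsed as a date, e.g. the 'foo' in the session above.
class DatePartitionValidator {
    public static boolean isValidDateLiteral(String s) {
        try {
            java.sql.Date.valueOf(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```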
[jira] [Commented] (HIVE-5626) enable metastore direct SQL for drop/similar queries
[ https://issues.apache.org/jira/browse/HIVE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817477#comment-13817477 ] Ashutosh Chauhan commented on HIVE-5626: Patch looks good. Thanks for refactoring. How did you test this one, by looking at logs ? Is it possible to add junit tests for this as we have added for direct-sql for other cases? enable metastore direct SQL for drop/similar queries Key: HIVE-5626 URL: https://issues.apache.org/jira/browse/HIVE-5626 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-5626.01.patch, HIVE-5626.02.patch, HIVE-5626.patch Metastore direct SQL is currently disabled for any queries running inside external transaction (i.e. all modification queries, like dropping stuff). This was done to keep the strictly performance-optimization behavior when using Postgres, which unlike other RDBMS-es fails the tx on any syntax error; so, if direct SQL is broken there's no way to fall back. So, it is disabled for these cases. It is not as important because drop commands are rare, but we might want to address that. Either by some config setting or by making it work on non-postgres DBs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5626) enable metastore direct SQL for drop/similar queries
[ https://issues.apache.org/jira/browse/HIVE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5626: --- Status: Open (was: Patch Available) enable metastore direct SQL for drop/similar queries Key: HIVE-5626 URL: https://issues.apache.org/jira/browse/HIVE-5626 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-5626.01.patch, HIVE-5626.02.patch, HIVE-5626.patch Metastore direct SQL is currently disabled for any queries running inside external transaction (i.e. all modification queries, like dropping stuff). This was done to keep the strictly performance-optimization behavior when using Postgres, which unlike other RDBMS-es fails the tx on any syntax error; so, if direct SQL is broken there's no way to fall back. So, it is disabled for these cases. It is not as important because drop commands are rare, but we might want to address that. Either by some config setting or by making it work on non-postgres DBs. -- This message was sent by Atlassian JIRA (v6.1#6144)
precommit test
Hi, It seems that pre-commit tests are not running. I wonder if anyone knows why? Thanks, Xuefu
[jira] [Commented] (HIVE-5581) Implement vectorized year/month/day... etc. for string arguments
[ https://issues.apache.org/jira/browse/HIVE-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817499#comment-13817499 ] Eric Hanson commented on HIVE-5581: --- I think there is still a possibility in a pathological case that you could return a value when you should return NULL. See my comment in the code review. It's almost there. Thanks Teddy. Implement vectorized year/month/day... etc. for string arguments Key: HIVE-5581 URL: https://issues.apache.org/jira/browse/HIVE-5581 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-5581.1.patch.txt, HIVE-5581.2.patch, HIVE-5581.3.patch Functions year(), month(), day(), weekofyear(), hour(), minute(), second() need to be implemented for string arguments in vectorized mode. They already work for timestamp arguments. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 14486: HIVE-5441: Async query execution doesn't return resultset status
On Nov. 8, 2013, 12:22 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 979 https://reviews.apache.org/r/14486/diff/4/?file=380248#file380248line979 OK, I see what you mean. I was looking at just the commented line, and didn't look at the full view-diff page. Yes, the releaseLocks won't get called. That looks like a problem. Thanks to Vaibhav for pointing it out to me. Brock Noland wrote: OK, thanks for responding. I'll open a jira. Brock Noland wrote: https://issues.apache.org/jira/browse/HIVE-5781 I guess the locks are acquired a bit later, just before the execution. We can actually get rid of that releaseLocks() in case of compiler error. - Prasad --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14486/#review28467 --- On Nov. 6, 2013, 11:50 p.m., Prasad Mujumdar wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14486/ --- (Updated Nov. 6, 2013, 11:50 p.m.) Review request for hive. Bugs: HIVE-5441 https://issues.apache.org/jira/browse/HIVE-5441 Repository: hive-git Description --- Separate out the query compilation and execute that part synchronously. Diffs - ql/src/java/org/apache/hadoop/hive/ql/Driver.java c09ffde service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 4ee1b74 service/src/test/org/apache/hive/service/cli/CLIServiceTest.java cd9d99a Diff: https://reviews.apache.org/r/14486/diff/ Testing --- Added test cases Thanks, Prasad Mujumdar
Re: precommit test
AFAIK they are running just fine. Last night that was not the case because of a price spike in EC2 spot instances (going to be improved via https://issues.apache.org/jira/browse/HIVE-5782). Long story short, they are queued right now and we can eliminate the queueing once https://issues.apache.org/jira/browse/HADOOP-9765 gets in. Because of a limitation of the precommit system (fixed by HADOOP-9765) we have a Rube Goldberg contraption. Right now, the jobs are queued here: https://builds.apache.org/job/PreCommit-HIVE-Build/ That job simply posts the patches over to: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/ On Fri, Nov 8, 2013 at 11:30 AM, Xuefu Zhang xzh...@cloudera.com wrote: Hi, It seems that pre-commit tests are not running. I wonder if anyone knows why? Thanks, Xuefu -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-4574) XMLEncoder thread safety issues in openjdk7 causes HiveServer2 to be stuck
[ https://issues.apache.org/jira/browse/HIVE-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817506#comment-13817506 ] Ian Robertson commented on HIVE-4574: - For those tracking this issue, the underlying issue is now in the openjdk bug tracker: https://bugs.openjdk.java.net/browse/JDK-8028054 . It's currently scheduled for JDK8; not clear whether it will also be backported to a release of 7. XMLEncoder thread safety issues in openjdk7 causes HiveServer2 to be stuck -- Key: HIVE-4574 URL: https://issues.apache.org/jira/browse/HIVE-4574 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0, 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4574.1.patch In OpenJDK 7, an XMLEncoder.writeObject call leads to calls to java.beans.MethodFinder.findMethod(). The MethodFinder class is not thread safe because it uses a static WeakHashMap that would get used from multiple threads. See - http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7-b147/com/sun/beans/finder/MethodFinder.java#46 Concurrent access to HashMap implementations that are not thread safe can sometimes result in infinite loops and other problems. If JDK 7 is in use, it makes sense to synchronize calls to XMLEncoder.writeObject. -- This message was sent by Atlassian JIRA (v6.1#6144)
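The synchronization workaround described above can be sketched as follows (hypothetical helper class; HiveServer2's actual change may differ):

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;

class SafeXmlEncoding {
    // One shared monitor: every writeObject call in the process serializes on it,
    // so MethodFinder's non-thread-safe static WeakHashMap is never touched
    // from two threads at once on JDK 7.
    private static final Object ENCODER_LOCK = new Object();

    public static byte[] encode(Object bean) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        synchronized (ENCODER_LOCK) {
            XMLEncoder enc = new XMLEncoder(out);
            enc.writeObject(bean);
            enc.close();  // flushes the XML to the stream
        }
        return out.toByteArray();
    }
}
```

A global lock costs some throughput, but it trades that for immunity to the infinite-loop hang until the JDK fix lands.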
[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage
[ https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817507#comment-13817507 ] Ashutosh Chauhan commented on HIVE-5700: Can you create an RB entry for this? enforce single date format for partition column storage --- Key: HIVE-5700 URL: https://issues.apache.org/jira/browse/HIVE-5700 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5700.01.patch, HIVE-5700.patch inspired by HIVE-5286. Partition column for dates should be stored either as an integer or as a fixed representation, e.g. yyyy-mm-dd. External representation can remain varied as is. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: precommit test
Thanks for the info, Brock. Yesterday I submitted a couple of patches, expecting to get the result this morning. However, they didn't run. I had to manually submit the request this morning. Right now they are waiting in queue. --Xuefu On Fri, Nov 8, 2013 at 9:48 AM, Brock Noland br...@cloudera.com wrote: AFAIK they are running just fine. Last night that was not the case because of a price spike in EC2 spot instances (going to be improved via https://issues.apache.org/jira/browse/HIVE-5782). Long story short, they are queued right now and we can eliminate the queueing once https://issues.apache.org/jira/browse/HADOOP-9765 gets in. Because of a limitation of the precommit system (fixed by HADOOP-9765) we have a rube-goldberg contraption. Right now, the jobs are queued here: https://builds.apache.org/job/PreCommit-HIVE-Build/ That job, simply posts the patches over to: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/ On Fri, Nov 8, 2013 at 11:30 AM, Xuefu Zhang xzh...@cloudera.com wrote: Hi, It seems that pre-commit tests are not running. I wonder if anyone knows why? Thanks, Xuefu -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
Review Request 15359: HIVE-5700 enforce single date format for partition column storage
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15359/ --- Review request for hive and Ashutosh Chauhan. Repository: hive-git Description --- see JIRA Diffs - metastore/scripts/upgrade/mysql/upgrade-0.12.0-to-0.13.0.mysql.sql 04e4a87 metastore/scripts/upgrade/oracle/upgrade-0.12.0-to-0.13.0.oracle.sql 8847d3e metastore/scripts/upgrade/postgres/upgrade-0.12.0-to-0.13.0.postgres.sql 01cbe76 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 46d1fac ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 5305537 Diff: https://reviews.apache.org/r/15359/diff/ Testing --- tests; manual verification of mysql and psql scripts (Oracle tbd) Thanks, Sergey Shelukhin
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817558#comment-13817558 ] Hive QA commented on HIVE-4388: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612848/HIVE-4388.16.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4597 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver_cascade_dbdrop_hadoop20 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/209/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/209/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12612848 HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, HIVE-4388.16.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage
[ https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817559#comment-13817559 ] Sergey Shelukhin commented on HIVE-5700: https://reviews.apache.org/r/15359/ -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5626) enable metastore direct SQL for drop/similar queries
[ https://issues.apache.org/jira/browse/HIVE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817564#comment-13817564 ] Sergey Shelukhin commented on HIVE-5626: all the .q tests are run with the verifying object store, which runs both SQL and JDO and compares the results. Separate test - what do you mean? enable metastore direct SQL for drop/similar queries Key: HIVE-5626 URL: https://issues.apache.org/jira/browse/HIVE-5626 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-5626.01.patch, HIVE-5626.02.patch, HIVE-5626.patch Metastore direct SQL is currently disabled for any queries running inside an external transaction (i.e. all modification queries, like dropping stuff). This was done to keep the strictly performance-optimization behavior when using Postgres, which, unlike other RDBMSes, fails the tx on any syntax error, so if direct SQL is broken there's no way to fall back; hence it is disabled for these cases. It is not as important because drop commands are rare, but we might want to address it, either via some config setting or by making it work on non-Postgres DBs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5783) Native Parquet Support in Hive
Justin Coffey created HIVE-5783: --- Summary: Native Parquet Support in Hive Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Reporter: Justin Coffey Priority: Minor Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5657) TopN produces incorrect results with count(distinct)
[ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817574#comment-13817574 ] Sergey Shelukhin commented on HIVE-5657: btw please don't commit, let me address my own comments on rb TopN produces incorrect results with count(distinct) Key: HIVE-5657 URL: https://issues.apache.org/jira/browse/HIVE-5657 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch, HIVE-5657.1.patch.txt, example.patch Attached patch illustrates the problem. limit_pushdown test has various other cases of aggregations and distincts, incl. count-distinct, that work correctly (that said, src dataset is bad for testing these things because every count, for example, produces one record only), so something must be special about this. I am not very familiar with distinct- code and these nuances; if someone knows a quick fix feel free to take this, otherwise I will probably start looking next week. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5657) TopN produces incorrect results with count(distinct)
[ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817572#comment-13817572 ] Sergey Shelukhin commented on HIVE-5657: seems like too small an item to create a JIRA for... also, are you sure it can indeed be moved? see my TODO comment. As a matter of priorities I'd like to not spend time making sure ;) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5686) partition column type validation doesn't quite work for dates
[ https://issues.apache.org/jira/browse/HIVE-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-5686: --- Status: Patch Available (was: Open) partition column type validation doesn't quite work for dates - Key: HIVE-5686 URL: https://issues.apache.org/jira/browse/HIVE-5686 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5686.01.patch, HIVE-5686.patch Another interesting issue... {noformat} hive> create table z(c string) partitioned by (i date, j date); OK Time taken: 0.099 seconds hive> alter table z add partition (i='2012-01-01', j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type date hive> alter table z add partition (i='2012-01-01', j=date 'foo'); OK Time taken: 0.119 seconds {noformat} The fake date is caught in normal queries: {noformat} hive> select * from z where j == date 'foo'; FAILED: SemanticException Unable to convert date literal string to date value. {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5767) in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT
[ https://issues.apache.org/jira/browse/HIVE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-5767: --- Attachment: HIVE-5767.01.patch same patch in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT --- Key: HIVE-5767 URL: https://issues.apache.org/jira/browse/HIVE-5767 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Attachments: HIVE-5767.01.patch, HIVE-5767.patch I don't think it's intended. INSERT path consists of a big if statement which prevents most of the code from executing for union case. -- This message was sent by Atlassian JIRA (v6.1#6144)
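The bug class described in HIVE-5767 — a switch case falling through into the next handler — can be shown with a minimal, generic Java sketch. This is not Hive's SemanticAnalyzer code; the names mimic its token constants purely for illustration.

```java
// Generic illustration of the fall-through bug class: in a Java switch, a case
// without a terminating break executes the next case's handler too. Here the
// break keeps TOK_UNION handling from spilling into TOK_INSERT handling.
public class FallThroughDemo {
    static final int TOK_UNION = 1, TOK_INSERT = 2;
    static StringBuilder log = new StringBuilder();

    static void doPhase1(int tokenType) {
        switch (tokenType) {
            case TOK_UNION:
                log.append("union;");
                break; // delete this break and TOK_UNION also runs the insert path
            case TOK_INSERT:
                log.append("insert;");
                break;
        }
    }

    public static void main(String[] args) {
        doPhase1(TOK_UNION);
        System.out.println(log); // prints union;
    }
}
```

As the JIRA notes, a fall-through can go unnoticed when the downstream case is guarded by a large if statement that happens to skip most of its body.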
[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage
[ https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817579#comment-13817579 ] Ashutosh Chauhan commented on HIVE-5700: You also need to add a script for Derby. Also, if you can test your Oracle script, that would be good. A negative test case which rejects a date like 2013-1-1 as a partitioning column would also be good to include. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5686) partition column type validation doesn't quite work for dates
[ https://issues.apache.org/jira/browse/HIVE-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-5686: --- Attachment: HIVE-5686.01.patch same patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5767) in SemanticAnalyzer#doPhase1, handling for TOK_UNION falls thru into TOK_INSERT
[ https://issues.apache.org/jira/browse/HIVE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-5767: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.1#6144)
Scheduling the next Hive Contributors Meeting
We're long overdue for a Hive Contributors Meeting. Thejas has offered to host the next meeting at Hortonworks on November 19th from 4-6pm. We will have a Google Hangout or Webex setup for people who wish to attend remotely. If you want to attend but can't because of a scheduling conflict please let us know. If enough people fall into this category we will try to reschedule. Thanks. Carl
[jira] [Commented] (HIVE-5286) Negative test date_literal1.q fails on java7 because the syntax is valid
[ https://issues.apache.org/jira/browse/HIVE-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817591#comment-13817591 ] Xuefu Zhang commented on HIVE-5286: --- +1, patch looks good to me. Negative test date_literal1.q fails on java7 because the syntax is valid Key: HIVE-5286 URL: https://issues.apache.org/jira/browse/HIVE-5286 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Szehon Ho Attachments: HIVE-5286.patch {noformat} [brock@bigboy java-date]$ cat Test.java import java.sql.Date; public class Test { public static void main(String[] args) throws Exception { System.out.println(Date.valueOf("2001-1-1")); } } [brock@bigboy java-date]$ exec-via-java6 java -cp . Test Exception in thread "main" java.lang.IllegalArgumentException at java.sql.Date.valueOf(Date.java:138) at Test.main(Test.java:4) [brock@bigboy java-date]$ exec-via-java7 java -cp . Test 2001-01-01 {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
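One way to make a literal like "2001-1-1" fail identically on java6 and java7 is to validate the shape of the string before handing it to Date.valueOf, rather than relying on valueOf's JDK-dependent strictness. The sketch below is a hypothetical guard, not the committed HIVE-5286 fix; the class name is invented.

```java
import java.sql.Date;

// Hypothetical guard: check the literal matches yyyy-MM-dd explicitly, so
// loose forms like "2001-1-1" are rejected regardless of JDK version, instead
// of depending on java.sql.Date.valueOf, whose leniency changed in java7.
public class StrictDateLiteral {
    public static Date parse(String s) {
        if (!s.matches("\\d{4}-\\d{2}-\\d{2}")) {
            throw new IllegalArgumentException("Date literal must be yyyy-MM-dd: " + s);
        }
        return Date.valueOf(s);
    }
}
```

With this guard, parse("2001-01-01") succeeds on any JDK while parse("2001-1-1") always throws, making the negative test deterministic.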
[jira] [Updated] (HIVE-5657) TopN produces incorrect results with count(distinct)
[ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-5657: --- Attachment: HIVE-5657.03.patch trivial changes compared to 02 -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5657) TopN produces incorrect results with count(distinct)
[ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-5657: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5700) enforce single date format for partition column storage
[ https://issues.apache.org/jira/browse/HIVE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817604#comment-13817604 ] Sergey Shelukhin commented on HIVE-5700: I was hoping to avoid writing a Derby upgrade script... it's not DB structure - do people really need to upgrade Derby? Hmm. As for negative tests, the problem is that date validation on JDK6 is going to kick in first and reject the date literal before this code executes... the only way to allow it is to run JDK7, for example. Let me think about a more isolated test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817609#comment-13817609 ] Carl Steinbach commented on HIVE-5783: -- [~jcoffey] I added you to the list of Hive contributors on JIRA. Feel free to assign this ticket to yourself. Thanks. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Scheduling the next Hive Contributors Meeting
Hi, Thanks Carl and Thejas! I would be attending remotely so the webex or google hangout would be very much appreciated. Please let me know if there is anything I can do to help enable either a webex or hangout! The Apache Sentry (incubating)[1] community which depends on Hive would be interested in briefly describing the project to the Hive community and discuss how we can work together to move both projects forward! As a side note, there have been lively discussions on the integration of other incubating projects therefore I'd just like to share that the changes Sentry is interested in are very small in scope and unlikely to cause disruption to the Hive community. Cheers! Brock [1] http://incubator.apache.org/projects/sentry.html On Fri, Nov 8, 2013 at 1:08 PM, Carl Steinbach c...@apache.org wrote: We're long overdue for a Hive Contributors Meeting. Thejas has offered to host the next meeting at Hortonworks on November 19th from 4-6pm. We will have a Google Hangout or Webex setup for people who wish to attend remotely. If you want to attend but can't because of a scheduling conflict please let us know. If enough people fall into this category we will try to reschedule. Thanks. Carl
[jira] [Created] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver
Harish Butani created HIVE-5784: --- Summary: Group By Operator doesn't carry forward table aliases in its RowResolver Key: HIVE-5784 URL: https://issues.apache.org/jira/browse/HIVE-5784 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani The following queries fail: {code} select b.key, count(*) from src b group by key select key, count(*) from src b group by b.key {code} with a SemanticException; the select expression b.key (key in the 2nd query) is not resolved by the GBy RowResolver. This is because the GBy RowResolver only supports resolving based on an AST.toStringTree match. The underlying issue is that a RowResolver doesn't allow multiple mappings to the same ColumnInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
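The fix direction the JIRA points at — letting several names map to one ColumnInfo — can be sketched with a tiny stand-in. This is not Hive's actual RowResolver; all names here are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for Hive's RowResolver: let several lookup keys,
// e.g. the aliased "b.key" and the bare "key", resolve to the same
// underlying column descriptor, so either group-by spelling works.
public class MiniRowResolver {
    static class ColumnInfo {
        final String internalName;
        ColumnInfo(String internalName) { this.internalName = internalName; }
    }

    private final Map<String, ColumnInfo> mappings = new HashMap<>();

    void put(ColumnInfo info, String... names) {
        for (String name : names) {
            mappings.put(name, info); // multiple names, one ColumnInfo
        }
    }

    ColumnInfo resolve(String name) { return mappings.get(name); }
}
```

With put(info, "key", "b.key"), both "group by key" and "group by b.key" would resolve to the same column instead of requiring an exact AST.toStringTree match.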
Review Request 15361: HIVE-5784: Group By Operator doesn't carry forward table aliases in its RowResolver
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15361/ --- Review request for hive and Ashutosh Chauhan. Bugs: hive-5784 https://issues.apache.org/jira/browse/hive-5784 Repository: hive-git Description --- The following queries fail: select b.key, count(*) from src b group by key select key, count(*) from src b group by b.key with a SemanticException; the select expression b.key (key in the 2nd query) is not resolved by the GBy RowResolver. This is because the GBy RowResolver only supports resolving based on an AST.toStringTree match. The underlying issue is that a RowResolver doesn't allow multiple mappings to the same ColumnInfo. Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/RowResolver.java 908546e ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 5305537 ql/src/test/queries/clientpositive/groupby_resolution.q PRE-CREATION ql/src/test/results/clientpositive/groupby_resolution.q.out PRE-CREATION Diff: https://reviews.apache.org/r/15361/diff/ Testing --- added test groupby_resolution.q Thanks, Harish Butani
Re: Scheduling the next Hive Contributors Meeting
I am not a contributor but a spectator of what Hive has been doing for the last couple of years. I work out of India and would love to just sit back and listen to all the new upcoming things (if that's allowed) :) On Sat, Nov 9, 2013 at 1:08 AM, Brock Noland br...@cloudera.com wrote: Hi, Thanks Carl and Thejas! I would be attending remotely so the webex or google hangout would be very much appreciated. Please let me know if there is anything I can do to help enable either a webex or hangout! The Apache Sentry (incubating)[1] community which depends on Hive would be interested in briefly describing the project to the Hive community and discuss how we can work together to move both projects forward! As a side note, there have been lively discussions on the integration of other incubating projects therefore I'd just like to share that the changes Sentry is interested in are very small in scope and unlikely to cause disruption to the Hive community. Cheers! Brock [1] http://incubator.apache.org/projects/sentry.html On Fri, Nov 8, 2013 at 1:08 PM, Carl Steinbach c...@apache.org wrote: We're long overdue for a Hive Contributors Meeting. Thejas has offered to host the next meeting at Hortonworks on November 19th from 4-6pm. We will have a Google Hangout or Webex setup for people who wish to attend remotely. If you want to attend but can't because of a scheduling conflict please let us know. If enough people fall into this category we will try to reschedule. Thanks. Carl -- Nitin Pawar
[jira] [Commented] (HIVE-5356) Move arithmatic UDFs to generic UDF implementations
[ https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817616#comment-13817616 ] Hive QA commented on HIVE-5356: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612668/HIVE-5356.4.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4636 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_pmod org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_arithmetic_type {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/210/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/210/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12612668 Move arithmatic UDFs to generic UDF implementations --- Key: HIVE-5356 URL: https://issues.apache.org/jira/browse/HIVE-5356 Project: Hive Issue Type: Task Components: UDF Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5356.1.patch, HIVE-5356.2.patch, HIVE-5356.3.patch, HIVE-5356.4.patch, HIVE-5356.patch Currently, all of the arithmetic operators, such as add/sub/mult/div, are implemented as old-style UDFs and java reflection is used to determine the return type TypeInfos/ObjectInspectors, based on the return type of the evaluate() method chosen for the expression. This works fine for types that don't have type params. Hive decimal type participates in these operations just like int or double. 
Different from double or int, however, decimal has precision and scale, which cannot be determined by just looking at the return type (decimal) of the UDF evaluate() method, even though the operands have certain precision/scale. With decimal defaulting to no precision/scale, (10, 0) will be taken as the type params, which is certainly not desirable. To solve this problem, all of the arithmetic operators would need to be implemented as GenericUDFs, which allow returning an ObjectInspector during the initialize() method. The object inspectors returned can carry type params, from which the exact return type can be determined. It's worth mentioning that, for user UDFs implemented in the non-generic way, if the return type of the chosen evaluate() method is decimal, the return type actually has (10,0) as precision/scale, which might not be desirable. This needs to be documented. This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit the scope of review. The remaining ones will be covered under HIVE-5706. -- This message was sent by Atlassian JIRA (v6.1#6144)
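Why the result type must be computed at initialize() time, not inferred from the Java return type, can be seen from how decimal type params combine. The sketch below uses a SQL-standard-style rule for addition as an illustration; this exact rule is an assumption, not necessarily the one Hive adopted, and the class name is invented.

```java
// Hypothetical sketch: the precision/scale of a decimal addition result
// depends on the operands' type params, which only the operands' object
// inspectors know — exactly what GenericUDF.initialize() has access to.
// Rule shown (SQL-standard style for addition): scale = max(s1, s2),
// precision = max(p1-s1, p2-s2) + scale + 1 (the +1 absorbs a carry).
public class DecimalAddType {
    static int[] addResultType(int p1, int s1, int p2, int s2) {
        int scale = Math.max(s1, s2);
        int intDigits = Math.max(p1 - s1, p2 - s2);
        int precision = intDigits + scale + 1;
        return new int[] { precision, scale };
    }

    public static void main(String[] args) {
        int[] t = addResultType(10, 2, 5, 4); // decimal(10,2) + decimal(5,4)
        System.out.println("decimal(" + t[0] + "," + t[1] + ")"); // prints decimal(13,4)
    }
}
```

A reflection-based old-style UDF sees only "decimal" as the evaluate() return type and would have to fall back to a fixed default like (10, 0), losing this information.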
[jira] [Updated] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver
[ https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5784: Attachment: HIVE-5784.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver
[ https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817617#comment-13817617 ] Harish Butani commented on HIVE-5784: - Review request: https://reviews.apache.org/r/15361/ -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver
[ https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5784: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5557) Push down qualifying Where clause predicates as join conditions
[ https://issues.apache.org/jira/browse/HIVE-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5557: Status: Open (was: Patch Available) Push down qualifying Where clause predicates as join conditions --- Key: HIVE-5557 URL: https://issues.apache.org/jira/browse/HIVE-5557 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5557.1.patch, HIVE-5557.2.patch, HIVE-5557.3.patch, HIVE-5557.4.patch See details in HIVE- -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5557) Push down qualifying Where clause predicates as join conditions
[ https://issues.apache.org/jira/browse/HIVE-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817620#comment-13817620 ] Harish Butani commented on HIVE-5557: - Re-submit for Hive QA to pick up. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Scheduling the next Hive Contributors Meeting
Hi, On Fri, Nov 8, 2013 at 1:43 PM, Nitin Pawar nitinpawar...@gmail.com wrote: I am not a contributor but a spectator to what Hive has been doing the last couple of years. I work out of India and would love to just sit back and listen to all the new upcoming things (if that's allowed) :) Not only allowed, but encouraged! Great to have your interest! On Sat, Nov 9, 2013 at 1:08 AM, Brock Noland br...@cloudera.com wrote: Hi, Thanks Carl and Thejas! I would be attending remotely so the webex or google hangout would be very much appreciated. Please let me know if there is anything I can do to help enable either a webex or hangout! The Apache Sentry (incubating)[1] community which depends on Hive would be interested in briefly describing the project to the Hive community and discuss how we can work together to move both projects forward! As a side note, there have been lively discussions on the integration of other incubating projects therefore I'd just like to share that the changes Sentry is interested in are very small in scope and unlikely to cause disruption to the Hive community. Cheers! Brock [1] http://incubator.apache.org/projects/sentry.html On Fri, Nov 8, 2013 at 1:08 PM, Carl Steinbach c...@apache.org wrote: We're long overdue for a Hive Contributors Meeting. Thejas has offered to host the next meeting at Hortonworks on November 19th from 4-6pm. We will have a Google Hangout or Webex setup for people who wish to attend remotely. If you want to attend but can't because of a scheduling conflict please let us know. If enough people fall into this category we will try to reschedule. Thanks. Carl -- Nitin Pawar -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Updated] (HIVE-5557) Push down qualifying Where clause predicates as join conditions
[ https://issues.apache.org/jira/browse/HIVE-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5557: Status: Patch Available (was: Open) Push down qualifying Where clause predicates as join conditions --- Key: HIVE-5557 URL: https://issues.apache.org/jira/browse/HIVE-5557 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5557.1.patch, HIVE-5557.2.patch, HIVE-5557.3.patch, HIVE-5557.4.patch See details in HIVE- -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5785) Hive Metadata Thinks Table Partitions Are All Strings
Brad Ruderman created HIVE-5785: --- Summary: Hive Metadata Thinks Table Partitions Are All Strings Key: HIVE-5785 URL: https://issues.apache.org/jira/browse/HIVE-5785 Project: Hive Issue Type: Bug Components: Database/Schema Reporter: Brad Ruderman Priority: Minor hive (bruderman) CREATE TABLE test (a int, b int) partitioned by (dt int); OK Time taken: 0.101 seconds hive (bruderman) desc test; OK col_name data_type comment a int b int dt int Time taken: 0.093 seconds hive (bruderman) CREATE VIEW v_test AS SELECT * FROM test; OK a b dt Time taken: 0.042 seconds hive (bruderman) desc v_test; OK col_name data_type comment a int b int dt string Time taken: 0.098 seconds hive (bruderman) -- When I have a table which is partitioned by an int/bigint, and I go to import that table into Tableau, Tableau detects the partition column as a string, so I cannot use it for incremental refreshes. I thought it was a Tableau bug; however, when I create a view (select * from table) and then describe the view, I see that the partition column is a string, so I think the issue is within Hive. Finally, the issue also appears when interfacing through HiveServer1/HiveServer2: (hs2)➜ pyhs2 git:(master) python test.py None [{'comment': None, 'columnName': 'a', 'type': 'INT_TYPE'}, {'comment': None, 'columnName': 'b', 'type': 'INT_TYPE'}, {'comment': None, 'columnName': 'dt', 'type': 'STRING_TYPE'}] where the column is detected as a string. The workaround is to create a view: CREATE VIEW v_test AS SELECT t.*, CAST(t.dt as INT) dt_part FROM test t; and use that. This issue extends beyond Tableau and affects anything using HiveServer1/2. Thanks! -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver
[ https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817630#comment-13817630 ] Xuefu Zhang commented on HIVE-5784: --- [~rhbutani] This seems to be a dupe of HIVE-3107, which keeps the previous discussions. There is no point in keeping two. Feel free to close this one and take that one if you're going to work on this now. Group By Operator doesn't carry forward table aliases in its RowResolver Key: HIVE-5784 URL: https://issues.apache.org/jira/browse/HIVE-5784 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5784.1.patch The following queries fail: {code} select b.key, count(*) from src b group by key select key, count(*) from src b group by b.key {code} with a SemanticException; the select expression b.key (key in the 2nd query) is not resolved by the GBy RowResolver. This is because the GBy RowResolver only supports resolving based on an AST.toStringTree match. The underlying issue is that a RowResolver doesn't allow multiple mappings to the same ColumnInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5562) Provide stripe level column statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817637#comment-13817637 ] Gunther Hagleitner commented on HIVE-5562: -- Committed to trunk. Thanks [~prasanth_j] and [~owen.omalley]! Provide stripe level column statistics in ORC - Key: HIVE-5562 URL: https://issues.apache.org/jira/browse/HIVE-5562 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Fix For: 0.13.0 Attachments: HIVE-5562.1.patch.txt, HIVE-5562.2.patch.txt ORC maintains two levels of column statistics: index statistics (for every row group) and file-level column statistics for the entire file. It is useful to have stripe-level column statistics, which are intermediate between index and file statistics. The reason to maintain stripe-level statistics is that the current input split computation logic is based on stripe boundaries. So if stripe-level statistics are available and a stripe doesn't satisfy a predicate condition, then that entire stripe (and thus its split) can be eliminated from split computation. -- This message was sent by Atlassian JIRA (v6.1#6144)
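The split-elimination rationale in the description can be sketched like this. It is an illustrative sketch only (hypothetical class and method names, assuming integer min/max stripe statistics), not ORC's actual reader API:

```java
// Illustrative sketch, not ORC's actual API: with per-stripe min/max
// statistics, a stripe whose value range cannot satisfy the predicate can be
// dropped from split computation without reading any of its data.
class StripeElimination {
    static final class IntStripeStats {
        final long min;
        final long max;
        IntStripeStats(long min, long max) { this.min = min; this.max = max; }
    }

    // For a predicate "col > threshold": the stripe may contain matching rows
    // only if its maximum value exceeds the threshold.
    static boolean mayMatchGreaterThan(IntStripeStats stats, long threshold) {
        return stats.max > threshold;
    }
}
```

For example, a stripe with stats [0, 100] can be skipped entirely for `col > 1000`, which is exactly the read that stripe-level statistics save.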
[jira] [Updated] (HIVE-5562) Provide stripe level column statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5562: - Resolution: Fixed Status: Resolved (was: Patch Available) Provide stripe level column statistics in ORC - Key: HIVE-5562 URL: https://issues.apache.org/jira/browse/HIVE-5562 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Fix For: 0.13.0 Attachments: HIVE-5562.1.patch.txt, HIVE-5562.2.patch.txt ORC maintains two levels of column statistics. Index statistics (for every rowgroup) and file level column statistics for the entire file. It is useful to have stripe level column statistics which will be intermediate to index and file statistics. The reason to maintain stripe level statistics is that, the current input split computation logic is based on stripe boundaries. So if stripe level statistics are available and if a stripe doesn't satisfy a predicate condition then that entire stripe (also split) can be eliminated from split computation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5745) TestHiveLogging is failing (at least on mac)
[ https://issues.apache.org/jira/browse/HIVE-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5745: - Status: Patch Available (was: Open) TestHiveLogging is failing (at least on mac) Key: HIVE-5745 URL: https://issues.apache.org/jira/browse/HIVE-5745 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-5745.1.patch The path for the log file on my mac contains two slashes. That causes mvn install to fail. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4388: --- Attachment: HIVE-4388.17.patch v17 should fix that last failure :) HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, HIVE-4388.16.patch, HIVE-4388.17.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Scheduling the next Hive Contributors Meeting
Looking forward to it! I would like to do a status update and quick demo of the Tez integration work (HIVE-4660), if there is time and interest. Thanks, Gunther. On Fri, Nov 8, 2013 at 11:44 AM, Brock Noland br...@cloudera.com wrote: Hi, On Fri, Nov 8, 2013 at 1:43 PM, Nitin Pawar nitinpawar...@gmail.com wrote: I am not a contributor but a spectator to what hive have been doing last couple of years. I work out of India and would love to just sit back and listen to all the new upcoming things (if that's allowed) :) Not only allowed, but encouraged! Great to have your interest! On Sat, Nov 9, 2013 at 1:08 AM, Brock Noland br...@cloudera.com wrote: Hi, Thanks Carl and Thejas! I would be attending remotely so the webex or google hangout would be very much appreciated. Please let me know if there is anything I can do to help enable either a webex or hangout! The Apache Sentry (incubating)[1] community which depends on Hive would be interested in briefly describing the project to the Hive community and discuss how we can work together to move both projects forward! As a side note, there have been lively discussions on the integration of other incubating projects therefore I'd just like to share that the changes Sentry is interested in are very small in scope and unlikely to cause disruption to the Hive community. Cheers! Brock [1] http://incubator.apache.org/projects/sentry.html On Fri, Nov 8, 2013 at 1:08 PM, Carl Steinbach c...@apache.org wrote: We're long overdue for a Hive Contributors Meeting. Thejas has offered to host the next meeting at Hortonworks on November 19th from 4-6pm. We will have a Google Hangout or Webex setup for people who wish to attend remotely. If you want to attend but can't because of a scheduling conflict please let us know. If enough people fall into this category we will try to reschedule. Thanks. 
Carl -- Nitin Pawar -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-5741) Hcatalog needs to be added to the binary tar
[ https://issues.apache.org/jira/browse/HIVE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817653#comment-13817653 ] Brock Noland commented on HIVE-5741: On this item I think we should just keep the project structure as-is. Hcatalog needs to be added to the binary tar Key: HIVE-5741 URL: https://issues.apache.org/jira/browse/HIVE-5741 Project: Hive Issue Type: Sub-task Reporter: Brock Noland -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5714) Separate reactor root or aggregator from parent pom
[ https://issues.apache.org/jira/browse/HIVE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817652#comment-13817652 ] Brock Noland commented on HIVE-5714: [~abayer] would you have a couple minutes to share your best practices? As I understand it the root pom should *only* do aggregation while the parent pom should do everything that is inherited by the modules. Is that correct? Separate reactor root or aggregator from parent pom --- Key: HIVE-5714 URL: https://issues.apache.org/jira/browse/HIVE-5714 Project: Hive Issue Type: Sub-task Reporter: Brock Noland It's a best practice to have a separate reactor pom from parent pom. More details in FLUME-2199. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5564) Need to accommodate table decimal columns that were defined prior to HIVE-3976
[ https://issues.apache.org/jira/browse/HIVE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817651#comment-13817651 ] Hive QA commented on HIVE-5564: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612688/HIVE-5564.4.patch {color:green}SUCCESS:{color} +1 4595 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/211/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/211/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12612688 Need to accommodate table decimal columns that were defined prior to HIVE-3976 - Key: HIVE-5564 URL: https://issues.apache.org/jira/browse/HIVE-5564 Project: Hive Issue Type: Task Components: Types Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5564.1.patch, HIVE-5564.2.patch, HIVE-5564.3.patch, HIVE-5564.4.patch, HIVE-5564.patch With HIVE-3976, decimal columns are stored with precision/scale, such as decimal(17,5), as the type name. However, such columns defined in Hive prior to HIVE-3976 have simply decimal as the type name. Those columns need to continue to work with a precision/scale of (10,0), per the functional doc. With the patch in HIVE-3976, we may get the following error message in such a case: {code} 0: jdbc:hive2://localhost:1 desc dec; Error: Error while processing statement: FAILED: RuntimeException Decimal type is specified without length: decimal:int (state=42000,code=4) {code} This issue will be addressed in this JIRA as a follow-up task. -- This message was sent by Atlassian JIRA (v6.1#6144)
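The backward-compatibility rule described above — a bare decimal type name defaulting to precision/scale (10,0), while decimal(17,5) keeps its declared parameters — can be sketched as follows. The class and method names are illustrative, not Hive's actual type-parsing code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch, not Hive's actual parser: a pre-HIVE-3976 column typed
// simply "decimal" should behave as decimal(10,0), while "decimal(17,5)"
// keeps its declared precision and scale.
class DecimalTypeName {
    private static final Pattern WITH_PARAMS =
        Pattern.compile("decimal\\((\\d+),(\\d+)\\)");

    // Returns {precision, scale}; bare "decimal" falls back to {10, 0}.
    static int[] precisionScale(String typeName) {
        Matcher m = WITH_PARAMS.matcher(typeName);
        if (m.matches()) {
            return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
        }
        return new int[] { 10, 0 };
    }
}
```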
[jira] [Commented] (HIVE-5601) NPE in ORC's PPD when using select * from table with where predicate
[ https://issues.apache.org/jira/browse/HIVE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817654#comment-13817654 ] Gunther Hagleitner commented on HIVE-5601: -- Committed the patch to trunk. Haven't updated Hive 0.12 yet. NPE in ORC's PPD when using select * from table with where predicate - Key: HIVE-5601 URL: https://issues.apache.org/jira/browse/HIVE-5601 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Critical Labels: ORC Attachments: HIVE-5601.4-branch-0.12.patch.txt, HIVE-5601.5.patch.txt, HIVE-5601.branch-0.12.2.patch.txt, HIVE-5601.branch-0.12.3.patch.txt, HIVE-5601.branch-0.12.4.patch.txt, HIVE-5601.branch-12.1.patch.txt, HIVE-5601.trunk.1.patch.txt, HIVE-5601.trunk.2.patch.txt, HIVE-5601.trunk.3.patch.txt, HIVE-5601.trunk.4.patch.txt, HIVE-5601.trunk.5.patch.txt ORCInputFormat has a method findIncludedColumns() which returns a boolean array of included columns. In the case of the following query {code}select * from qlog_orc where id1000 limit 10;{code} where all columns are selected, findIncludedColumns() returns null. This results in an NPE when PPD is enabled.
Following is the stack trace:
{code}
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.planReadPartialDataStreams(RecordReaderImpl.java:2387)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readPartialDataStreams(RecordReaderImpl.java:2543)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:2200)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:2573)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:2615)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.init(RecordReaderImpl.java:132)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rows(ReaderImpl.java:348)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.init(OrcInputFormat.java:99)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:241)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:237)
	... 8 more
{code}
-- This message was sent by Atlassian JIRA (v6.1#6144)
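A defensive fix along the lines the report implies — treating a null included-columns array as "all columns included" — can be sketched as below. This is illustrative only and is not the actual Hive patch; the helper name is hypothetical:

```java
import java.util.Arrays;

// Illustrative sketch, not the actual Hive fix: findIncludedColumns() may
// return null for "select *", so normalize null to "every column included"
// before the stripe-planning code indexes into the array.
class IncludedColumns {
    static boolean[] normalize(boolean[] included, int columnCount) {
        if (included != null) {
            return included;
        }
        boolean[] all = new boolean[columnCount];
        Arrays.fill(all, true); // null means all columns are selected
        return all;
    }
}
```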
Re: Review Request 15151: Better error reporting by async threads in HiveServer2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15151/#review28572 --- @Vaibhav, thanks for taking the issue forward and putting up a new patch! I do have a high-level comment on the approach. The 'status' returned by an HS2 RPC is supposed to be the status of that particular API's execution. Whereas in this case, we are overloading the 'status' field to return the status of a different operation (i.e. ExecuteStatement()). For example, if you call GetStatus() with a non-existing operation id, then you would get an error status. This error is for the failure of the GetStatus() itself. On the other hand, if you call GetStatus() for an async query that failed, then you will also get the error status. However, this error is not for the current GetStatus() operation, but for the last ExecuteStatement() operation. The current implementation of the JDBC driver (or CLIClient in general) will work with this, but perhaps it's not a clean way to implement it. Have you considered adding a new field in the GetStatus response to return the error status of the actual execute operation? - Prasad Mujumdar On Nov. 1, 2013, 12:54 a.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15151/ --- (Updated Nov. 1, 2013, 12:54 a.m.) Review request for hive, Prasad Mujumdar and Thejas Nair. Bugs: HIVE-5230 https://issues.apache.org/jira/browse/HIVE-5230 Repository: hive-git Description --- [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread gets an error, currently the client can only poll for the operation state, and the error with its stacktrace is logged. However, it will be useful to provide a richer error response like the Thrift API does with TStatus (which is constructed while building a Thrift response object).
Diffs - service/src/java/org/apache/hive/service/cli/CLIService.java 1a7f338 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 14ef54f service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java 9dca874 service/src/java/org/apache/hive/service/cli/ICLIService.java f647ce6 service/src/java/org/apache/hive/service/cli/OperationStatus.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/Operation.java 6f4b8dc service/src/java/org/apache/hive/service/cli/operation/OperationManager.java bcdb67f service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java f6adf92 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 9df110e service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java 9bb2a0f service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d6caed1 service/src/test/org/apache/hive/service/cli/thrift/ThriftCLIServiceTest.java ff7166d Diff: https://reviews.apache.org/r/15151/diff/ Testing --- Thanks, Vaibhav Gumashta
[jira] [Commented] (HIVE-5754) NullPointerException when alter partition table and table does not exist
[ https://issues.apache.org/jira/browse/HIVE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817678#comment-13817678 ] Xuefu Zhang commented on HIVE-5754: --- This seems to be a bug. However, did you try to reproduce with the latest trunk? NullPointerException when alter partition table and table does not exist Key: HIVE-5754 URL: https://issues.apache.org/jira/browse/HIVE-5754 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Alexis Deltour I have a problem with my Oozie hive action which cleans my Hive table: when my table doesn't exist and I alter the partitioned table, I obtain different messages with the two versions of Hive. On CDH3, Hive 0.7.1: hive ALTER TABLE mytable DROP IF EXISTS PARTITION (mypart='10'); FAILED: Error in semantic analysis: Table not found mytable -- Oozie action OK. On CDH4, Hive 0.10.0: hive ALTER TABLE mytable DROP IF EXISTS PARTITION (mypart='10'); FAILED: NullPointerException null -- Oozie action in error. Is this a bug or a configuration problem? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817684#comment-13817684 ] Hive QA commented on HIVE-4388: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612876/HIVE-4388.17.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4598 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/212/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/212/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12612876 HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, HIVE-4388.16.patch, HIVE-4388.17.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817694#comment-13817694 ] Brock Noland commented on HIVE-4388: Sweet, that test is flaky and not related. [~hagleitn] or [~ashutoshc] should we get this one in? HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, HIVE-4388.16.patch, HIVE-4388.17.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4388) Upgrade HBase to 0.96
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4388: --- Summary: Upgrade HBase to 0.96 (was: HBase tests fail against Hadoop 2) Upgrade HBase to 0.96 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt, HIVE-4388.10.patch, HIVE-4388.11.patch, HIVE-4388.12.patch, HIVE-4388.13.patch, HIVE-4388.14.patch, HIVE-4388.15.patch, HIVE-4388.15.patch, HIVE-4388.16.patch, HIVE-4388.17.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817706#comment-13817706 ] Gunther Hagleitner commented on HIVE-5632: -- I've committed the test data file (orc_split_elim.orc) to trunk (in data/files/orc_split_elim.orc). That doesn't affect anything in the build, but now the pre-commit tests should be able to run. Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, HIVE-5632.3.patch.txt, orc_split_elim.orc HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (and thereby splits) that don't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver
[ https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5784: Status: Open (was: Patch Available) Group By Operator doesn't carry forward table aliases in its RowResolver Key: HIVE-5784 URL: https://issues.apache.org/jira/browse/HIVE-5784 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5784.1.patch The following queries fails: {code} select b.key, count(*) from src b group by key select key, count(*) from src b group by b.key {code} with a SemanticException; the select expression b.key (key in the 2nd query) are not resolved by the GBy RowResolver. This is because the GBy RowResolver only supports resolving based on an AST.toStringTree match. Underlying issue is that a RowResolver doesn't allow multiple mappings to the same ColumnInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver
[ https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani resolved HIVE-5784. - Resolution: Duplicate duplicate of HIVE-3107 Group By Operator doesn't carry forward table aliases in its RowResolver Key: HIVE-5784 URL: https://issues.apache.org/jira/browse/HIVE-5784 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5784.1.patch The following queries fails: {code} select b.key, count(*) from src b group by key select key, count(*) from src b group by b.key {code} with a SemanticException; the select expression b.key (key in the 2nd query) are not resolved by the GBy RowResolver. This is because the GBy RowResolver only supports resolving based on an AST.toStringTree match. Underlying issue is that a RowResolver doesn't allow multiple mappings to the same ColumnInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5632: - Status: Patch Available (was: Open) Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, HIVE-5632.3.patch.txt, HIVE-5632.4.patch, orc_split_elim.orc HIVE-5562 provides stripe level statistics in ORC. Stripe level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (thereby splits) that doesn't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5784) Group By Operator doesn't carry forward table aliases in its RowResolver
[ https://issues.apache.org/jira/browse/HIVE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817708#comment-13817708 ] Harish Butani commented on HIVE-5784: - [~xuefuz] ok thanks for pointing this out. Can you review the patch. Group By Operator doesn't carry forward table aliases in its RowResolver Key: HIVE-5784 URL: https://issues.apache.org/jira/browse/HIVE-5784 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-5784.1.patch The following queries fails: {code} select b.key, count(*) from src b group by key select key, count(*) from src b group by b.key {code} with a SemanticException; the select expression b.key (key in the 2nd query) are not resolved by the GBy RowResolver. This is because the GBy RowResolver only supports resolving based on an AST.toStringTree match. Underlying issue is that a RowResolver doesn't allow multiple mappings to the same ColumnInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5632: - Attachment: HIVE-5632.4.patch Re-uploading .3 as .4 to kick off pre-commit. Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, HIVE-5632.3.patch.txt, HIVE-5632.4.patch, orc_split_elim.orc HIVE-5562 provides stripe-level statistics in ORC. Stripe-level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (and thereby the splits) that don't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-3107: Attachment: HIVE-3107.1.patch Improve semantic analyzer to better handle column name references in group by/sort by clauses - Key: HIVE-3107 URL: https://issues.apache.org/jira/browse/HIVE-3107 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.9.0 Reporter: Richard Ding Assignee: Xuefu Zhang Attachments: HIVE-3107.1.patch This is related to HIVE-1922. The following queries all fail with various SemanticExceptions: {code} explain select t.c from t group by c; explain select t.c from t group by c sort by t.c; explain select t.c as c0 from t group by c0; explain select t.c from t group by t.c sort by t.c; {code} It is true that one could always find a version of any of the above queries that works, but one has to experiment to find it, and that doesn't work well with machine-generated SQL queries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5632: - Status: Open (was: Patch Available) Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, HIVE-5632.3.patch.txt, HIVE-5632.4.patch, orc_split_elim.orc HIVE-5562 provides stripe-level statistics in ORC. Stripe-level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (and thereby the splits) that don't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817711#comment-13817711 ] Harish Butani commented on HIVE-3107: - Review request: https://reviews.apache.org/r/15361/ Improve semantic analyzer to better handle column name references in group by/sort by clauses - Key: HIVE-3107 URL: https://issues.apache.org/jira/browse/HIVE-3107 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.9.0 Reporter: Richard Ding Assignee: Xuefu Zhang Attachments: HIVE-3107.1.patch This is related to HIVE-1922. The following queries all fail with various SemanticExceptions: {code} explain select t.c from t group by c; explain select t.c from t group by c sort by t.c; explain select t.c as c0 from t group by c0; explain select t.c from t group by t.c sort by t.c; {code} It is true that one could always find a version of any of the above queries that works, but one has to experiment to find it, and that doesn't work well with machine-generated SQL queries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-3107: Status: Patch Available (was: Reopened) Improve semantic analyzer to better handle column name references in group by/sort by clauses - Key: HIVE-3107 URL: https://issues.apache.org/jira/browse/HIVE-3107 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.9.0 Reporter: Richard Ding Assignee: Harish Butani Attachments: HIVE-3107.1.patch This is related to HIVE-1922. The following queries all fail with various SemanticExceptions: {code} explain select t.c from t group by c; explain select t.c from t group by c sort by t.c; explain select t.c as c0 from t group by c0; explain select t.c from t group by t.c sort by t.c; {code} It is true that one could always find a version of any of the above queries that works, but one has to experiment to find it, and that doesn't work well with machine-generated SQL queries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani reassigned HIVE-3107: --- Assignee: Harish Butani (was: Xuefu Zhang) Improve semantic analyzer to better handle column name references in group by/sort by clauses - Key: HIVE-3107 URL: https://issues.apache.org/jira/browse/HIVE-3107 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.9.0 Reporter: Richard Ding Assignee: Harish Butani Attachments: HIVE-3107.1.patch This is related to HIVE-1922. The following queries all fail with various SemanticExceptions: {code} explain select t.c from t group by c; explain select t.c from t group by c sort by t.c; explain select t.c as c0 from t group by c0; explain select t.c from t group by t.c sort by t.c; {code} It is true that one could always find a version of any of the above queries that works, but one has to experiment to find it, and that doesn't work well with machine-generated SQL queries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5779) Subquery in where clause with distinct fails with mapjoin turned on with serialization error.
[ https://issues.apache.org/jira/browse/HIVE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817735#comment-13817735 ] Hive QA commented on HIVE-5779: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12612843/HIVE-5779.2.patch {color:green}SUCCESS:{color} +1 4597 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/213/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/213/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12612843 Subquery in where clause with distinct fails with mapjoin turned on with serialization error. - Key: HIVE-5779 URL: https://issues.apache.org/jira/browse/HIVE-5779 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-5779.2.patch, HIVE-5779.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20
Jason Dere created HIVE-5786: Summary: Remove HadoopShims methods that were needed for pre-Hadoop 0.20 Key: HIVE-5786 URL: https://issues.apache.org/jira/browse/HIVE-5786 Project: Hive Issue Type: Bug Components: Shims Reporter: Jason Dere Assignee: Jason Dere There are several methods in HadoopShims that can be removed since we are only supporting 0.20+. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20
[ https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817738#comment-13817738 ] Jason Dere commented on HIVE-5786: -- Looks like the following shims methods can be removed from HadoopShims: usesJobShell, isJobPreparing, fileSystemDeleteOnExit, inputFormatValidateInput, setTmpFiles, getAccessTime, compareText, setFloatConf, getTaskJobIDs. Remove HadoopShims methods that were needed for pre-Hadoop 0.20 --- Key: HIVE-5786 URL: https://issues.apache.org/jira/browse/HIVE-5786 Project: Hive Issue Type: Bug Components: Shims Reporter: Jason Dere Assignee: Jason Dere There are several methods in HadoopShims that can be removed since we are only supporting 0.20+. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20
[ https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5786: - Attachment: HIVE-5786.1.patch patch v1. Remove HadoopShims methods that were needed for pre-Hadoop 0.20 --- Key: HIVE-5786 URL: https://issues.apache.org/jira/browse/HIVE-5786 Project: Hive Issue Type: Bug Components: Shims Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5786.1.patch There are several methods in HadoopShims that can be removed since we are only supporting 0.20+. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5107) Change hive's build to maven
[ https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817753#comment-13817753 ] Vaibhav Gumashta commented on HIVE-5107: Hi [~brocknoland]! Thanks for the awesome effort. I have one question regarding the organization of tests. Some of the tests have been moved to the itests folder whereas some live in the original package. Is there a good reason for having that structure? For example, some of the unit test files for the service package live in service/src/test/org/apache/hive/service, while some of them have moved to itests/hive-unit/src/test/java/org/apache/hive/service. Change hive's build to maven Key: HIVE-5107 URL: https://issues.apache.org/jira/browse/HIVE-5107 Project: Hive Issue Type: Task Reporter: Edward Capriolo Assignee: Edward Capriolo I cannot cope with hive's build infrastructure any more. I have started working on porting the project to maven. When I have some solid progress I will github the entire thing for review. Then we can talk about switching the project somehow. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5107) Change hive's build to maven
[ https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817767#comment-13817767 ] Brock Noland commented on HIVE-5107: itests holds any tests that have cyclical dependencies or require that the packages be built. Typically only integration tests have those requirements, thus I have named it itests. Change hive's build to maven Key: HIVE-5107 URL: https://issues.apache.org/jira/browse/HIVE-5107 Project: Hive Issue Type: Task Reporter: Edward Capriolo Assignee: Edward Capriolo I cannot cope with hive's build infrastructure any more. I have started working on porting the project to maven. When I have some solid progress I will github the entire thing for review. Then we can talk about switching the project somehow. -- This message was sent by Atlassian JIRA (v6.1#6144)
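The layout Brock describes can be pictured as two Maven reactors. This fragment is a simplified, hypothetical sketch of that structure, not the actual poms in the patch:

{code}
<!-- root pom.xml: builds the product modules first -->
<modules>
  <module>common</module>
  <module>ql</module>
  <module>service</module>
</modules>

<!-- itests/pom.xml: a separate reactor run afterwards, so integration tests
     can depend on the already-built (packaged) hive artifacts without
     introducing dependency cycles into the main build -->
<modules>
  <module>hive-unit</module>
  <module>util</module>
</modules>
{code}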
Review Request 15373: HIVE-5786 Remove HadoopShims methods that were needed for pre-Hadoop 0.20
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15373/ --- Review request for hive. Bugs: HIVE-5786 https://issues.apache.org/jira/browse/HIVE-5786 Repository: hive-git Description --- Remove some of the shims methods which were made obsolete after dropping pre-hadoop 0.20 support. Diffs - cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 4fcca8c common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4f32390 contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/base64/Base64TextInputFormat.java 5909188 contrib/src/java/org/apache/hadoop/hive/contrib/udaf/example/UDAFExampleMax.java abb66c4 contrib/src/java/org/apache/hadoop/hive/contrib/udaf/example/UDAFExampleMin.java 6f389d8 itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/UDAFTestMax.java eda2aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 2ac22b7 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java e69aaa6 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java 0a2f976 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java 7b77944 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 99ec216 ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java f7086a3 ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java ab884c5 ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateMapper.java f0678ef ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java a85a19d ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java 0f48674 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrGreaterThan.java cf39215 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrLessThan.java 3eba13b ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPGreaterThan.java d6654a1 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPLessThan.java b1e03b4 
ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 1d92d40 serde/src/java/org/apache/hadoop/hive/serde2/io/HiveCharWritable.java e68c63a serde/src/java/org/apache/hadoop/hive/serde2/io/HiveVarcharWritable.java 005832b serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java ba8342d shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 17f4a94 shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java fd0d526 shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 62ff878 Diff: https://reviews.apache.org/r/15373/diff/ Testing --- Thanks, Jason Dere
[jira] [Commented] (HIVE-4022) Structs and struct fields cannot be NULL in INSERT statements
[ https://issues.apache.org/jira/browse/HIVE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817769#comment-13817769 ] Adrian Hains commented on HIVE-4022: I found a workaround to get me past this restriction. I had a need to add some struct columns to a table t1 by way of copying the data to a new table t2 with the correct updated schema. Trying to insert directly to t2 by selecting from t1 with null literals failed for me as described in this jira ticket. To work around this I created an additional table t2copy that has the same schema as t2. Then I did an insert to t2 selecting from t1 left outer join t2copy, referencing the t2copy.newStructColumn instance so that a table-sourced null value passes to t2. This worked. It may be that t2copy having the same struct definition is unnecessary, and a simple empty table with a bogus struct column definition would have worked just as well. Structs and struct fields cannot be NULL in INSERT statements - Key: HIVE-4022 URL: https://issues.apache.org/jira/browse/HIVE-4022 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Michael Malak Originally thought to be Avro-specific, and first noted with respect to HIVE-3528 ("Avro SerDe doesn't handle serializing Nullable types that require access to a Schema"), it turns out even native Hive tables cannot store NULL in a STRUCT field or for the entire STRUCT itself, at least when the NULL is specified directly in the INSERT statement. Again, this affects both Avro-backed tables and native Hive tables. 
***For native Hive tables: The following: echo 1,2 > twovalues.csv hive CREATE TABLE tc (x INT, y INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; LOAD DATA LOCAL INPATH 'twovalues.csv' INTO TABLE tc; CREATE TABLE oc (z STRUCT<a: int, b: int>); INSERT INTO TABLE oc SELECT null FROM tc; produces the error FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'oc': Cannot convert column 0 from void to struct<a:int,b:int>. The following: INSERT INTO TABLE oc SELECT named_struct('a', null, 'b', null) FROM tc; produces the error: FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'oc': Cannot convert column 0 from struct<a:void,b:void> to struct<a:int,b:int>. ***For Avro: In HIVE-3528, there is in fact a null-struct test case in line 14 of https://github.com/apache/hive/blob/15cc604bf10f4c2502cb88fb8bb3dcd45647cf2c/data/files/csv.txt The test script at https://github.com/apache/hive/blob/12d6f3e7d21f94e8b8490b7c6d291c9f4cac8a4f/ql/src/test/queries/clientpositive/avro_nullable_fields.q does indeed work. But in that test, the query gets all of its data from a test table verbatim: INSERT OVERWRITE TABLE as_avro SELECT * FROM test_serializer; If instead we stick in a hard-coded null for the struct directly into the query, it fails: INSERT OVERWRITE TABLE as_avro SELECT string1, int1, tinyint1, smallint1, bigint1, boolean1, float1, double1, list1, map1, null, enum1, nullableint, bytes1, fixed1 FROM test_serializer; with the following error: FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different 'as_avro': Cannot convert column 10 from void to struct<sint:int,sboolean:boolean,sstring:string>. 
Note, though, that substituting a hard-coded null for string1 (and restoring struct1 into the query) does work: INSERT OVERWRITE TABLE as_avro SELECT null, int1, tinyint1, smallint1, bigint1, boolean1, float1, double1, list1, map1, struct1, enum1, nullableint, bytes1, fixed1 FROM test_serializer; -- This message was sent by Atlassian JIRA (v6.1#6144)
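Since the errors above all stem from the null literal carrying the void type, a typed null may sidestep the conversion failure. The following is a hedged sketch, not taken from the ticket, and behavior may vary by Hive version; it reuses the oc/tc tables from the native-table example:

{code}
-- cast(null as int) yields a typed null, so the constructed struct should be
-- struct<a:int,b:int> rather than struct<a:void,b:void>
INSERT INTO TABLE oc SELECT named_struct('a', cast(null as int), 'b', cast(null as int)) FROM tc;
{code}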
[jira] [Commented] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20
[ https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817770#comment-13817770 ] Jason Dere commented on HIVE-5786: -- RB at https://reviews.apache.org/r/15373/ Remove HadoopShims methods that were needed for pre-Hadoop 0.20 --- Key: HIVE-5786 URL: https://issues.apache.org/jira/browse/HIVE-5786 Project: Hive Issue Type: Bug Components: Shims Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5786.1.patch There are several methods in HadoopShims that can be removed since we are only supporting 0.20+. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5786) Remove HadoopShims methods that were needed for pre-Hadoop 0.20
[ https://issues.apache.org/jira/browse/HIVE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5786: - Status: Patch Available (was: Open) Remove HadoopShims methods that were needed for pre-Hadoop 0.20 --- Key: HIVE-5786 URL: https://issues.apache.org/jira/browse/HIVE-5786 Project: Hive Issue Type: Bug Components: Shims Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5786.1.patch There are several methods in HadoopShims that can be removed since we are only supporting 0.20+. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5632) Eliminate splits based on SARGs using stripe statistics in ORC
[ https://issues.apache.org/jira/browse/HIVE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817785#comment-13817785 ] Gunther Hagleitner commented on HIVE-5632: -- Looked at the revised patch. LGTM +1. [~prasanth_j] can you open the follow-up jira discussed and link it to this? Eliminate splits based on SARGs using stripe statistics in ORC -- Key: HIVE-5632 URL: https://issues.apache.org/jira/browse/HIVE-5632 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5632.1.patch.txt, HIVE-5632.2.patch.txt, HIVE-5632.3.patch.txt, HIVE-5632.4.patch, orc_split_elim.orc HIVE-5562 provides stripe-level statistics in ORC. Stripe-level statistics combined with predicate pushdown in ORC (HIVE-4246) can be used to eliminate the stripes (and thereby the splits) that don't satisfy the predicate condition. This can greatly reduce unnecessary reads. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5107) Change hive's build to maven
[ https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817783#comment-13817783 ] Vaibhav Gumashta commented on HIVE-5107: I'm not well-versed in maven, but wouldn't it be cleaner to move all the tests to itests? I think it might become confusing when adding new tests if the tests for a package are split into different locations. Change hive's build to maven Key: HIVE-5107 URL: https://issues.apache.org/jira/browse/HIVE-5107 Project: Hive Issue Type: Task Reporter: Edward Capriolo Assignee: Edward Capriolo I cannot cope with hive's build infrastructure any more. I have started working on porting the project to maven. When I have some solid progress I will github the entire thing for review. Then we can talk about switching the project somehow. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5683) JDBC support for char
[ https://issues.apache.org/jira/browse/HIVE-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5683: - Attachment: HIVE-5683.2.patch rebase patch with trunk - patch v2. JDBC support for char - Key: HIVE-5683 URL: https://issues.apache.org/jira/browse/HIVE-5683 Project: Hive Issue Type: Bug Components: JDBC, Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5683.1.patch, HIVE-5683.2.patch Support char type in JDBC, including char length in result set metadata. -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 15375: HIVE-5683 JDBC support for char
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15375/ --- Review request for hive and Thejas Nair. Bugs: HIVE-5683 https://issues.apache.org/jira/browse/HIVE-5683 Repository: hive-git Description --- thrift/jdbc changes for char. Diffs - data/files/datatypes.txt 10daa1b itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java a270cc6 jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java b693e93 jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java 25faf48 jdbc/src/java/org/apache/hive/jdbc/HiveResultSetMetaData.java 79e8c8c jdbc/src/java/org/apache/hive/jdbc/JdbcColumn.java d612cf6 jdbc/src/java/org/apache/hive/jdbc/Utils.java 45de290 service/if/TCLIService.thrift 1f49445 service/src/gen/thrift/gen-cpp/TCLIService_constants.h 7471811 service/src/gen/thrift/gen-cpp/TCLIService_constants.cpp d085b30 service/src/gen/thrift/gen-cpp/TCLIService_types.h 490b393 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp a3fd46c service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCLIServiceConstants.java 7b4c576 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java 5d353f7 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TProtocolVersion.java 15f2973 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeId.java be70a3a service/src/gen/thrift/gen-py/TCLIService/constants.py 589ce88 service/src/gen/thrift/gen-py/TCLIService/ttypes.py b286b05 service/src/gen/thrift/gen-rb/t_c_l_i_service_constants.rb 8c341c8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb c608364 service/src/java/org/apache/hive/service/cli/ColumnValue.java 62e221b service/src/java/org/apache/hive/service/cli/Type.java f414fca service/src/java/org/apache/hive/service/cli/TypeQualifiers.java 66a4b12 Diff: https://reviews.apache.org/r/15375/diff/ Testing --- Thanks, Jason Dere
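The metadata side of the char change can be pictured in miniature: a char(n) column's length qualifier flows through to the result-set metadata answers. The names below are illustrative stand-ins, not the actual JdbcColumn/TypeQualifiers classes touched in the diff:

```java
// Illustrative sketch: propagating a char(n) length qualifier into
// ResultSetMetaData-style precision/display-size results.
public class CharColumnMeta {
    private final int maxLength; // the n in char(n), carried as a type qualifier

    CharColumnMeta(int maxLength) { this.maxLength = maxLength; }

    int getPrecision() { return maxLength; }         // for char(n), precision is n
    int getColumnDisplaySize() { return maxLength; } // fixed-width, blank-padded

    public static void main(String[] args) {
        CharColumnMeta c = new CharColumnMeta(10); // column declared char(10)
        System.out.println(c.getPrecision());         // 10
        System.out.println(c.getColumnDisplaySize()); // 10
    }
}
```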