[jira] [Commented] (HIVE-7183) Size of partColumnGrants should be checked in ObjectStore#removeRole()
[ https://issues.apache.org/jira/browse/HIVE-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14028872#comment-14028872 ] Hive QA commented on HIVE-7183: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12648924/HIVE-7183.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 5535 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_7 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hadoop.hive.conf.TestHiveConf.testConfProperties org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/442/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/442/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-442/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12648924 Size of partColumnGrants should be checked in ObjectStore#removeRole() -- Key: HIVE-7183 URL: https://issues.apache.org/jira/browse/HIVE-7183 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7183.patch Here is related code:
{code}
List<MPartitionColumnPrivilege> partColumnGrants = listPrincipalAllPartitionColumnGrants(
    mRol.getRoleName(), PrincipalType.ROLE);
if (tblColumnGrants.size() > 0) {
  pm.deletePersistentAll(partColumnGrants);
{code}
Size of tblColumnGrants is currently checked. Size of partColumnGrants should be checked instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
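The copy-paste bug described above can be sketched standalone: guard the bulk delete on the list actually being deleted. Plain String lists stand in for the JDO-backed MPartitionColumnPrivilege objects, and deletePersistentAll here is a hypothetical stand-in for the PersistenceManager call, not Hive's code.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch of the one-line fix described above: guard the bulk
// delete on the list actually being deleted (partColumnGrants), not on
// tblColumnGrants. String lists and deletePersistentAll are stand-ins.
public class RemoveRoleSketch {
    static int deleted;

    static void deletePersistentAll(List<String> grants) {
        deleted += grants.size();
        grants.clear();
    }

    static void removeRoleGrants(List<String> tblColumnGrants, List<String> partColumnGrants) {
        // The buggy version tested tblColumnGrants.size() > 0 here, so partition
        // grants leaked whenever the table-grant list happened to be empty.
        if (partColumnGrants.size() > 0) {
            deletePersistentAll(partColumnGrants);
        }
    }

    public static void main(String[] args) {
        List<String> tblColumnGrants = new ArrayList<>();                      // empty
        List<String> partColumnGrants = new ArrayList<>(List.of("g1", "g2"));
        removeRoleGrants(tblColumnGrants, partColumnGrants);
        assert deleted == 2 : "partition grants must be deleted even with no table grants";
        System.out.println("deleted=" + deleted);
    }
}
```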
[jira] [Updated] (HIVE-7213) COUNT(*) returns the count of the last inserted rows through INSERT INTO TABLE
[ https://issues.apache.org/jira/browse/HIVE-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Moustafa Aboul Atta updated HIVE-7213: -- Priority: Major (was: Minor) COUNT(*) returns the count of the last inserted rows through INSERT INTO TABLE -- Key: HIVE-7213 URL: https://issues.apache.org/jira/browse/HIVE-7213 Project: Hive Issue Type: Bug Components: Query Processor, Statistics Affects Versions: 0.13.0 Environment: HDP 2.1 Windows Server 2012 64-bit Reporter: Moustafa Aboul Atta Running a query to count number of rows in a table through {{SELECT COUNT( * ) FROM t}} always returns the last number of rows added through the following statement: {{INSERT INTO TABLE t SELECT r FROM t2}} However, running {{SELECT * FROM t}} returns the expected results i.e. the old and newly added rows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7203) Optimize limit 0
[ https://issues.apache.org/jira/browse/HIVE-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028936#comment-14028936 ] Lefty Leverenz commented on HIVE-7203: -- No user doc, right? Optimize limit 0 Key: HIVE-7203 URL: https://issues.apache.org/jira/browse/HIVE-7203 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.14.0 Attachments: HIVE-7203.1.patch, HIVE-7203.patch Some tools generate queries with limit 0. Let's optimize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization
[ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-6455: - Labels: TODOC13 TODOC14 optimization (was: TODOC14 optimization) Scalable dynamic partitioning and bucketing optimization Key: HIVE-6455 URL: https://issues.apache.org/jira/browse/HIVE-6455 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: TODOC13, TODOC14, optimization Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.10.patch, HIVE-6455.10.patch, HIVE-6455.11.patch, HIVE-6455.12.patch, HIVE-6455.13.patch, HIVE-6455.13.patch, HIVE-6455.14.patch, HIVE-6455.15.patch, HIVE-6455.16.patch, HIVE-6455.17.patch, HIVE-6455.17.patch.txt, HIVE-6455.18.patch, HIVE-6455.19.patch, HIVE-6455.2.patch, HIVE-6455.20.patch, HIVE-6455.21.patch, HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, HIVE-6455.6.patch, HIVE-6455.7.patch, HIVE-6455.8.patch, HIVE-6455.9.patch, HIVE-6455.9.patch The current implementation of dynamic partitioning works by keeping at least one record writer open per dynamic partition directory. In case of bucketing there can be multispray file writers, which further adds to the number of open record writers. The record writers of column-oriented file formats (like ORC, RCFile etc.) keep in-memory buffers (value buffers or compression buffers) open all the time to buffer up the rows and compress them before flushing to disk. Since these buffers are maintained on a per-column basis, the amount of constant memory required at runtime increases as the number of partitions and the number of columns per partition grow. This often leads to OutOfMemory (OOM) exceptions in mappers or reducers, depending on the number of open record writers. Users often tune the JVM heap size (runtime memory) to get over such OOM issues. 
With this optimization, the dynamic partition columns and bucketing columns (in case of bucketed tables) are sorted before being fed to the reducers. Since the partitioning and bucketing columns are sorted, each reducer can keep only one record writer open at any time, thereby reducing the memory pressure on the reducers. This optimization scales well as the number of partitions and the number of columns per partition grow, at the cost of sorting the columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
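The single-open-writer idea above can be sketched outside Hive: once rows arrive sorted by their dynamic-partition key, a writer is closed as soon as the key changes, so at most one writer is ever open. All names below (writeSorted, the writer bookkeeping) are illustrative, not Hive's actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the optimization described above: with rows sorted by partition
// key, a single record writer suffices, closed whenever the key changes.
// Names and bookkeeping are illustrative, not Hive's actual code.
public class SortedPartitionWriterSketch {
    static int maxOpenWriters;

    static void writeSorted(List<String[]> rows) { // each row: {partitionKey, value}
        rows.sort(Comparator.comparing((String[] r) -> r[0])); // sort by partition key first
        String openKey = null;
        int open = 0;
        for (String[] row : rows) {
            if (!row[0].equals(openKey)) {
                if (openKey != null) open--;  // close writer for the previous partition
                openKey = row[0];
                open++;                       // open writer for the new partition
                maxOpenWriters = Math.max(maxOpenWriters, open);
            }
            // writer.write(row[1]) would go here
        }
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) rows.add(new String[]{"p" + (i % 10), "v" + i});
        writeSorted(rows);
        assert maxOpenWriters == 1 : "sorted input needs only one open writer";
        System.out.println("max open writers: " + maxOpenWriters);
    }
}
```

Without the sort, 10 interleaved partition keys would force up to 10 simultaneously open writers (each holding per-column buffers), which is the memory pressure the optimization removes.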
[jira] [Updated] (HIVE-6756) alter table set fileformat should set serde too
[ https://issues.apache.org/jira/browse/HIVE-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-6756: - Labels: TODOC14 (was: ) alter table set fileformat should set serde too --- Key: HIVE-6756 URL: https://issues.apache.org/jira/browse/HIVE-6756 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Owen O'Malley Assignee: Chinna Rao Lalam Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-6756.1.patch, HIVE-6756.2.patch, HIVE-6756.3.patch, HIVE-6756.patch Currently doing alter table set fileformat doesn't change the serde. This is unexpected by customers because the serdes are largely file format specific. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7206) Duplicate declaration of build-helper-maven-plugin in root pom
[ https://issues.apache.org/jira/browse/HIVE-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14028959#comment-14028959 ] Hive QA commented on HIVE-7206: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12649845/HIVE-7206.1.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/443/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/443/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-443/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12649845 Duplicate declaration of build-helper-maven-plugin in root pom -- Key: HIVE-7206 URL: https://issues.apache.org/jira/browse/HIVE-7206 Project: Hive Issue Type: Task Components: Build Infrastructure Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-7206.1.patch, HIVE-7206.patch Results in following warnings while building: [WARNING] Some problems were encountered while building the effective model for org.apache.hive:hive-it-custom-serde:jar:0.14.0-SNAPSHOT [WARNING] 'build.pluginManagement.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.codehaus.mojo:build-helper-maven-plugin @ org.apache.hive:hive:0.14.0-SNAPSHOT, pom.xml, line 638, column 17 [WARNING] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7188) sum(if()) returns wrong results with vectorization
[ https://issues.apache.org/jira/browse/HIVE-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029123#comment-14029123 ] Hive QA commented on HIVE-7188: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12649883/HIVE-7188.2.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5536 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/444/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/444/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-444/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12649883 sum(if()) returns wrong results with vectorization -- Key: HIVE-7188 URL: https://issues.apache.org/jira/browse/HIVE-7188 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-7188.1.patch, HIVE-7188.2.patch, hike-vector-sum-bug.tgz 1. The tgz file containing the setup is attached. 2. 
Run the following query: select sum(if(is_returning=true and is_free=false,1,0)) as unpaid_returning from hike_error.ttr_day0; It returns 0 with vectorization turned on whereas it returns 131 with vectorization turned off. hive> source insert.sql; OK Time taken: 0.359 seconds OK Time taken: 0.015 seconds OK Time taken: 0.069 seconds OK Time taken: 0.176 seconds Loading data to table hike_error.ttr_day0 Table hike_error.ttr_day0 stats: [numFiles=1, numRows=0, totalSize=3581, rawDataSize=0] OK Time taken: 0.33 seconds hive> select sum(if(is_returning=true and is_free=false,1,0)) as unpaid_returning from hike_error.ttr_day0; Query ID = hsubramaniyan_20140606134646_04790d3d-ca9a-427a-8cf9-3174536114ed Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapred.reduce.tasks=<number> Execution log at: /var/folders/r0/9x0wltgx2nv4m4b18m71z1y4gr/T//hsubramaniyan/hsubramaniyan_20140606134646_04790d3d-ca9a-427a-8cf9-3174536114ed.log Job running in-process (local Hadoop) Hadoop job information for null: number of mappers: 0; number of reducers: 0 2014-06-06 13:47:02,043 null map = 0%, reduce = 100% Ended Job = job_local773704964_0001 Execution completed successfully MapredLocal task succeeded OK 131 Time taken: 5.325 seconds, Fetched: 1 row(s) hive> set hive.vectorized.execution.enabled=true; hive> select sum(if(is_returning=true and is_free=false,1,0)) as unpaid_returning from hike_error.ttr_day0; Query ID = hsubramaniyan_20140606134747_1182c765-90ac-4a33-a8b1-760adca6bf38 Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the 
maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapred.reduce.tasks=<number> Execution log at: /var/folders/r0/9x0wltgx2nv4m4b18m71z1y4gr/T//hsubramaniyan/hsubramaniyan_20140606134747_1182c765-90ac-4a33-a8b1-760adca6bf38.log Job running in-process (local Hadoop) Hadoop job information for null: number of mappers: 0; number of reducers: 0 2014-06-06 13:47:18,604 null map = 0%, reduce = 100% Ended Job = job_local701415676_0001 Execution completed successfully MapredLocal task succeeded OK 0 Time taken: 5.52 seconds, Fetched: 1 row(s) hive> explain select sum(if(is_returning=true and is_free=false,1,0)) as unpaid_returning from hike_error.ttr_day0; OK STAGE
Review Request 22513: HIVE-6928 : Beeline should not chop off describe extended results by default
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22513/ --- Review request for hive. Repository: hive Description --- When the row has more characters than the console width, showing the output in table format won't look good. Whenever the row length is bigger than the width, present it in vertical format (decided at run time). Diffs - trunk/beeline/src/java/org/apache/hive/beeline/BeeLine.java 1597407 trunk/beeline/src/java/org/apache/hive/beeline/BufferedRows.java 1597407 trunk/beeline/src/java/org/apache/hive/beeline/IncrementalRows.java 1597407 trunk/beeline/src/java/org/apache/hive/beeline/Rows.java 1597407 Diff: https://reviews.apache.org/r/22513/diff/ Testing --- All unit tests pass. Thanks, Chinna Lalam
[jira] [Commented] (HIVE-6928) Beeline should not chop off describe extended results by default
[ https://issues.apache.org/jira/browse/HIVE-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029183#comment-14029183 ] Chinna Rao Lalam commented on HIVE-6928: Created review board entry. https://reviews.apache.org/r/22513/ Beeline should not chop off describe extended results by default -- Key: HIVE-6928 URL: https://issues.apache.org/jira/browse/HIVE-6928 Project: Hive Issue Type: Bug Components: CLI Reporter: Szehon Ho Assignee: Chinna Rao Lalam Attachments: HIVE-6928.1.patch, HIVE-6928.patch By default, beeline truncates long results based on the console width like: {code} +-+--+ | col_name | | +-+--+ | pat_id | string | | score | float | | acutes | float | | | | | Detailed Table Information | Table(tableName:refills, dbName:default, owner:hdadmin, createTime:1393882396, lastAccessTime:0, retention:0, sd:Sto | +-+--+ 5 rows selected (0.4 seconds) {code} This can be changed by !outputformat, but the default should behave better to give a better experience to the first-time beeline user. -- This message was sent by Atlassian JIRA (v6.2#6252)
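The run-time decision described in the review request above can be sketched in miniature: render a row as a table line when it fits the console width, otherwise fall back to a vertical, one-field-per-line layout. The names and layout details below are illustrative, not Beeline's actual code.

```java
// Sketch of the run-time format decision described above: keep the tabular
// form when a row fits the console width, otherwise switch to a vertical
// "name: value" layout. Names and layout details are illustrative.
public class RowFormatSketch {
    static String render(String[] names, String[] values, int consoleWidth) {
        StringBuilder table = new StringBuilder("|");
        for (String v : values) table.append(' ').append(v).append(" |");
        if (table.length() <= consoleWidth) {
            return table.toString();                  // fits: keep tabular form
        }
        StringBuilder vertical = new StringBuilder(); // too wide: one field per line
        for (int i = 0; i < names.length; i++) {
            vertical.append(names[i]).append(": ").append(values[i]).append('\n');
        }
        return vertical.toString();
    }

    public static void main(String[] args) {
        String[] names = {"col_name", "data_type"};
        // Short rows stay tabular; the long "Detailed Table Information" row
        // from the report would flip to the vertical layout instead of being cut.
        System.out.println(render(names, new String[]{"pat_id", "string"}, 80));
        String longVal = "Table(tableName:refills, dbName:default, owner:hdadmin, ...)";
        System.out.println(render(names, new String[]{"Detailed Table Information", longVal}, 40));
    }
}
```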
Re: Review Request 22513: HIVE-6928 : Beeline should not chop off describe extended results by default
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22513/#review45495 --- trunk/beeline/src/java/org/apache/hive/beeline/BeeLine.java https://reviews.apache.org/r/22513/#comment80342 Per Hive coding style, please strip off all trailing spaces (shown in red). - Xuefu Zhang On June 12, 2014, 2:11 p.m., Chinna Lalam wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22513/ --- (Updated June 12, 2014, 2:11 p.m.) Review request for hive. Repository: hive Description --- When the row has more characters than the console width, showing the output in table format won't look good. Whenever the row length is bigger than the width, present it in vertical format (decided at run time). Diffs - trunk/beeline/src/java/org/apache/hive/beeline/BeeLine.java 1597407 trunk/beeline/src/java/org/apache/hive/beeline/BufferedRows.java 1597407 trunk/beeline/src/java/org/apache/hive/beeline/IncrementalRows.java 1597407 trunk/beeline/src/java/org/apache/hive/beeline/Rows.java 1597407 Diff: https://reviews.apache.org/r/22513/diff/ Testing --- All unit tests pass. Thanks, Chinna Lalam
[jira] [Commented] (HIVE-6928) Beeline should not chop off describe extended results by default
[ https://issues.apache.org/jira/browse/HIVE-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029189#comment-14029189 ] Xuefu Zhang commented on HIVE-6928: --- +1, patch looks good. Minor comment on RB. Beeline should not chop off describe extended results by default -- Key: HIVE-6928 URL: https://issues.apache.org/jira/browse/HIVE-6928 Project: Hive Issue Type: Bug Components: CLI Reporter: Szehon Ho Assignee: Chinna Rao Lalam Attachments: HIVE-6928.1.patch, HIVE-6928.patch By default, beeline truncates long results based on the console width like: {code} +-+--+ | col_name | | +-+--+ | pat_id | string | | score | float | | acutes | float | | | | | Detailed Table Information | Table(tableName:refills, dbName:default, owner:hdadmin, createTime:1393882396, lastAccessTime:0, retention:0, sd:Sto | +-+--+ 5 rows selected (0.4 seconds) {code} This can be changed by !outputformat, but the default should behave better to give a better experience to the first-time beeline user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7166) Vectorization with UDFs returns incorrect results
[ https://issues.apache.org/jira/browse/HIVE-7166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029293#comment-14029293 ] Hive QA commented on HIVE-7166: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12648574/HIVE-7166.2.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/445/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/445/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-445/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12648574 Vectorization with UDFs returns incorrect results - Key: HIVE-7166 URL: https://issues.apache.org/jira/browse/HIVE-7166 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.13.0 Environment: Hive 0.13 with Hadoop 2.4 on a 3 node cluster Reporter: Benjamin Bowman Assignee: Hari Sankar Sivarama Subramaniyan Priority: Minor Attachments: HIVE-7166.1.patch, HIVE-7166.2.patch Using BETWEEN, a custom UDF, and vectorized query execution yields incorrect query results. Example Query: SELECT column_1 FROM table_1 WHERE column_1 BETWEEN (UDF_1 - X) and UDF_1 The following test scenario will reproduce the problem: TEST UDF (SIMPLE FUNCTION THAT TAKES NO ARGUMENTS AND RETURNS 1): package com.test; import org.apache.hadoop.hive.ql.exec.Description; import org.apache.hadoop.hive.ql.exec.UDF; import org.apache.hadoop.io.LongWritable; import org.apache.hadoop.io.Text; import java.lang.String; import java.lang.*; public class tenThousand extends UDF { private final LongWritable result = new LongWritable(); public LongWritable evaluate() { result.set(1); return result; } } TEST DATA (test.input): 1|CBCABC|12 2|DBCABC|13 3|EBCABC|14 4|ABCABC|15 5|BBCABC|16 6|CBCABC|17 CREATING ORC TABLE: 0: jdbc:hive2://server:10002/db> create table testTabOrc (first bigint, second varchar(20), third int) partitioned by (range int) clustered by (first) sorted by (first) into 8 buckets stored as orc tblproperties ("orc.compress"="SNAPPY", "orc.index"="true"); CREATE LOADING TABLE: 0: jdbc:hive2://server:10002/db> create table loadingDir (first bigint, second varchar(20), third int) partitioned by (range int) row format delimited fields terminated by '|' stored as textfile; COPY IN DATA: [root@server]# hadoop fs -copyFromLocal /tmp/test.input /db/loading/. 
ORC DATA: [root@server]# beeline -u jdbc:hive2://server:10002/db -n root --hiveconf hive.exec.dynamic.partition.mode=nonstrict --hiveconf hive.enforce.sorting=true -e "insert into table testTabOrc partition(range) select * from loadingDir;" LOAD TEST FUNCTION: 0: jdbc:hive2://server:10002/db> add jar /opt/hadoop/lib/testFunction.jar 0: jdbc:hive2://server:10002/db> create temporary function ten_thousand as 'com.test.tenThousand'; TURN OFF VECTORIZATION: 0: jdbc:hive2://server:10002/db> set hive.vectorized.execution.enabled=false; QUERY (RESULTS AS EXPECTED): 0: jdbc:hive2://server:10002/db> select first from testTabOrc where first between ten_thousand()-1 and ten_thousand()-9995; +--------+ | first | +--------+ | 1 | | 2 | | 3 | +--------+ 3 rows selected (15.286 seconds) TURN ON VECTORIZATION: 0: jdbc:hive2://server:10002/db> set hive.vectorized.execution.enabled=true; QUERY AGAIN (WRONG RESULTS): 0: jdbc:hive2://server:10002/db> select first from testTabOrc where first between ten_thousand()-1 and ten_thousand()-9995; +--------+ | first | +--------+ +--------+ No rows selected (17.763 seconds) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set
[ https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029314#comment-14029314 ] Naveen Gangam commented on HIVE-7200: - Sorry about the formatting. Lemme retry this {code} beeline !connect jdbc:hive2://localhost:1 root password org.apache.hive.jdbc.HiveDriver Connecting to jdbc:hive2://localhost:1 Connected to: Apache Hive (version 0.12.0-cdh5.0.0) Driver: Hive JDBC (version 0.12.0-cdh5.0.0) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (19.806 seconds) 0: jdbc:hive2://localhost:1 !set showHeader false 0: jdbc:hive2://localhost:1 select * from stringvals; | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (1.26 seconds) 0: jdbc:hive2://localhost:1 !set headerInterval 2 0: jdbc:hive2://localhost:1 select * from stringvals; | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (3.679 seconds) 0: jdbc:hive2://localhost:1 !set showHeader true 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| +--+ | val | +--+ | T| | F| +--+ | val | +--+ | 0| | 1| +--+ 6 rows selected (0.817 seconds) {code} Beeline output displays column heading even if --showHeader=false is set Key: HIVE-7200 URL: https://issues.apache.org/jira/browse/HIVE-7200 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7200.1.patch A few minor/cosmetic issues with the beeline CLI. 1) Tool prints the column headers despite setting the --showHeader to false. This property only seems to affect the subsequent header information that gets printed based on the value of property headerInterval (default value is 100). 2) When showHeader is true headerInterval 0, the header after the first interval gets printed after headerInterval - 1 rows. 
The code seems to count the initial header as a row, if you will. 3) The table footer(the line that closes the table) does not get printed if the showHeader is false. I think the table should get closed irrespective of whether it prints the header or not. {code} 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (3.998 seconds) 0: jdbc:hive2://localhost:1 !set headerInterval 2 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| +--+ | val | +--+ | f| | T| +--+ | val | +--+ | F| | 0| +--+ | val | +--+ | 1| +--+ 6 rows selected (0.691 seconds) 0: jdbc:hive2://localhost:1 !set showHeader false 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| 6 rows selected (1.728 seconds) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
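The off-by-one in point 2 can be reproduced in miniature: if the header printed before the first row is counted toward the interval, the first repeated header lands one row early. The counter logic below is illustrative, not Beeline's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Miniature reproduction of the off-by-one described in point 2 above:
// counting the initial header as a row makes the first repeated header
// appear after headerInterval - 1 rows. Illustrative, not Beeline's code.
public class HeaderIntervalSketch {
    // Returns the 1-based row indices after which a header is reprinted.
    static List<Integer> headerPositions(int rows, int interval, boolean countInitialHeader) {
        List<Integer> positions = new ArrayList<>();
        int sinceHeader = countInitialHeader ? 1 : 0; // the bug: initial header counted as a row
        for (int row = 1; row <= rows; row++) {
            sinceHeader++;
            if (sinceHeader == interval && row < rows) {
                positions.add(row);
                sinceHeader = 0;
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Buggy counting: with interval 2 and 6 rows, repeats land after rows
        // 1, 3, 5 (first repeat one row early), matching the session above.
        assert headerPositions(6, 2, true).equals(List.of(1, 3, 5));
        // Correct counting: repeats land after rows 2 and 4.
        assert headerPositions(6, 2, false).equals(List.of(2, 4));
        System.out.println(headerPositions(6, 2, true) + " vs " + headerPositions(6, 2, false));
    }
}
```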
[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN
[ https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029325#comment-14029325 ] Edward Capriolo commented on HIVE-5857: --- {code} } catch (FileNotFoundException fnf) { // happens. e.g.: no reduce work. LOG.debug("No plan file found: " + path); return null; } ... {code} Can we remove this code? This bothers me. It is not self-documenting at all. Can we use if statements to determine when the file should be there and when it should not? Something like: {code} if (job.hasNoReduceWork()) { return null; } else { throw new RuntimeException("work should be found but was not: " + expectedPathToFile); } {code} Reduce tasks do not work in uber mode in YARN - Key: HIVE-5857 URL: https://issues.apache.org/jira/browse/HIVE-5857 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: Adam Kawa Assignee: Adam Kawa Priority: Critical Labels: plan, uber-jar, uberization, yarn Fix For: 0.13.0 Attachments: HIVE-5857.1.patch.txt, HIVE-5857.2.patch, HIVE-5857.3.patch A Hive query fails when it tries to run a reduce task in uber mode in YARN. The NullPointerException is thrown in the ExecReducer.configure method, because the plan file (reduce.xml) for a reduce task is not found. The Utilities.getBaseWork method is expected to return a BaseWork object, but it returns null due to FileNotFoundException. {code} // org.apache.hadoop.hive.ql.exec.Utilities public static BaseWork getBaseWork(Configuration conf, String name) { ... try { ... if (gWork == null) { Path localPath; if (ShimLoader.getHadoopShims().isLocalMode(conf)) { localPath = path; } else { localPath = new Path(name); } InputStream in = new FileInputStream(localPath.toUri().getPath()); BaseWork ret = deserializePlan(in); } return gWork; } catch (FileNotFoundException fnf) { // happens. e.g.: no reduce work. LOG.debug("No plan file found: " + path); return null; } ... 
} {code} It happens because the ShimLoader.getHadoopShims().isLocalMode(conf) method returns true: immediately before running a reduce task, org.apache.hadoop.mapred.LocalContainerLauncher changes its configuration to local mode (mapreduce.framework.name is changed from yarn to local). On the other hand, map tasks run successfully, because their configuration is not changed and still remains yarn. {code} // org.apache.hadoop.mapred.LocalContainerLauncher private void runSubtask(..) { ... conf.set(MRConfig.FRAMEWORK_NAME, MRConfig.LOCAL_FRAMEWORK_NAME); conf.set(MRConfig.MASTER_ADDRESS, "local"); // bypass shuffle ReduceTask reduce = (ReduceTask)task; reduce.setConf(conf); reduce.run(conf, umbilical); } {code} A super quick fix could be just an additional if-branch, where we check if we run a reduce task in uber mode, and then look for a plan file in a different location. *Java stacktrace* {code} 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: hdfs://namenode.c.lon.spotify.net:54310/var/tmp/kawaa/hive_2013-11-20_00-50-43_888_3938384086824086680-2/-mr-10003/e3caacf6-15d6-4987-b186-d2906791b5b0/reduce.xml 2013-11-20 00:50:56,862 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local (uberized) 'child' : java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:427) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408) at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.runSubtask(LocalContainerLauncher.java:340) at org.apache.hadoop.mapred.LocalContainerLauncher$SubtaskRunner.run(LocalContainerLauncher.java:225) at 
java.lang.Thread.run(Thread.java:662) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ... 7 more {code}
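Capriolo's suggestion in the comment above amounts to making the caller state its expectation instead of swallowing FileNotFoundException. A standalone sketch of that pattern (hasReduceWork and the simplified path handling are hypothetical, not Utilities.getBaseWork's real signature):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the self-documenting alternative suggested above: decide up
// front whether the plan file is expected, instead of silently returning
// null on a missing file. hasReduceWork and the path handling are
// hypothetical simplifications, not Hive's real code.
public class PlanLookupSketch {
    static Object getBaseWork(String planPath, boolean hasReduceWork) {
        try (InputStream in = new FileInputStream(planPath)) {
            return deserializePlan(in);
        } catch (IOException notFound) {
            if (!hasReduceWork) {
                return null; // expected: queries without reduce work have no reduce.xml
            }
            throw new RuntimeException("plan should exist but was not found: " + planPath, notFound);
        }
    }

    static Object deserializePlan(InputStream in) {
        return new Object(); // placeholder for the real plan deserializer
    }

    public static void main(String[] args) {
        assert getBaseWork("/nonexistent/reduce.xml", false) == null; // tolerated
        boolean threw = false;
        try {
            getBaseWork("/nonexistent/reduce.xml", true);             // must fail loudly
        } catch (RuntimeException expected) {
            threw = true;
        }
        assert threw : "missing expected plan must raise";
        System.out.println("ok");
    }
}
```

The loud failure would also have surfaced the uber-mode bug above immediately, instead of letting the null propagate into a NullPointerException in ExecReducer.configure.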
[jira] [Commented] (HIVE-7221) Beeline buffers the entire output file in memory before writing to stdout
[ https://issues.apache.org/jira/browse/HIVE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029401#comment-14029401 ] Vaibhav Gumashta commented on HIVE-7221: Actually there is an option (which wasn't documented earlier) which lets you print output incrementally: beeline --incremental=true. I think we should have that true by default. Will create another jira for that. Beeline buffers the entire output file in memory before writing to stdout - Key: HIVE-7221 URL: https://issues.apache.org/jira/browse/HIVE-7221 Project: Hive Issue Type: Bug Components: Clients, JDBC Reporter: Vaibhav Gumashta Fix For: 0.13.0 It seems beeline does not write to stdout till it reads the entire output relation. This can cause OOM and should be fixed. Beeline should only buffer a small number of row batches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7224) Set incremental printing to true by default in Beeline
Vaibhav Gumashta created HIVE-7224: -- Summary: Set incremental printing to true by default in Beeline Key: HIVE-7224 URL: https://issues.apache.org/jira/browse/HIVE-7224 Project: Hive Issue Type: Bug Components: Clients, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.0 See HIVE-7221. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-7221) Beeline buffers the entire output file in memory before writing to stdout
[ https://issues.apache.org/jira/browse/HIVE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta resolved HIVE-7221. Resolution: Invalid Beeline buffers the entire output file in memory before writing to stdout - Key: HIVE-7221 URL: https://issues.apache.org/jira/browse/HIVE-7221 Project: Hive Issue Type: Bug Components: Clients, JDBC Reporter: Vaibhav Gumashta Fix For: 0.13.0 It seems beeline does not write to stdout till it reads the entire output relation. This can cause OOM and should be fixed. Beeline should only buffer a small number of row batches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7208) move SearchArgument interface into serde package
[ https://issues.apache.org/jira/browse/HIVE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029407#comment-14029407 ] Owen O'Malley commented on HIVE-7208: - I'm fine with moving it to the serde jar for now as long as we keep the package name. move SearchArgument interface into serde package Key: HIVE-7208 URL: https://issues.apache.org/jira/browse/HIVE-7208 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-7208.patch For usage in alternative input formats/serdes, it might be useful to move SearchArgument class to a place that is not in ql (because it's hard to depend on ql). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7224) Set incremental printing to true by default in Beeline
[ https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-7224: --- Attachment: HIVE-7224.1.patch cc [~xuefuz] [~thejas] Set incremental printing to true by default in Beeline -- Key: HIVE-7224 URL: https://issues.apache.org/jira/browse/HIVE-7224 Project: Hive Issue Type: Bug Components: Clients, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.0 Attachments: HIVE-7224.1.patch See HIVE-7221. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7224) Set incremental printing to true by default in Beeline
[ https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-7224: --- Status: Patch Available (was: Open) Set incremental printing to true by default in Beeline -- Key: HIVE-7224 URL: https://issues.apache.org/jira/browse/HIVE-7224 Project: Hive Issue Type: Bug Components: Clients, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.0 Attachments: HIVE-7224.1.patch See HIVE-7221. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7215) Support predicate pushdown for null checks in ORCFile
[ https://issues.apache.org/jira/browse/HIVE-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029439#comment-14029439 ] Rohini Palaniswamy commented on HIVE-7215: -- What happens if there are only few nulls for the column in the row group ? That will be the case most of the time. Support predicate pushdown for null checks in ORCFile - Key: HIVE-7215 URL: https://issues.apache.org/jira/browse/HIVE-7215 Project: Hive Issue Type: Improvement Reporter: Rohini Palaniswamy Came across this missing feature during discussion of PIG-3760. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6540) Support Multi Column Stats
[ https://issues.apache.org/jira/browse/HIVE-6540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029443#comment-14029443 ] Alex Nastetsky commented on HIVE-6540: -- I hope this is included in the next version. It would cut in half the time needed to create and validate a data transformation, by combining the creation of the new table and the gathering of its statistics into one step. Support Multi Column Stats -- Key: HIVE-6540 URL: https://issues.apache.org/jira/browse/HIVE-6540 Project: Hive Issue Type: Improvement Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran For Joins involving compound predicates, multi column stats can be used to accurately compute the NDV. The objective is to compute the NDV of more than one column, i.e. compute NDV of (x,y,z). R1 IJ R2 on R1.x=R2.x and R1.y=R2.y and R1.z=R2.z can use max(NDV(R1.x, R1.y, R1.z), NDV(R2.x, R2.y, R2.z)) for the Join NDV (and hence selectivity). http://www.oracle-base.com/articles/11g/statistics-collection-enhancements-11gr1.php#multi_column_statistics http://blogs.msdn.com/b/ianjo/archive/2005/11/10/491548.aspx http://developer.teradata.com/database/articles/removing-multi-column-statistics-a-process-for-identification-of-redundant-statist -- This message was sent by Atlassian JIRA (v6.2#6252)
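The max-NDV rule above plugs into the standard join cardinality estimate, |R1 join R2| ≈ |R1| * |R2| / max(NDV(R1 keys), NDV(R2 keys)). A minimal arithmetic sketch (the class and method names are invented for illustration; this is not Hive's optimizer code):

```java
public class MultiColumnNdv {
    // Estimated output rows of R1 IJ R2 on a compound equi-join predicate,
    // using max(NDV(R1.x,y,z), NDV(R2.x,y,z)) as the join key's distinct count.
    // Selectivity is 1/maxNdv, so rows out = rows1 * rows2 / maxNdv.
    public static long estimateJoinRows(long rows1, long rows2, long ndv1, long ndv2) {
        long maxNdv = Math.max(ndv1, ndv2);
        return (rows1 * rows2) / Math.max(maxNdv, 1); // guard against zero NDV
    }
}
```

For example, joining 1,000 rows against 2,000 rows on keys with NDVs 100 and 200 estimates 1,000 * 2,000 / 200 = 10,000 output rows; with only single-column stats the compound NDV would typically be overestimated, inflating the selectivity error.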
[jira] [Commented] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set
[ https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029462#comment-14029462 ] Xuefu Zhang commented on HIVE-7200: --- Thanks for reposting. It looks good. One minor thought: when the header is off, should we print:
{code}
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| t|
| f|
| T|
| F|
| 0|
| 1|
+--+
{code}
instead of
{code}
0: jdbc:hive2://localhost:1> select * from stringvals;
| t|
| f|
| T|
| F|
| 0|
| 1|
+--+
{code}
Beeline output displays column heading even if --showHeader=false is set Key: HIVE-7200 URL: https://issues.apache.org/jira/browse/HIVE-7200 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7200.1.patch A few minor/cosmetic issues with the beeline CLI. 1) The tool prints the column headers despite --showHeader being set to false. This property only seems to affect the subsequent header information that gets printed based on the value of the headerInterval property (default value is 100). 2) When showHeader is true and headerInterval > 0, the header after the first interval gets printed after headerInterval - 1 rows. The code seems to count the initial header as a row, if you will. 3) The table footer (the line that closes the table) does not get printed if showHeader is false. I think the table should get closed irrespective of whether it prints the header or not. 
{code}
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val |
+--+
| t|
| f|
| T|
| F|
| 0|
| 1|
+--+
6 rows selected (3.998 seconds)
0: jdbc:hive2://localhost:1> !set headerInterval 2
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val |
+--+
| t|
+--+
| val |
+--+
| f|
| T|
+--+
| val |
+--+
| F|
| 0|
+--+
| val |
+--+
| 1|
+--+
6 rows selected (0.691 seconds)
0: jdbc:hive2://localhost:1> !set showHeader false
0: jdbc:hive2://localhost:1> select * from stringvals;
+--+
| val |
+--+
| t|
| f|
| T|
| F|
| 0|
| 1|
6 rows selected (1.728 seconds)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7225) Unclosed Statement's in TxnHandler
Ted Yu created HIVE-7225: Summary: Unclosed Statement's in TxnHandler Key: HIVE-7225 URL: https://issues.apache.org/jira/browse/HIVE-7225 Project: Hive Issue Type: Bug Reporter: Ted Yu There're several methods in TxnHandler where a Statement (local to the method) is not closed upon return. Here're a few examples: In compact():
{code}
stmt.executeUpdate(s);
LOG.debug("Going to commit");
dbConn.commit();
{code}
In showCompact():
{code}
Statement stmt = dbConn.createStatement();
String s = "select cq_database, cq_table, cq_partition, cq_state, cq_type, cq_worker_id, " +
    "cq_start, cq_run_as from COMPACTION_QUEUE";
LOG.debug("Going to execute query <" + s + ">");
ResultSet rs = stmt.executeQuery(s);
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
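The standard remedy for this class of leak is try-with-resources, which closes the Statement on every exit path, including early returns and exceptions. A self-contained sketch of the pattern using an invented FakeStatement stand-in (so it runs without a database) rather than a real JDBC connection:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CloseDemo {
    // Stand-in for java.sql.Statement so the example is self-contained.
    static class FakeStatement implements AutoCloseable {
        final AtomicBoolean closed = new AtomicBoolean(false);
        void executeUpdate(String sql) { /* pretend to run the update */ }
        @Override public void close() { closed.set(true); }
    }

    // The fix pattern for methods like TxnHandler.compact(): declaring the
    // Statement in a try-with-resources header guarantees close() runs even
    // if executeUpdate (or the commit that would follow) throws.
    static FakeStatement compact(String sql) {
        FakeStatement stmt = new FakeStatement();
        try (FakeStatement s = stmt) {
            s.executeUpdate(sql);
        }
        return stmt; // returned only so callers can observe that it was closed
    }
}
```

With real JDBC the same shape applies to both the Statement and its ResultSet, since closing the Statement also releases any open ResultSet.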
[jira] [Updated] (HIVE-7022) Replace BinaryWritable with BytesWritable in Parquet serde
[ https://issues.apache.org/jira/browse/HIVE-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7022: -- Description: Currently ParquetHiveSerde uses BinaryWritable to enclose bytes read from Parquet data. However, existing Hadoop class, BytesWritable, already does that, and BinaryWritable offers no advantage. On the other hand, BinaryWritable has a confusing getString() method, which, if misused, can cause unexpected result. The proposal here is to replace it with Hadoop BytesWritable. The issue was identified in HIVE-6367, serving as a follow-up JIRA. was: Currently ParquetHiveSerde uses BinaryWritable to enclose bytes read from Parquet data. However, existing Hadoop class, BytesWritable, already does that, and BinaryWritable offers no advantage. On the other hand, BinaryWritable has a confusing getString() method, which, in misused, can cause unexpected result. The proposal here is to replace it with Hadoop BytesWritable. The issue was identified in HIVE-6367, serving as a follow-up JIRA. Replace BinaryWritable with BytesWritable in Parquet serde -- Key: HIVE-7022 URL: https://issues.apache.org/jira/browse/HIVE-7022 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-7022.patch Currently ParquetHiveSerde uses BinaryWritable to enclose bytes read from Parquet data. However, existing Hadoop class, BytesWritable, already does that, and BinaryWritable offers no advantage. On the other hand, BinaryWritable has a confusing getString() method, which, if misused, can cause unexpected result. The proposal here is to replace it with Hadoop BytesWritable. The issue was identified in HIVE-6367, serving as a follow-up JIRA. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7022) Replace BinaryWritable with BytesWritable in Parquet serde
[ https://issues.apache.org/jira/browse/HIVE-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7022: -- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks to Brock for the review. Replace BinaryWritable with BytesWritable in Parquet serde -- Key: HIVE-7022 URL: https://issues.apache.org/jira/browse/HIVE-7022 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.14.0 Attachments: HIVE-7022.patch Currently ParquetHiveSerde uses BinaryWritable to enclose bytes read from Parquet data. However, existing Hadoop class, BytesWritable, already does that, and BinaryWritable offers no advantage. On the other hand, BinaryWritable has a confusing getString() method, which, if misused, can cause unexpected result. The proposal here is to replace it with Hadoop BytesWritable. The issue was identified in HIVE-6367, serving as a follow-up JIRA. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6584) Add HiveHBaseTableSnapshotInputFormat
[ https://issues.apache.org/jira/browse/HIVE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029497#comment-14029497 ] Nick Dimiduk commented on HIVE-6584: Thanks for the insightful comments, [~tenggyut]. bq. 1. HBaseStorageHandler.getInputFormatClass(): i am afraid that the returned inputformat will always be HiveHBaseTabelInputFormat (at least according to my test) I was afraid of this in my initial design thinking, but my experiments proved otherwise. Can you elaborate on your tests? I'd like to reproduce this issue if I'm able. bq. 2. in the method HBaseStorageHandler.preCreateTable, hive will check whether the HBase table exist or not, regardless the external table that hive gonna create is based on actual table or a snapshot. I haven't yet looked at the use-case of consuming a snapshot for which there is no table in HBase. I planned to approach this kind of feature in follow-on work; the goal here is to get just the basics working. bq. 3, 4 [snip] These are both true. bq. So I suggest adding a subclass of HBaseStorageHandler(and other necessary classes) ,say HBaseSnapshotStorageHandler, to deal with the hbase snapshot situation. A goal of this patch is to be able to query snapshots created from online tables already registered with Hive using the HBaseStorageHandler. Implementing HBaseSnapshotStorageHandler requires a separate table registration for the snapshot. I think that's undesirable. Regarding the hbase snapshot situation, let's make it better on the HBase side. What do you recommend? Add HiveHBaseTableSnapshotInputFormat - Key: HIVE-6584 URL: https://issues.apache.org/jira/browse/HIVE-6584 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: 0.14.0 Attachments: HIVE-6584.0.patch, HIVE-6584.1.patch, HIVE-6584.2.patch, HIVE-6584.3.patch HBASE-8369 provided mapreduce support for reading from HBase table snapshots. 
This allows a MR job to consume a stable, read-only view of an HBase table directly off of HDFS. Bypassing the online region server API provides a nice performance boost for the full scan. HBASE-10642 is backporting that feature to 0.94/0.96 and also adding a {{mapred}} implementation. Once that's available, we should add an input format. A follow-on patch could work out how to integrate this functionality into the StorageHandler, similar to how HIVE-6473 integrates the HFileOutputFormat into existing table definitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7195) Improve Metastore performance
[ https://issues.apache.org/jira/browse/HIVE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029525#comment-14029525 ] Sergey Shelukhin commented on HIVE-7195: Yeah, that's what all recently added metastore APIs do. Improve Metastore performance - Key: HIVE-7195 URL: https://issues.apache.org/jira/browse/HIVE-7195 Project: Hive Issue Type: Improvement Reporter: Brock Noland Priority: Critical Even with direct SQL, which significantly improves MS performance, some operations take a considerable amount of time when there are many partitions on a table. Specifically, I believe the issues are:
* When a client gets all partitions we do not send them an iterator; we create a collection of all the data and then pass the object over the network in total
* Operations which require looking up data on the NN can still be slow since there is no cache of information and it's done in a serial fashion
* Perhaps a tangent, but our client timeout is quite dumb. The client will time out and the server has no idea the client is gone. We should use deadlines, i.e. pass the timeout to the server so it can calculate that the client has expired.
-- This message was sent by Atlassian JIRA (v6.2#6252)
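The deadline idea in the last bullet is small enough to sketch: rather than each side keeping a private timeout, the client ships an absolute deadline with the request so the server can tell when the client has given up. The class and method names below are invented for illustration and are not the metastore's actual API:

```java
public class DeadlineSketch {
    // Client side: convert a relative timeout into an absolute deadline that
    // travels with the request (clock skew handling omitted for brevity).
    static long deadlineFor(long clientNowMillis, long timeoutMillis) {
        return clientNowMillis + timeoutMillis;
    }

    // Server side: before doing (or continuing) expensive work, check whether
    // the client has already timed out and abandon the request if so.
    static boolean clientExpired(long deadlineMillis, long serverNowMillis) {
        return serverNowMillis >= deadlineMillis;
    }
}
```

The server can run this check between phases of a long partition listing, so it stops burning cycles on responses nobody is waiting for.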
[jira] [Commented] (HIVE-7020) NPE when there is no plan file.
[ https://issues.apache.org/jira/browse/HIVE-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029529#comment-14029529 ] Jason Dere commented on HIVE-7020: -- Hi [~azuryy], just curious if you had any more information about this one. Was this with HiveServer2 or CLIDriver? Was YARN uberized mode enabled (like HIVE-5857)? NPE when there is no plan file. --- Key: HIVE-7020 URL: https://issues.apache.org/jira/browse/HIVE-7020 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.0 Reporter: Fengdong Yu Hive throws NPE when there is no plan file. Exception message: {code} 2014-05-06 18:03:17,749 INFO [main] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: file:/tmp/test/hive_2014-05-06_18-02-58_539_232619201891510265-1/-mr-10001/8cf1c965-b173-4482-a016-4a51a74b9324/map.xml 2014-05-06 18:03:17,750 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:168) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) {code} I looked through the code, ql/exec/Utilities.java: {code} private static BaseWork 
getBaseWork(Configuration conf, String name) {
  ...
  } catch (FileNotFoundException fnf) {
    // happens. e.g.: no reduce work.
    LOG.info("No plan file found: " + path);
    return null;
  }
{code}
This code is called from HiveInputFormat.java:
{code}
protected void init(JobConf job) {
  mrwork = Utilities.getMapWork(job);
  pathToPartitionInfo = mrwork.getPathToPartitionInfo();
}
{code}
mrwork is null, so we get an NPE here. -- This message was sent by Atlassian JIRA (v6.2#6252)
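A defensive version of the caller would guard against the null return instead of dereferencing it blindly. The sketch below is illustrative only: it stands in for MapWork with plain objects and invented names rather than Hive's real classes, but shows the null-check shape that turns the bare NPE into a reportable condition:

```java
public class PlanGuard {
    // Mirrors Utilities.getMapWork returning null when the plan file is
    // absent (the FileNotFoundException branch above).
    static Object getMapWork(boolean planFileExists) {
        return planFileExists ? new Object() : null;
    }

    // The defensive init(): detect the missing plan and bail out with a
    // meaningful signal instead of hitting mrwork.getPathToPartitionInfo()
    // on a null reference.
    static boolean init(boolean planFileExists) {
        Object mrwork = getMapWork(planFileExists);
        if (mrwork == null) {
            return false; // caller can report "no plan file found" to the user
        }
        // ... calling mrwork.getPathToPartitionInfo() would be safe here ...
        return true;
    }
}
```

In the real code the guard would likely throw a descriptive RuntimeException naming the missing plan path, which is far easier to debug than an NPE deep in HiveInputFormat.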
[jira] [Commented] (HIVE-7215) Support predicate pushdown for null checks in ORCFile
[ https://issues.apache.org/jira/browse/HIVE-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029533#comment-14029533 ] Prasanth J commented on HIVE-7215: -- It reads the entire row group. However, ORC reads the row group even in the opposite case, where there are no nulls in the column within the row group. This can be improved by having a boolean flag/#nulls within the row group index. Support predicate pushdown for null checks in ORCFile - Key: HIVE-7215 URL: https://issues.apache.org/jira/browse/HIVE-7215 Project: Hive Issue Type: Improvement Reporter: Rohini Palaniswamy Came across this missing feature during discussion of PIG-3760. -- This message was sent by Atlassian JIRA (v6.2#6252)
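The skip decisions being debated can be modeled from per-row-group statistics. This is an illustrative sketch, not ORC's actual SearchArgument evaluator; ColStats and its fields are invented, with nonNullCount standing in for the proposed "#nulls within row group index" improvement:

```java
public class NullCheckPpd {
    // Per-row-group column statistics relevant to IS [NOT] NULL predicates.
    static class ColStats {
        final boolean hasNull;   // does this row group contain any null?
        final long nonNullCount; // number of non-null values recorded
        ColStats(boolean hasNull, long nonNullCount) {
            this.hasNull = hasNull;
            this.nonNullCount = nonNullCount;
        }
    }

    // IS NULL can skip a row group only when it provably contains no nulls.
    static boolean canSkipIsNull(ColStats s) {
        return !s.hasNull;
    }

    // IS NOT NULL can skip a row group only when it is entirely null,
    // i.e. the count-based statistic shows zero non-null values.
    static boolean canSkipIsNotNull(ColStats s) {
        return s.nonNullCount == 0;
    }
}
```

This captures Rohini's concern: with only a hasNull flag, a group with "a few nulls" can never be skipped for either predicate, so the pushdown degenerates to a full read unless value counts are also indexed.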
[jira] [Created] (HIVE-7226) Windowing Streaming mode causes NPE for empty partitions
Harish Butani created HIVE-7226: --- Summary: Windowing Streaming mode causes NPE for empty partitions Key: HIVE-7226 URL: https://issues.apache.org/jira/browse/HIVE-7226 Project: Hive Issue Type: Bug Reporter: Harish Butani Change in HIVE-7062 doesn't handle empty partitions properly. StreamingState is not correctly initialized for empty partition -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6928) Beeline should not chop off describe extended results by default
[ https://issues.apache.org/jira/browse/HIVE-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-6928: --- Attachment: HIVE-6928.2.patch Beeline should not chop off describe extended results by default -- Key: HIVE-6928 URL: https://issues.apache.org/jira/browse/HIVE-6928 Project: Hive Issue Type: Bug Components: CLI Reporter: Szehon Ho Assignee: Chinna Rao Lalam Attachments: HIVE-6928.1.patch, HIVE-6928.2.patch, HIVE-6928.patch By default, beeline truncates long results based on the console width like:
{code}
+-+--+
| col_name | |
+-+--+
| pat_id | string |
| score | float |
| acutes | float |
| | |
| Detailed Table Information | Table(tableName:refills, dbName:default, owner:hdadmin, createTime:1393882396, lastAccessTime:0, retention:0, sd:Sto |
+-+--+
5 rows selected (0.4 seconds)
{code}
This can be changed by !outputformat, but the default should behave better to give a better experience to the first-time beeline user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6928) Beeline should not chop off describe extended results by default
[ https://issues.apache.org/jira/browse/HIVE-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029550#comment-14029550 ] Chinna Rao Lalam commented on HIVE-6928: Reworked the patch. Thanks for the review, Xuefu Zhang. Beeline should not chop off describe extended results by default -- Key: HIVE-6928 URL: https://issues.apache.org/jira/browse/HIVE-6928 Project: Hive Issue Type: Bug Components: CLI Reporter: Szehon Ho Assignee: Chinna Rao Lalam Attachments: HIVE-6928.1.patch, HIVE-6928.2.patch, HIVE-6928.patch By default, beeline truncates long results based on the console width like:
{code}
+-+--+
| col_name | |
+-+--+
| pat_id | string |
| score | float |
| acutes | float |
| | |
| Detailed Table Information | Table(tableName:refills, dbName:default, owner:hdadmin, createTime:1393882396, lastAccessTime:0, retention:0, sd:Sto |
+-+--+
5 rows selected (0.4 seconds)
{code}
This can be changed by !outputformat, but the default should behave better to give a better experience to the first-time beeline user. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 22174: HIVE-6394 Implement Timestmap in ParquetSerde
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22174/ --- (Updated June 12, 2014, 6:23 p.m.) Review request for hive, Brock Noland, justin coffey, and Xuefu Zhang. Changes --- Rebase Bugs: HIVE-6394 https://issues.apache.org/jira/browse/HIVE-6394 Repository: hive-git Description --- This uses the Jodd library to convert java.sql.Timestamp type used by Hive into the {julian-day:nanos} format expected by parquet, and vice-versa. Diffs (updated) - data/files/parquet_types.txt 0be390b pom.xml 2b91846 ql/pom.xml 13c477a ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 218c007 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java 29f7e11 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java 57161d8 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 4cad1cb ql/src/java/org/apache/hadoop/hive/ql/io/parquet/utils/NanoTimeUtils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 6169353 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetTimestampUtils.java PRE-CREATION ql/src/test/queries/clientnegative/parquet_timestamp.q 4ef36fa ql/src/test/queries/clientpositive/parquet_types.q 5d6333c ql/src/test/results/clientpositive/parquet_types.q.out c23f7f1 Diff: https://reviews.apache.org/r/22174/diff/ Testing --- Unit tests the new libraries, and also added timestamp data in the parquet_types q-test. Thanks, Szehon Ho
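The {julian-day:nanos} encoding mentioned in the review description can be illustrated without the Jodd library. This is a simplified sketch (UTC only, non-negative timestamps, leap seconds ignored), not the actual NanoTimeUtils implementation; the one factual constant assumed is that 2440588 is the Julian Day Number containing 1970-01-01:

```java
public class NanoTimeSketch {
    // Julian Day Number of the Unix epoch (1970-01-01 UTC).
    static final int JULIAN_DAY_OF_EPOCH = 2440588;
    static final long MILLIS_PER_DAY = 86_400_000L;

    // Whole days since the epoch, shifted onto the Julian Day scale.
    static int julianDay(long epochMillis) {
        return (int) (epochMillis / MILLIS_PER_DAY) + JULIAN_DAY_OF_EPOCH;
    }

    // Remaining time within the day, scaled from milliseconds to nanoseconds;
    // Parquet stores this alongside the julian day in its INT96 timestamps.
    static long nanosOfDay(long epochMillis) {
        return (epochMillis % MILLIS_PER_DAY) * 1_000_000L;
    }
}
```

Decoding reverses the split: epochMillis = (julianDay - 2440588) * 86_400_000 + nanosOfDay / 1_000_000, which is the round trip a serde like this has to get right in both directions.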
[jira] [Updated] (HIVE-6394) Implement Timestmap in ParquetSerde
[ https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6394: Attachment: HIVE-6394.7.patch Rebase after Xuefu's commit Implement Timestmap in ParquetSerde --- Key: HIVE-6394 URL: https://issues.apache.org/jira/browse/HIVE-6394 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Jarek Jarcec Cecho Assignee: Szehon Ho Labels: Parquet Attachments: HIVE-6394.2.patch, HIVE-6394.3.patch, HIVE-6394.4.patch, HIVE-6394.5.patch, HIVE-6394.6.patch, HIVE-6394.6.patch, HIVE-6394.7.patch, HIVE-6394.patch This JIRA is to implement timestamp support in Parquet SerDe. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6584) Add HiveHBaseTableSnapshotInputFormat
[ https://issues.apache.org/jira/browse/HIVE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029572#comment-14029572 ] Hive QA commented on HIVE-6584: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12649918/HIVE-6584.3.patch {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 5610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_external_table_ppd org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_storage_queries org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/446/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/446/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-446/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12649918 Add HiveHBaseTableSnapshotInputFormat - Key: HIVE-6584 URL: https://issues.apache.org/jira/browse/HIVE-6584 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: 0.14.0 Attachments: HIVE-6584.0.patch, HIVE-6584.1.patch, HIVE-6584.2.patch, HIVE-6584.3.patch HBASE-8369 provided mapreduce support for reading from HBase table snapshots. This allows a MR job to consume a stable, read-only view of an HBase table directly off of HDFS. Bypassing the online region server API provides a nice performance boost for the full scan. HBASE-10642 is backporting that feature to 0.94/0.96 and also adding a {{mapred}} implementation. Once that's available, we should add an input format. A follow-on patch could work out how to integrate this functionality into the StorageHandler, similar to how HIVE-6473 integrates the HFileOutputFormat into existing table definitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7215) Support predicate pushdown for null checks in ORCFile
[ https://issues.apache.org/jira/browse/HIVE-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029592#comment-14029592 ] Rohini Palaniswamy commented on HIVE-7215: -- bq. However, ORC reads the row group even in the opposite case, the case where there are no nulls in column within the row group. That is what my concern was. This amounts to no predicate pushdown as it is going to read all row groups irrespective of whether there are nulls. So will leave this jira open to address that. bq. if col is completely null in a row group, ORC predicate pushdown evaluates to true (based on null statistics) and reads the row group. If there is a non null check, will the row group which has all nulls be ignored? Support predicate pushdown for null checks in ORCFile - Key: HIVE-7215 URL: https://issues.apache.org/jira/browse/HIVE-7215 Project: Hive Issue Type: Improvement Reporter: Rohini Palaniswamy Came across this missing feature during discussion of PIG-3760. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7224) Set incremental printing to true by default in Beeline
[ https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-7224: --- Description: See HIVE-7221. By default beeline tries to buffer the entire output relation before printing it on stdout. This can cause OOM when the output relation is large. However, beeline has the option of incremental prints. We should keep that as the default. was:See HIVE-7221. Set incremental printing to true by default in Beeline -- Key: HIVE-7224 URL: https://issues.apache.org/jira/browse/HIVE-7224 Project: Hive Issue Type: Bug Components: Clients, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.0 Attachments: HIVE-7224.1.patch See HIVE-7221. By default beeline tries to buffer the entire output relation before printing it on stdout. This can cause OOM when the output relation is large. However, beeline has the option of incremental prints. We should keep that as the default. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7226) Windowing Streaming mode causes NPE for empty partitions
[ https://issues.apache.org/jira/browse/HIVE-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-7226: Attachment: HIVE-7226.1.patch Windowing Streaming mode causes NPE for empty partitions Key: HIVE-7226 URL: https://issues.apache.org/jira/browse/HIVE-7226 Project: Hive Issue Type: Bug Reporter: Harish Butani Attachments: HIVE-7226.1.patch Change in HIVE-7062 doesn't handle empty partitions properly. StreamingState is not correctly initialized for empty partition -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6892) Permission inheritance issues
[ https://issues.apache.org/jira/browse/HIVE-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6892: Labels: TODOC14 (was: ) Permission inheritance issues - Key: HIVE-6892 URL: https://issues.apache.org/jira/browse/HIVE-6892 Project: Hive Issue Type: Bug Components: Security Affects Versions: 0.13.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC14 *HDFS Background* * When a file or directory is created, its owner is the user identity of the client process, and its group is inherited from parent (the BSD rule). Permissions are taken from default umask. Extended Acl's are taken from parent unless they are set explicitly. *Goals* To reduce need to set fine-grain file security props after every operation, users may want the following Hive warehouse file/dir to auto-inherit security properties from their directory parents: * Directories created by new table/partition/bucket * Files added to tables via load/insert * Table directories exported/imported (open question of whether exported table inheriting perm from new parent needs another flag) What may be inherited: * Basic file permission * Groups (already done by HDFS for new directories) * Extended ACL's (already done by HDFS for new directories) *Behavior* * When hive.warehouse.subdir.inherit.perms flag is enabled in Hive, Hive will try to do all above inheritances. In the future, we can add more flags for more finer-grained control. * Failure by Hive to inherit will not cause operation to fail. Rule of thumb of when security-prop inheritance will happen is the following: ** To run chmod, a user must be the owner of the file, or else a super-user. ** To run chgrp, a user must be the owner of files, or else a super-user. ** Hence, user that hive runs as (either 'hive' or the logged-in user in case of impersonation), must be super-user or owner of the file whose security properties are going to be changed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7215) Support predicate pushdown for null checks in ORCFile
[ https://issues.apache.org/jira/browse/HIVE-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029632#comment-14029632 ] Prasanth J commented on HIVE-7215: -- bq. If there is a non-null check, will the row group which has all nulls be ignored? Yes, that's correct. HIVE-4639 addresses the improvement that I mentioned (having a boolean flag within the index). Support predicate pushdown for null checks in ORCFile - Key: HIVE-7215 URL: https://issues.apache.org/jira/browse/HIVE-7215 Project: Hive Issue Type: Improvement Reporter: Rohini Palaniswamy Came across this missing feature during discussion of PIG-3760. -- This message was sent by Atlassian JIRA (v6.2#6252)
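The idea behind the feature is straightforward: ORC keeps per-row-group column statistics, and once those statistics record null information, a row group whose statistics contradict an IS NULL / IS NOT NULL predicate can be skipped without being read. A self-contained sketch of that skip decision — `ColumnStats` and the method names are invented stand-ins, not ORC's actual SearchArgument or index classes:

```java
public class NullPpdSketch {
    // Minimal stand-in for ORC's per-row-group column statistics.
    static final class ColumnStats {
        final boolean hasNull;    // at least one null in the row group
        final long nonNullCount;  // number of non-null values in the row group
        ColumnStats(boolean hasNull, long nonNullCount) {
            this.hasNull = hasNull;
            this.nonNullCount = nonNullCount;
        }
    }

    // "col IS NOT NULL": skip the row group when every value is null.
    static boolean canSkipForIsNotNull(ColumnStats s) {
        return s.nonNullCount == 0;
    }

    // "col IS NULL": skip the row group when no value is null.
    static boolean canSkipForIsNull(ColumnStats s) {
        return !s.hasNull;
    }

    public static void main(String[] args) {
        System.out.println(canSkipForIsNotNull(new ColumnStats(true, 0)));  // all-null group: true
        System.out.println(canSkipForIsNull(new ColumnStats(false, 100))); // no-null group: true
    }
}
```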
[jira] [Commented] (HIVE-7226) Windowing Streaming mode causes NPE for empty partitions
[ https://issues.apache.org/jira/browse/HIVE-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029646#comment-14029646 ] Ashutosh Chauhan commented on HIVE-7226: +1 Windowing Streaming mode causes NPE for empty partitions Key: HIVE-7226 URL: https://issues.apache.org/jira/browse/HIVE-7226 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7226.1.patch Change in HIVE-7062 doesn't handle empty partitions properly. StreamingState is not correctly initialized for empty partition -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7226) Windowing Streaming mode causes NPE for empty partitions
[ https://issues.apache.org/jira/browse/HIVE-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7226: --- Assignee: Harish Butani Status: Patch Available (was: Open) Windowing Streaming mode causes NPE for empty partitions Key: HIVE-7226 URL: https://issues.apache.org/jira/browse/HIVE-7226 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7226.1.patch Change in HIVE-7062 doesn't handle empty partitions properly. StreamingState is not correctly initialized for empty partition -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Documentation Policy
Thank you guys! This is great work. On Wed, Jun 11, 2014 at 6:20 PM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote: Going through the issues, I think overall Lefty did an awesome job catching and documenting most of them in time. Following are some of the 0.13 and 0.14 ones I found which either do not have documentation or have an outdated one, and probably need one to be consumable. Contributors, feel free to remove the label if you disagree. *TODOC13:* https://issues.apache.org/jira/browse/HIVE-6827?jql=project%20%3D%20HIVE%20AND%20labels%20%3D%20TODOC13%20AND%20status%20in%20(Resolved%2C%20Closed) *TODOC14:* https://issues.apache.org/jira/browse/HIVE-6999?jql=project%20%3D%20HIVE%20AND%20labels%20%3D%20TODOC14%20AND%20status%20in%20(Resolved%2C%20Closed) I'll continue digging through the queue going backwards to 0.12 and 0.11 and see if I find similar stuff there as well. On Wed, Jun 11, 2014 at 10:36 AM, kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com wrote: Feel free to label such jiras with this keyword and ask the contributors for more information if you need any. Cool. I'll start chugging through the queue today adding labels as apt. On Tue, Jun 10, 2014 at 9:45 PM, Thejas Nair the...@hortonworks.com wrote: Shall we lump 0.13.0 and 0.13.1 doc tasks as TODOC13? Sounds good to me. -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- Swarnim
[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-6938: --- Attachment: HIVE-6938.3.patch Very sorry for not reviewing this... I am re-uploading the patch to see the current result. Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Assignee: Daniel Weeks Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch, HIVE-6938.2.patch, HIVE-6938.3.patch, HIVE-6938.3.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029709#comment-14029709 ] Brock Noland commented on HIVE-6938: I am +1 pending tests Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Assignee: Daniel Weeks Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch, HIVE-6938.2.patch, HIVE-6938.3.patch, HIVE-6938.3.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set
[ https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029722#comment-14029722 ] Naveen Gangam commented on HIVE-7200: - It makes sense. As a byproduct, unless we go out of the way to avoid this, when a query results in ZERO rows, we will see something like this (IMHO this is more readable than the current output) +--+ +--+ instead of +--+ Will post full results in the next comment. Beeline output displays column heading even if --showHeader=false is set Key: HIVE-7200 URL: https://issues.apache.org/jira/browse/HIVE-7200 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7200.1.patch, HIVE-7200.2.patch A few minor/cosmetic issues with the beeline CLI. 1) The tool prints the column headers despite --showHeader being set to false. This property only seems to affect the subsequent header information that gets printed based on the value of the headerInterval property (default value is 100). 2) When showHeader is true and headerInterval > 0, the header after the first interval gets printed after headerInterval - 1 rows. The code seems to count the initial header as a row, if you will. 3) The table footer (the line that closes the table) does not get printed if showHeader is false. I think the table should get closed irrespective of whether it prints the header or not. 
{code} 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (3.998 seconds) 0: jdbc:hive2://localhost:1 !set headerInterval 2 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| +--+ | val | +--+ | f| | T| +--+ | val | +--+ | F| | 0| +--+ | val | +--+ | 1| +--+ 6 rows selected (0.691 seconds) 0: jdbc:hive2://localhost:1 !set showHeader false 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| 6 rows selected (1.728 seconds) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
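Issue 2 in the report above is a plain off-by-one: if the row counter is also bumped when the initial header is printed, the first repeated header lands one row early. A self-contained simulation of both behaviors (hypothetical code, not BeeLine's actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class HeaderIntervalSim {
    // Returns the 0-based row positions before which a repeat header is printed.
    // If countHeaderAsRow is true (the reported bug), the initial header bumps
    // the counter, so the first repeat lands after interval - 1 data rows.
    static List<Integer> headerPositions(int rows, int interval, boolean countHeaderAsRow) {
        List<Integer> positions = new ArrayList<>();
        int counter = countHeaderAsRow ? 1 : 0; // initial header miscounted as a row
        for (int row = 0; row < rows; row++) {
            if (counter > 0 && counter % interval == 0) {
                positions.add(row);
            }
            counter++;
        }
        return positions;
    }

    public static void main(String[] args) {
        // Matches the transcript above: with interval 2 over 6 rows, the buggy
        // version repeats the header after 1, 3 and 5 rows; the fixed counting
        // would repeat it after 2 and 4 rows.
        System.out.println(headerPositions(6, 2, true));  // [1, 3, 5]
        System.out.println(headerPositions(6, 2, false)); // [2, 4]
    }
}
```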
[jira] [Commented] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set
[ https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029727#comment-14029727 ] Naveen Gangam commented on HIVE-7200: - {code} 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (3.729 seconds) 0: jdbc:hive2://localhost:1 select * from employees; +---+-+---+-+--+--++ | name | salary | subordinates | deductions | address | country | state | +---+-+---+-+--+--++ +---+-+---+-+--+--++ No rows selected (2 seconds) 0: jdbc:hive2://localhost:1 !set showHeader false 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (0.882 seconds) 0: jdbc:hive2://localhost:1 select * from employees; +---+-+---+-+--+--++ +---+-+---+-+--+--++ No rows selected (1.914 seconds) 0: jdbc:hive2://localhost:1 !set headerInterval 2 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (1.923 seconds) 0: jdbc:hive2://localhost:1 select * from employees; +---+-+---+-+--+--++ +---+-+---+-+--+--++ No rows selected (6.866 seconds) 0: jdbc:hive2://localhost:1 !set showHeader true 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| +--+ | val | +--+ | T| | F| +--+ | val | +--+ | 0| | 1| +--+ 6 rows selected (2.447 seconds) 0: jdbc:hive2://localhost:1 select * from employees; +---+-+---+-+--+--++ | name | salary | subordinates | deductions | address | country | state | +---+-+---+-+--+--++ +---+-+---+-+--+--++ No rows selected (1.509 seconds) {code} Beeline output displays column heading even if --showHeader=false is set Key: HIVE-7200 URL: https://issues.apache.org/jira/browse/HIVE-7200 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7200.1.patch, HIVE-7200.2.patch A few minor/cosmetic 
issues with the beeline CLI. 1) The tool prints the column headers despite --showHeader being set to false. This property only seems to affect the subsequent header information that gets printed based on the value of the headerInterval property (default value is 100). 2) When showHeader is true and headerInterval > 0, the header after the first interval gets printed after headerInterval - 1 rows. The code seems to count the initial header as a row, if you will. 3) The table footer (the line that closes the table) does not get printed if showHeader is false. I think the table should get closed irrespective of whether it prints the header or not. {code} 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (3.998 seconds) 0: jdbc:hive2://localhost:1 !set headerInterval 2 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| +--+ | val | +--+ | f| | T| +--+ | val | +--+ | F| | 0| +--+ | val | +--+ | 1| +--+ 6 rows selected (0.691 seconds) 0: jdbc:hive2://localhost:1 !set showHeader false 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| 6 rows selected (1.728 seconds) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7200) Beeline output displays column heading even if --showHeader=false is set
[ https://issues.apache.org/jira/browse/HIVE-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-7200: Attachment: HIVE-7200.2.patch Beeline output displays column heading even if --showHeader=false is set Key: HIVE-7200 URL: https://issues.apache.org/jira/browse/HIVE-7200 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Priority: Minor Fix For: 0.14.0 Attachments: HIVE-7200.1.patch, HIVE-7200.2.patch A few minor/cosmetic issues with the beeline CLI. 1) The tool prints the column headers despite --showHeader being set to false. This property only seems to affect the subsequent header information that gets printed based on the value of the headerInterval property (default value is 100). 2) When showHeader is true and headerInterval > 0, the header after the first interval gets printed after headerInterval - 1 rows. The code seems to count the initial header as a row, if you will. 3) The table footer (the line that closes the table) does not get printed if showHeader is false. I think the table should get closed irrespective of whether it prints the header or not. {code} 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| +--+ 6 rows selected (3.998 seconds) 0: jdbc:hive2://localhost:1 !set headerInterval 2 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| +--+ | val | +--+ | f| | T| +--+ | val | +--+ | F| | 0| +--+ | val | +--+ | 1| +--+ 6 rows selected (0.691 seconds) 0: jdbc:hive2://localhost:1 !set showHeader false 0: jdbc:hive2://localhost:1 select * from stringvals; +--+ | val | +--+ | t| | f| | T| | F| | 0| | 1| 6 rows selected (1.728 seconds) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 22329: HIVE-7190. WebHCat launcher task failure can cause two concurrent user jobs to run
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22329/#review45536 --- hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonUtils.java https://reviews.apache.org/r/22329/#comment80379 Is there a reason org.apache.hadoop.util.ClassUtil.findContainingJar(Class<?> clazz) won't work? - Eugene Koifman On June 12, 2014, 12:04 a.m., Ivan Mitic wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22329/ --- (Updated June 12, 2014, 12:04 a.m.) Review request for hive. Repository: hive-git Description --- The approach in the patch is similar to what Oozie does to handle this situation. Specifically, all child map jobs get tagged with the launcher MR job id. On launcher task restart, the launcher queries the RM for the list of jobs that have the tag and kills them. After that it moves on to start the same child job again. Again, similarly to what Oozie does, a new templeton.job.launch.time property is introduced that captures the launcher job submit timestamp and is later used to reduce the search window when the RM is queried. To validate the patch, you will need to add the webhcat shim jars to templeton.libjars, as the webhcat launcher now also has a dependency on hadoop shims. I have noticed that in the case of the SqoopDelegator, webhcat currently does not set the MR delegation token when the optionsFile flag is used. This also creates the problem in this scenario. This looks like something that should be handled via a separate Jira. 
Diffs - hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/HiveDelegator.java 23b1c4f hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JarDelegator.java 41b1dc5 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/LauncherDelegator.java 04a5c6f hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/PigDelegator.java 04e061d hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SqoopDelegator.java adcd917 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java a6355a6 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/LaunchMapper.java 556ee62 hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonUtils.java fff4b68 hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/tool/TestTempletonUtils.java 8b46d38 shims/0.20S/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim20S.java d3552c1 shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java 5a728b2 shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 299e918 Diff: https://reviews.apache.org/r/22329/diff/ Testing --- I have validated that MR, Pig and Hive jobs do get tagged appropriately. I have also validated that previous child jobs do get killed on RM failover/task failure. Thanks, Ivan Mitic
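The tag-and-kill recovery described in this review can be simulated without a cluster: tag each child job with the launcher's job id, and on launcher-task restart look up everything carrying that tag and kill it before resubmitting. A toy model of that bookkeeping — all class and method names here are invented; the real patch queries the ResourceManager through the Hadoop shims:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TagAndKillSim {
    // Minimal stand-in for the RM's view of running jobs and their tags.
    static final class Job {
        final String id;
        final Set<String> tags;
        boolean killed;
        Job(String id, String... tags) {
            this.id = id;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    // On launcher-task restart: find every child job tagged with the launcher's
    // job id and kill it, so the resubmitted child cannot run concurrently
    // with an orphan from the failed attempt.
    static List<String> killTagged(List<Job> cluster, String launcherId) {
        List<String> killed = new ArrayList<>();
        for (Job j : cluster) {
            if (j.tags.contains(launcherId) && !j.killed) {
                j.killed = true;
                killed.add(j.id);
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        List<Job> cluster = Arrays.asList(
            new Job("job_01", "launcher_42"), // orphaned child of the failed attempt
            new Job("job_02"));               // unrelated job, left untouched
        System.out.println(killTagged(cluster, "launcher_42")); // [job_01]
    }
}
```

The templeton.job.launch.time timestamp mentioned in the description would simply narrow which jobs get inspected before this tag check.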
[jira] [Updated] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-7209: Status: Patch Available (was: Open) allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7209.1.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 22033: HIVE-7094: Separate static and dynamic partitioning implementations from FileRecordWriterContainer.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22033/#review45537 --- hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicFileRecordWriterContainer.java https://reviews.apache.org/r/22033/#comment80380 Please make it clear in the comment that dynamic refers to dynamic partitions. hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicFileRecordWriterContainer.java https://reviews.apache.org/r/22033/#comment80381 Is this the result of automated formatting or something that you're doing by hand? hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticFileRecordWriterContainer.java https://reviews.apache.org/r/22033/#comment80382 Please make it clear in the comment that static refers to static partitions. Also, does it make sense to change the name to StaticPartitionFileRecordWriterContainer? Extremely verbose but it gets the point across and avoids confusion. - Carl Steinbach On May 29, 2014, 7:33 p.m., David Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22033/ --- (Updated May 29, 2014, 7:33 p.m.) Review request for hive. Bugs: HIVE-7094 https://issues.apache.org/jira/browse/HIVE-7094 Repository: hive-git Description --- HIVE-7093: Separate static and dynamic partitioning implementations from FileRecordWriterContainer. Diffs - hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicFileRecordWriterContainer.java PRE-CREATION hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java e9ca263abade20b7423ad98695807a60ab957ead hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java b55a05528d5a4eed114b5628697cf5a60f6c6cbc hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticFileRecordWriterContainer.java PRE-CREATION Diff: https://reviews.apache.org/r/22033/diff/ Testing --- Thanks, David Chen
[jira] [Updated] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7094: - Status: Open (was: Patch Available) [~davidzchen]: I left some comments on rb. Thanks. Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
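The refactoring plan above is the classic template-method shape: the abstract base keeps the shared write() flow and defers the partition-specific lookup to an abstract method each subclass implements. A minimal skeleton of that shape (stub bodies and simplified signatures, not the real HCatalog code):

```java
public class WriterRefactorSketch {
    // Abstract base holds the common write() path and calls down for the
    // partition-specific parts, as the JIRA proposes.
    abstract static class FileRecordWriterContainer {
        final String write(String record) {
            // Shared logic formerly guarded by "if (dynamicPartitioning)".
            return "wrote " + record + " via " + localWriterFor(record);
        }
        // Subclasses supply the local RecordWriter (and, in the real code,
        // the ObjectInspector, SerDe, and OutputJobInfo) for this record.
        abstract String localWriterFor(String record);
    }

    static class StaticFileRecordWriterContainer extends FileRecordWriterContainer {
        String localWriterFor(String record) {
            return "static-partition writer"; // one fixed writer for the whole task
        }
    }

    static class DynamicFileRecordWriterContainer extends FileRecordWriterContainer {
        String localWriterFor(String record) {
            // Chosen per record, based on its dynamic partition values.
            return "writer for partition of " + record;
        }
    }

    public static void main(String[] args) {
        FileRecordWriterContainer sp = new StaticFileRecordWriterContainer();
        FileRecordWriterContainer dp = new DynamicFileRecordWriterContainer();
        System.out.println(sp.write("r1"));
        System.out.println(dp.write("r1"));
    }
}
```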
[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029847#comment-14029847 ] Hive QA commented on HIVE-7065: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12649923/HIVE-7065.2.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/447/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/447/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-447/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12649923 Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
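In deployment terms, the fix implies listing hive.execution.engine in the comma-separated templeton.hive.properties value of webhcat-site.xml. A sketch of such an entry — the metastore host and the exact property list are illustrative placeholders, not values from this issue:

```xml
<property>
  <name>templeton.hive.properties</name>
  <value>hive.metastore.uris=thrift://metastore-host:9083,hive.execution.engine=tez</value>
</property>
```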
[jira] [Updated] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-7209: Attachment: HIVE-7209.2.patch HIVE-7209.2.patch - Addressing Ashutosh's suggestion of avoiding an additional interface. allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-7105: - Assignee: Gopal V (was: Jitendra Nath Pandey) Enable ReduceRecordProcessor to generate VectorizedRowBatches - Key: HIVE-7105 URL: https://issues.apache.org/jira/browse/HIVE-7105 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Rajesh Balamohan Assignee: Gopal V Fix For: 0.14.0 Attachments: HIVE-7105.1.patch Currently, ReduceRecordProcessor sends one key,value pair at a time to its operator pipeline. It would be beneficial to send VectorizedRowBatch to downstream operators. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029851#comment-14029851 ] Eugene Koifman commented on HIVE-7065: -- none of the failed tests are related to WebHCat Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.
[ https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029867#comment-14029867 ] Jayesh commented on HIVE-7100: -- Proposal 2: we keep HIVE-6469 (hive.warehouse.data.skipTrash=true/false) and introduce a new hive.warehouse.skipTrash.control=client/admin, which lets the client override the default or admin setting for hive.warehouse.data.skipTrash when set to client, and vice versa when set to admin. Please let us know what you think. Any suggestions? Thanks, Jay Users of hive should be able to specify skipTrash when dropping tables. --- Key: HIVE-7100 URL: https://issues.apache.org/jira/browse/HIVE-7100 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Ravi Prakash Assignee: Jayesh Attachments: HIVE-7100.patch Users of our clusters are often running up against their quota limits because of Hive tables. When they drop tables, they have to then manually delete the files from HDFS using skipTrash. This is cumbersome and unnecessary. We should enable users to skipTrash directly when dropping tables. We should also be able to provide this functionality without polluting SQL syntax. -- This message was sent by Atlassian JIRA (v6.2#6252)
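The precedence implied by Proposal 2 can be pinned down in a few lines: the control property decides whether a client-supplied value for hive.warehouse.data.skipTrash wins over the admin default. A sketch of that resolution (method name hypothetical; this encodes the proposal's logic, not shipped Hive behavior):

```java
public class SkipTrashPolicy {
    // hive.warehouse.data.skipTrash supplies the default/admin value;
    // hive.warehouse.skipTrash.control decides who may set it.
    static boolean effectiveSkipTrash(boolean adminDefault,
                                      String control,          // "client" or "admin"
                                      Boolean clientOverride)  // null if client set nothing
    {
        if ("client".equals(control) && clientOverride != null) {
            return clientOverride; // client may override the admin default
        }
        return adminDefault;       // control=admin: client setting is ignored
    }

    public static void main(String[] args) {
        System.out.println(effectiveSkipTrash(false, "client", true)); // true
        System.out.println(effectiveSkipTrash(false, "admin", true));  // false
    }
}
```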
[jira] [Updated] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7105: -- Attachment: HIVE-7105.2.patch Rebased to trunk and with the additional changes to RowObjectInspectors. This still respects tagging, but it might be almost impossible to tag vectorized row batches on the operator side. Enable ReduceRecordProcessor to generate VectorizedRowBatches - Key: HIVE-7105 URL: https://issues.apache.org/jira/browse/HIVE-7105 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Rajesh Balamohan Assignee: Gopal V Fix For: 0.14.0 Attachments: HIVE-7105.1.patch, HIVE-7105.2.patch Currently, ReduceRecordProcessor sends one key,value pair at a time to its operator pipeline. It would be beneficial to send VectorizedRowBatch to downstream operators. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7201) Fix TestHiveConf#testConfProperties test case
[ https://issues.apache.org/jira/browse/HIVE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankit Thapar updated HIVE-7201: Attachment: HIVE-7201-2.patch This is the correct patch; the previous one was rebased to trunk, while this one is rebased to the latest branch-0.13. Fix TestHiveConf#testConfProperties test case - Key: HIVE-7201 URL: https://issues.apache.org/jira/browse/HIVE-7201 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Pankit Thapar Priority: Minor Attachments: HIVE-7201-1.patch, HIVE-7201-2.patch, HIVE-7201.patch CHANGE 1: TEST CASE : The intention of TestHiveConf#testConfProperties() is to test that HiveConf properties are set with the expected priority. Each HiveConf object is initialized as follows: 1) Hadoop configuration properties are applied. 2) ConfVar properties with non-null values are overlayed. 3) hive-site.xml properties are overlayed. ISSUE : The mapreduce-related configurations are loaded by JobConf and not Configuration. The current test tries to get configuration properties like HADOOPNUMREDUCERS (mapred.job.reduces) from the Configuration class. But these mapreduce-related properties are loaded by the JobConf class from mapred-default.xml. DETAILS : LINE 63 : checkHadoopConf(ConfVars.HADOOPNUMREDUCERS.varname, 1); --fails Because, private void checkHadoopConf(String name, String expectedHadoopVal) { Assert.assertEquals(expectedHadoopVal, new Configuration().get(name)); Second parameter is null, since it's the JobConf class and not the Configuration class that initializes mapred-default values. } Code that loads mapreduce resources is in ConfigUtil, and JobConf makes a call like this (in a static block): public class JobConf extends Configuration { private static final Log LOG = LogFactory.getLog(JobConf.class); static{ ConfigUtil.loadResources(); -- loads mapreduce related resources (mapreduce-default.xml) } . 
} Please note, the test case assertion works fine if the HiveConf() constructor is called before this assertion, since HiveConf() triggers JobConf(), which sets the default values of the properties pertaining to mapreduce. This is why there won't be any failures if testHiveSitePath() is run before testConfProperties(), as that would load the mapreduce properties into the config properties. FIX: Instead of using a Configuration object, we can use a JobConf object to get the default values used by hadoop/mapreduce. CHANGE 2: In TestHiveConf#testHiveSitePath(), the static method getHiveSiteLocation() should be invoked statically instead of through an object. -- This message was sent by Atlassian JIRA (v6.2#6252)
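The class-initialization ordering that makes this test order-dependent is easy to reproduce without Hadoop: a static block in one class (standing in for JobConf's static ConfigUtil.loadResources() call) populates defaults that another reader (standing in for Configuration) only sees after the first class has been initialized. The names below are invented for the demonstration:

```java
import java.util.HashMap;
import java.util.Map;

public class StaticInitOrderDemo {
    // Stand-in for the shared default table that Hadoop's Configuration reads.
    static final Map<String, String> DEFAULTS = new HashMap<>();
    static String get(String key) { return DEFAULTS.get(key); }

    // Stand-in for JobConf: its static block registers the mapred defaults,
    // just as JobConf's static ConfigUtil.loadResources() call does.
    static class JobConfLike {
        static { DEFAULTS.put("mapred.job.reduces", "1"); }
    }

    public static void main(String[] args) {
        // Before JobConfLike is initialized, the "Configuration" sees nothing:
        System.out.println(get("mapred.job.reduces")); // null
        // Constructing it (like constructing HiveConf, which triggers JobConf)
        // runs the static block and makes the default visible:
        new JobConfLike();
        System.out.println(get("mapred.job.reduces")); // 1
    }
}
```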
[jira] [Commented] (HIVE-7204) Use NULL vertex location hint for Prewarm DAG vertices
[ https://issues.apache.org/jira/browse/HIVE-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029928#comment-14029928 ] Gunther Hagleitner commented on HIVE-7204: -- +1 Use NULL vertex location hint for Prewarm DAG vertices -- Key: HIVE-7204 URL: https://issues.apache.org/jira/browse/HIVE-7204 Project: Hive Issue Type: Sub-task Components: Tez Affects Versions: 0.14.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Attachments: HIVE-7204.1.patch The current 0.5.x branch of Tez added extra preconditions which check for parallelism settings to match between the number of containers and the vertex location hints. {code} Caused by: org.apache.hadoop.ipc.RemoteException(java.lang.IllegalArgumentException): Locations array length must match the parallelism set for the vertex at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.tez.dag.api.Vertex.setTaskLocationsHint(Vertex.java:105) at org.apache.tez.dag.app.DAGAppMaster.startPreWarmContainers(DAGAppMaster.java:1004) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029926#comment-14029926 ] Ashutosh Chauhan commented on HIVE-7209: +1 allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
CVE-2014-0228: Apache Hive Authorization vulnerability
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

CVE-2014-0228: Apache Hive Authorization vulnerability

Severity: Moderate

Vendor: The Apache Software Foundation

Versions affected: Apache Hive 0.13.0

Users affected: Users who have enabled SQL standards based authorization mode.

Description: In SQL standards based authorization mode, the URIs used in Hive queries are expected to be authorized on the file system permissions. However, the directory used in import/export statements is not being authorized. This allows a user who knows the directory to which data has been exported to import that data into his table. This is possible if the user HiveServer2 runs as has permissions for that directory and its contents.

Mitigation: Users who use SQL standards based authorization should upgrade to 0.13.1.

Credit: This issue was discovered by Thejas Nair of Hortonworks.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.20 (Darwin)
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJTmiJUAAoJENkN9OKO5uMpHmMQAJvyHJetKGdznknT9491liQu
6M0EXQq0dVXWFc5nOzCu9CvuBZgBDeCkxKHM8M/4373clyoxOVGeehxrj0VB4aY8
BPcRDcwY+m16HF1j8W4xSiSFWRtFwedgY7seez9lHihBS0tJmsZ3xYV3mIzgUKVf
MkwimimgraQ/Z9Hh5pMuC0IEhk2K8gcGMEOZwYR2VeCI8ycpkAE8Ykx7zABL9Cpa
fS5elrGwL1kQ2fCUu+c4UJG8MmNjxWiVohtnmz5VQR7FkJUMirSK4onta7stH7Lx
NhibY9ENPmRMwpR0UbEfNOxIm4qvIZL38qNb+DqYZ5s+idoNifdW5MBp0DTxy8NI
t9diPNnSqoyZ1wsQckta76NodHKUlcxBKEIgdtSFG0qKKc8tcUTCcW8hfUTvrov/
D29w98Ap2FTHX7O6iAxl+G8JGy01n2j3m3QwQeSYqUwcub7HRb2Dneb92V/1VX5C
/z8BEnn1IohEYWSUKDyPNwG41/+oM5BUBGr9uPSA79+kvYeaaL2cVn7Csi3H3U2x
fDrQEvBhiptGjX0aS9WWhoeuCUF+PROTN7izFKDtnXJYhd3KqWFj6ccgP3aybVlk
iGoekwy5Pp44z9FZzMCibX19qi8ZbAU97lujZXvw9Bn2U+NchXbVEKjlDStlhoom
ieaMv2ISHo/5eUqh5kDj
=ZFSB
-----END PGP SIGNATURE-----
[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.
[ https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029927#comment-14029927 ] Xuefu Zhang commented on HIVE-7100: --- [~jhsenjaliya] This just seems to be getting more and more complicated. I'd like to hear others' opinions, but I'm open to: 1. Revert HIVE-6469. 2. Expand the drop table syntax to: DROP TABLE table_name [PURGE] skipTrash is a no-go because it's too implementation specific, but PURGE seems more generic and acceptable, which hides the implementation. Let's wait for others to chime in. Users of hive should be able to specify skipTrash when dropping tables. --- Key: HIVE-7100 URL: https://issues.apache.org/jira/browse/HIVE-7100 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Ravi Prakash Assignee: Jayesh Attachments: HIVE-7100.patch Users of our clusters are often running up against their quota limits because of Hive tables. When they drop tables, they have to then manually delete the files from HDFS using skipTrash. This is cumbersome and unnecessary. We should enable users to skipTrash directly when dropping tables. We should also be able to provide this functionality without polluting SQL syntax. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7201) Fix TestHiveConf#testConfProperties test case
[ https://issues.apache.org/jira/browse/HIVE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7201: --- Status: Open (was: Patch Available) Patch needs to be committed to trunk first. So, please provide patch for trunk. Also, it needs to be named as per convention of https://cwiki.apache.org/confluence/display/Hive/Hive+PreCommit+Patch+Testing for automated testing to kick in. Fix TestHiveConf#testConfProperties test case - Key: HIVE-7201 URL: https://issues.apache.org/jira/browse/HIVE-7201 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Pankit Thapar Priority: Minor Attachments: HIVE-7201-1.patch, HIVE-7201-2.patch, HIVE-7201.patch CHANGE 1: TEST CASE : The intention of TestHiveConf#testConfProperties() is to test the HiveConf properties being set in the priority as expected. Each HiveConf object is initialized as follows: 1) Hadoop configuration properties are applied. 2) ConfVar properties with non-null values are overlayed. 3) hive-site.xml properties are overlayed. ISSUE : The mapreduce related configurations are loaded by JobConf and not Configuration. The current test tries to get the configuration properties like : HADOOPNUMREDUCERS (mapred.job.reduces) from Configuration class. But these mapreduce related properties are loaded by JobConf class from mapred-default.xml. DETAILS : LINE 63 : checkHadoopConf(ConfVars.HADOOPNUMREDUCERS.varname, 1); --fails Because, private void checkHadoopConf(String name, String expectedHadoopVal) { Assert.assertEquals(expectedHadoopVal, new Configuration().get(name)); Second parameter is null, since its the JobConf class and not the Configuration class that initializes mapred-default values. 
} Code that loads mapreduce resources is in ConfigUtil and JobConf makes a call like this (in static block): public class JobConf extends Configuration { private static final Log LOG = LogFactory.getLog(JobConf.class); static{ ConfigUtil.loadResources(); -- loads mapreduce related resources (mapreduce-default.xml) } . } Please note, the test case assertion works fine if HiveConf() constructor is called before this assertion since, HiveConf() triggers JobConf() which basically sets the default values of the properties pertaining to mapreduce. This is why, there won't be any failures if testHiveSitePath() was run before testConfProperties() as that would load mapreduce properties into config properties. FIX: Instead of using a Configuration object, we can use the JobConf object to get the default values used by hadoop/mapreduce. CHANGE 2: In TestHiveConf#testHiveSitePath(), a call to static method getHiveSiteLocation() should be called statically instead of using an object. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7105: -- Release Note: Tez shuffle vectorized ReduceRecordReader Status: Patch Available (was: Open) Enable ReduceRecordProcessor to generate VectorizedRowBatches - Key: HIVE-7105 URL: https://issues.apache.org/jira/browse/HIVE-7105 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Rajesh Balamohan Assignee: Gopal V Fix For: 0.14.0 Attachments: HIVE-7105.1.patch, HIVE-7105.2.patch Currently, ReduceRecordProcessor sends one key,value pair at a time to its operator pipeline. It would be beneficial to send VectorizedRowBatch to downstream operators. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7215) Support predicate pushdown for null checks in ORCFile
[ https://issues.apache.org/jira/browse/HIVE-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029944#comment-14029944 ] Rohini Palaniswamy commented on HIVE-7215: -- I think then this jira can be closed as duplicate of HIVE-4639 Support predicate pushdown for null checks in ORCFile - Key: HIVE-7215 URL: https://issues.apache.org/jira/browse/HIVE-7215 Project: Hive Issue Type: Improvement Reporter: Rohini Palaniswamy Came across this missing feature during discussion of PIG-3760. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-7215) Support predicate pushdown for null checks in ORCFile
[ https://issues.apache.org/jira/browse/HIVE-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J resolved HIVE-7215. -- Resolution: Duplicate Duplicate HIVE-4639. Support predicate pushdown for null checks in ORCFile - Key: HIVE-7215 URL: https://issues.apache.org/jira/browse/HIVE-7215 Project: Hive Issue Type: Improvement Reporter: Rohini Palaniswamy Came across this missing feature during discussion of PIG-3760. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029955#comment-14029955 ] Gunther Hagleitner commented on HIVE-7105: -- Comments on rb. Enable ReduceRecordProcessor to generate VectorizedRowBatches - Key: HIVE-7105 URL: https://issues.apache.org/jira/browse/HIVE-7105 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Rajesh Balamohan Assignee: Gopal V Fix For: 0.14.0 Attachments: HIVE-7105.1.patch, HIVE-7105.2.patch Currently, ReduceRecordProcessor sends one key,value pair at a time to its operator pipeline. It would be beneficial to send VectorizedRowBatch to downstream operators. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7203) Optimize limit 0
[ https://issues.apache.org/jira/browse/HIVE-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029957#comment-14029957 ] Ashutosh Chauhan commented on HIVE-7203: Yup, this is an optimization. No need for user docs. Optimize limit 0 Key: HIVE-7203 URL: https://issues.apache.org/jira/browse/HIVE-7203 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.14.0 Attachments: HIVE-7203.1.patch, HIVE-7203.patch Some tools generate queries with limit 0. Let's optimize that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7206) Duplicate declaration of build-helper-maven-plugin in root pom
[ https://issues.apache.org/jira/browse/HIVE-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029956#comment-14029956 ] Ashutosh Chauhan commented on HIVE-7206: Failures are unrelated. Patch is ready for review. Duplicate declaration of build-helper-maven-plugin in root pom -- Key: HIVE-7206 URL: https://issues.apache.org/jira/browse/HIVE-7206 Project: Hive Issue Type: Task Components: Build Infrastructure Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-7206.1.patch, HIVE-7206.patch Results in following warnings while building: [WARNING] Some problems were encountered while building the effective model for org.apache.hive:hive-it-custom-serde:jar:0.14.0-SNAPSHOT [WARNING] 'build.pluginManagement.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.codehaus.mojo:build-helper-maven-plugin @ org.apache.hive:hive:0.14.0-SNAPSHOT, pom.xml, line 638, column 17 [WARNING] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7183) Size of partColumnGrants should be checked in ObjectStore#removeRole()
[ https://issues.apache.org/jira/browse/HIVE-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029958#comment-14029958 ] Ashutosh Chauhan commented on HIVE-7183: +1 Size of partColumnGrants should be checked in ObjectStore#removeRole() -- Key: HIVE-7183 URL: https://issues.apache.org/jira/browse/HIVE-7183 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7183.patch Here is related code: {code} List<MPartitionColumnPrivilege> partColumnGrants = listPrincipalAllPartitionColumnGrants( mRol.getRoleName(), PrincipalType.ROLE); if (tblColumnGrants.size() > 0) { pm.deletePersistentAll(partColumnGrants); {code} Size of tblColumnGrants is currently checked. Size of partColumnGrants should be checked instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
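The fix is a one-token change: gate the deletion on partColumnGrants rather than tblColumnGrants, so that partition-column grants are still removed when the table-column grant list happens to be empty. The effect can be illustrated with plain lists standing in for the persisted grant objects (a sketch, not the ObjectStore code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GuardDemo {
    // Returns how many partition-column grants were deleted. The guard checks
    // partColumnGrants (corrected); the reported bug checked tblColumnGrants.
    static int removePartColumnGrants(List<String> tblColumnGrants,
                                      List<String> partColumnGrants) {
        int deleted = 0;
        if (partColumnGrants.size() > 0) {
            deleted = partColumnGrants.size();
            partColumnGrants.clear(); // stands in for pm.deletePersistentAll(...)
        }
        return deleted;
    }

    public static void main(String[] args) {
        List<String> tbl = new ArrayList<>(); // no table-column grants
        List<String> part = new ArrayList<>(Arrays.asList("g1", "g2"));
        // With the buggy guard (tbl.size() > 0) this would delete nothing,
        // leaking the two partition-column grants; the corrected guard deletes both.
        System.out.println(removePartColumnGrants(tbl, part)); // 2
    }
}
```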
[jira] [Commented] (HIVE-7206) Duplicate declaration of build-helper-maven-plugin in root pom
[ https://issues.apache.org/jira/browse/HIVE-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029963#comment-14029963 ] Vaibhav Gumashta commented on HIVE-7206: +1 Duplicate declaration of build-helper-maven-plugin in root pom -- Key: HIVE-7206 URL: https://issues.apache.org/jira/browse/HIVE-7206 Project: Hive Issue Type: Task Components: Build Infrastructure Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-7206.1.patch, HIVE-7206.patch Results in following warnings while building: [WARNING] Some problems were encountered while building the effective model for org.apache.hive:hive-it-custom-serde:jar:0.14.0-SNAPSHOT [WARNING] 'build.pluginManagement.plugins.plugin.(groupId:artifactId)' must be unique but found duplicate declaration of plugin org.codehaus.mojo:build-helper-maven-plugin @ org.apache.hive:hive:0.14.0-SNAPSHOT, pom.xml, line 638, column 17 [WARNING] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029962#comment-14029962 ] Ashutosh Chauhan commented on HIVE-7220: Duplicate of HIVE-6401. There is an underlying MR bug here. Empty dir in external table causes issue (root_dir_external_table.q failure) Key: HIVE-7220 URL: https://issues.apache.org/jira/browse/HIVE-7220 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7220.patch While looking at the root_dir_external_table.q failure, which is doing a query on an external table located at root ('/'), I noticed that the latest Hadoop2 CombineFileInputFormat returns splits representing empty directories (like '/Users'), which leads to failure in Hive's CombineFileRecordReader as it tries to open the directory for processing. Tried with an external table in a normal HDFS directory, and it also returns the same error. Looks like a real bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7182) ResultSet is not closed in JDBCStatsPublisher#init()
[ https://issues.apache.org/jira/browse/HIVE-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029967#comment-14029967 ] Ashutosh Chauhan commented on HIVE-7182: +1 ResultSet is not closed in JDBCStatsPublisher#init() Key: HIVE-7182 URL: https://issues.apache.org/jira/browse/HIVE-7182 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7182.1.patch, HIVE-7182.patch {code} ResultSet rs = dbm.getTables(null, null, JDBCStatsUtils.getStatTableName(), null); boolean tblExists = rs.next(); {code} rs is not closed upon return from init() If stmt.executeUpdate() throws exception, stmt.close() would be skipped - the close() call should be placed in finally block. -- This message was sent by Atlassian JIRA (v6.2#6252)
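The resource-handling fix called for above — close the ResultSet, and move the Statement close into a finally block so it runs even when executeUpdate() throws — follows the standard try/finally pattern. A minimal self-contained sketch with a fake statement class (not the Hive or JDBC code):

```java
public class CloseDemo {
    // Hypothetical stand-in for a JDBC Statement whose executeUpdate() fails.
    static class FakeStatement {
        boolean closed;
        void executeUpdate() { throw new RuntimeException("boom"); }
        void close() { closed = true; }
    }

    // Returns whether the update succeeded; close() runs on every path
    // because it sits in the finally block, unlike the reported code where
    // an exception in executeUpdate() skipped stmt.close().
    static boolean runAndAlwaysClose(FakeStatement stmt) {
        try {
            stmt.executeUpdate();
            return true;
        } catch (RuntimeException e) {
            return false;
        } finally {
            stmt.close();
        }
    }

    public static void main(String[] args) {
        FakeStatement stmt = new FakeStatement();
        boolean ok = runAndAlwaysClose(stmt);
        System.out.println(ok + " " + stmt.closed); // false true
    }
}
```

The same shape applies to the ResultSet from dbm.getTables(): read rs.next() inside the try and close rs in the finally.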
[jira] [Updated] (HIVE-7201) Fix TestHiveConf#testConfProperties test case
[ https://issues.apache.org/jira/browse/HIVE-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pankit Thapar updated HIVE-7201: Attachment: HIVE-7201.03.patch Renamed the patch to kick in the autobuild. Rebased the patch to trunk instead of branch-0.13. Fix TestHiveConf#testConfProperties test case - Key: HIVE-7201 URL: https://issues.apache.org/jira/browse/HIVE-7201 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Pankit Thapar Priority: Minor Attachments: HIVE-7201-1.patch, HIVE-7201-2.patch, HIVE-7201.03.patch, HIVE-7201.patch CHANGE 1: TEST CASE : The intention of TestHiveConf#testConfProperties() is to test the HiveConf properties being set in the priority as expected. Each HiveConf object is initialized as follows: 1) Hadoop configuration properties are applied. 2) ConfVar properties with non-null values are overlayed. 3) hive-site.xml properties are overlayed. ISSUE : The mapreduce related configurations are loaded by JobConf and not Configuration. The current test tries to get the configuration properties like : HADOOPNUMREDUCERS (mapred.job.reduces) from Configuration class. But these mapreduce related properties are loaded by JobConf class from mapred-default.xml. DETAILS : LINE 63 : checkHadoopConf(ConfVars.HADOOPNUMREDUCERS.varname, 1); --fails Because, private void checkHadoopConf(String name, String expectedHadoopVal) { Assert.assertEquals(expectedHadoopVal, new Configuration().get(name)); Second parameter is null, since its the JobConf class and not the Configuration class that initializes mapred-default values. } Code that loads mapreduce resources is in ConfigUtil and JobConf makes a call like this (in static block): public class JobConf extends Configuration { private static final Log LOG = LogFactory.getLog(JobConf.class); static{ ConfigUtil.loadResources(); -- loads mapreduce related resources (mapreduce-default.xml) } . 
} Please note, the test case assertion works fine if HiveConf() constructor is called before this assertion since, HiveConf() triggers JobConf() which basically sets the default values of the properties pertaining to mapreduce. This is why, there won't be any failures if testHiveSitePath() was run before testConfProperties() as that would load mapreduce properties into config properties. FIX: Instead of using a Configuration object, we can use the JobConf object to get the default values used by hadoop/mapreduce. CHANGE 2: In TestHiveConf#testHiveSitePath(), a call to static method getHiveSiteLocation() should be called statically instead of using an object. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7211) Throws exception if the name of conf var starts with hive. does not exist in HiveConf
[ https://issues.apache.org/jira/browse/HIVE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030024#comment-14030024 ] Vaibhav Gumashta commented on HIVE-7211: +1 pending tests. Throws exception if the name of conf var starts with hive. does not exist in HiveConf Key: HIVE-7211 URL: https://issues.apache.org/jira/browse/HIVE-7211 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-7211.1.patch.txt, HIVE-7211.2.patch.txt Some typos in configurations are very hard to find. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7195) Improve Metastore performance
[ https://issues.apache.org/jira/browse/HIVE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030047#comment-14030047 ] Chris Drome commented on HIVE-7195: --- We ([~mithun], [~thiruvel], [~selinazh]) have done some work in this area for hive-0.12. Some of the improvements include: 1) Disabling the datanucleus cache to reduce the memory usage in the metastore. 2) Actively closing datanucleus query-related resources to allow the memory to be reclaimed. 3) Optimizations to answer metadata-only queries directly from the metastore without launching MR jobs. 4) Optimizations to direct SQL statements. 5) Schema changes to speed up DROP TABLE statements. 6) Added client and server side parameters to restrict the maximum number of partitions that can be retrieved. We are currently looking into: 1) Reducing the client time required to retrieve HDFS file information. 2) Using light-weight partition objects where possible to reduce the time and memory on client/server. If I've forgotten anything, Mithun, Thiruvel, or Selina can add more information. Improve Metastore performance - Key: HIVE-7195 URL: https://issues.apache.org/jira/browse/HIVE-7195 Project: Hive Issue Type: Improvement Reporter: Brock Noland Priority: Critical Even with direct SQL, which significantly improves MS performance, some operations take a considerable amount of time when there are many partitions on a table. Specifically, I believe the issues are: * When a client gets all partitions we do not send them an iterator, we create a collection of all data and then pass the object over the network in total * Operations which require looking up data on the NN can still be slow since there is no cache of information and it's done in a serial fashion * Perhaps a tangent, but our client timeout is quite dumb. The client will timeout and the server has no idea the client is gone. We should use deadlines, i.e. 
pass the timeout to the server so it can calculate that the client has expired. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5857) Reduce tasks do not work in uber mode in YARN
[ https://issues.apache.org/jira/browse/HIVE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030054#comment-14030054 ] Hive QA commented on HIVE-5857: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12649989/HIVE-5857.3.patch {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_storage_queries org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/448/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/448/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-448/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12649989 Reduce tasks do not work in uber mode in YARN - Key: HIVE-5857 URL: https://issues.apache.org/jira/browse/HIVE-5857 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: Adam Kawa Assignee: Adam Kawa Priority: Critical Labels: plan, uber-jar, uberization, yarn Fix For: 0.13.0 Attachments: HIVE-5857.1.patch.txt, HIVE-5857.2.patch, HIVE-5857.3.patch A Hive query fails when it tries to run a reduce task in uber mode in YARN. The NullPointerException is thrown in the ExecReducer.configure method, because the plan file (reduce.xml) for a reduce task is not found. The Utilities.getBaseWork method is expected to return a BaseWork object, but it returns NULL due to FileNotFoundException. {code} // org.apache.hadoop.hive.ql.exec.Utilities public static BaseWork getBaseWork(Configuration conf, String name) { ... try { ... if (gWork == null) { Path localPath; if (ShimLoader.getHadoopShims().isLocalMode(conf)) { localPath = path; } else { localPath = new Path(name); } InputStream in = new FileInputStream(localPath.toUri().getPath()); BaseWork ret = deserializePlan(in); } return gWork; } catch (FileNotFoundException fnf) { // happens. e.g.: no reduce work. LOG.debug("No plan file found: " + path); return null; } ... } {code} It happens because the ShimLoader.getHadoopShims().isLocalMode(conf) method returns true: immediately before running a reduce task, org.apache.hadoop.mapred.LocalContainerLauncher changes its configuration to local mode (mapreduce.framework.name is changed from yarn to local). On the other hand, map tasks run successfully, because their configuration is not changed and still remains yarn. {code} // org.apache.hadoop.mapred.LocalContainerLauncher private void runSubtask(..) { ... 
conf.set(MRConfig.FRAMEWORK_NAME, MRConfig.LOCAL_FRAMEWORK_NAME); conf.set(MRConfig.MASTER_ADDRESS, "local"); // bypass shuffle ReduceTask reduce = (ReduceTask)task; reduce.setConf(conf); reduce.run(conf, umbilical); } {code} A super quick fix could be just an additional if-branch, where we check if we run a reduce task in uber mode, and then look for a plan file in a different location. *Java stacktrace* {code} 2013-11-20 00:50:56,862 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: hdfs://namenode.c.lon.spotify.net:54310/var/tmp/kawaa/hive_2013-11-20_00-50-43_888_3938384086824086680-2/-mr-10003/e3caacf6-15d6-4987-b186-d2906791b5b0/reduce.xml 2013-11-20 00:50:56,862 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local (uberized) 'child' : java.lang.RuntimeException: Error in configuring object at
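The misfiring branch reduces to a few lines: plan-path resolution keys off isLocalMode(conf), and the uberized launcher flips the framework name to local just before the reduce runs, so the HDFS URI ends up being handed to FileInputStream. A hypothetical stand-in for that decision, not the Hive code:

```java
public class UberDemo {
    // Simplified mirror of the branch in Utilities.getBaseWork(): in "local"
    // mode the full (HDFS) path is used; otherwise the localized file name.
    static String resolvePlanPath(String frameworkName, String hdfsPath, String localizedName) {
        boolean localMode = "local".equals(frameworkName);
        return localMode ? hdfsPath : localizedName;
    }

    public static void main(String[] args) {
        // Map task: framework is still "yarn", so the localized reduce.xml is used.
        System.out.println(resolvePlanPath("yarn", "hdfs://nn:54310/tmp/plan/reduce.xml", "reduce.xml"));
        // Uberized reduce: the launcher has set the framework to "local", so the
        // HDFS URI is chosen and the local FileInputStream open later fails.
        System.out.println(resolvePlanPath("local", "hdfs://nn:54310/tmp/plan/reduce.xml", "reduce.xml"));
    }
}
```

The if-branch fix suggested in the report amounts to adding an uber-mode case to this decision so the reduce task looks for its plan in the correct location.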
[jira] [Commented] (HIVE-7182) ResultSet is not closed in JDBCStatsPublisher#init()
[ https://issues.apache.org/jira/browse/HIVE-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030061#comment-14030061 ] Hive QA commented on HIVE-7182: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12649710/HIVE-7182.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/449/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/449/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-449/ Messages: {noformat} This message was trimmed, see log for full details As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:68:4: Decision can match input such as LPAREN KW_CASE KW_ARRAY using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:68:4: Decision can match input such as LPAREN KW_CASE TinyintLiteral using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:68:4: Decision can match input such as LPAREN KW_CASE KW_STRUCT using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:68:4: Decision can match input such as LPAREN KW_CASE SmallintLiteral using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:115:5: Decision can match input such as KW_CLUSTER KW_BY LPAREN using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:127:5: Decision can match input such as KW_PARTITION KW_BY LPAREN using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): 
IdentifiersParser.g:138:5: Decision can match input such as KW_DISTRIBUTE KW_BY LPAREN using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:149:5: Decision can match input such as KW_SORT KW_BY LPAREN using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:166:7: Decision can match input such as STAR using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:179:5: Decision can match input such as KW_STRUCT using multiple alternatives: 4, 6 As a result, alternative(s) 6 were disabled for that input warning(200): IdentifiersParser.g:179:5: Decision can match input such as KW_UNIONTYPE using multiple alternatives: 5, 6 As a result, alternative(s) 6 were disabled for that input warning(200): IdentifiersParser.g:179:5: Decision can match input such as KW_ARRAY using multiple alternatives: 2, 6 As a result, alternative(s) 6 were disabled for that input warning(200): IdentifiersParser.g:261:5: Decision can match input such as KW_DATE StringLiteral using multiple alternatives: 2, 3 As a result, alternative(s) 3 were disabled for that input warning(200): IdentifiersParser.g:261:5: Decision can match input such as KW_FALSE using multiple alternatives: 3, 8 As a result, alternative(s) 8 were disabled for that input warning(200): IdentifiersParser.g:261:5: Decision can match input such as KW_TRUE using multiple alternatives: 3, 8 As a result, alternative(s) 8 were disabled for that input warning(200): IdentifiersParser.g:261:5: Decision can match input such as KW_NULL using multiple alternatives: 1, 8 As a result, alternative(s) 8 were disabled for that input warning(200): IdentifiersParser.g:393:5: Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_INSERT KW_OVERWRITE using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for 
that input warning(200): IdentifiersParser.g:393:5: Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_DISTRIBUTE KW_BY using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:393:5: Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_MAP LPAREN using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:393:5: Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_INSERT KW_INTO using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:393:5: Decision can match input such as {KW_LIKE, KW_REGEXP, KW_RLIKE} KW_LATERAL KW_VIEW using multiple alternatives: 2, 9 As a result, alternative(s) 9 were disabled for that input warning(200):
[jira] [Updated] (HIVE-7208) move SearchArgument interface into serde package
[ https://issues.apache.org/jira/browse/HIVE-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-7208: --- Attachment: HIVE-7208.01.patch patch that retains package name. Builds for me... will look if it fails again move SearchArgument interface into serde package Key: HIVE-7208 URL: https://issues.apache.org/jira/browse/HIVE-7208 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-7208.01.patch, HIVE-7208.patch For usage in alternative input formats/serdes, it might be useful to move SearchArgument class to a place that is not in ql (because it's hard to depend on ql). -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 22033: HIVE-7094: Separate static and dynamic partitioning implementations from FileRecordWriterContainer.
On June 12, 2014, 9:19 p.m., Carl Steinbach wrote: hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicFileRecordWriterContainer.java, line 58 https://reviews.apache.org/r/22033/diff/1/?file=598889#file598889line58 Is this the result of automated formatting or something that you're doing by hand? I formatted this by hand because it is more readable to me this way. On June 12, 2014, 9:19 p.m., Carl Steinbach wrote: hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticFileRecordWriterContainer.java, line 34 https://reviews.apache.org/r/22033/diff/1/?file=598892#file598892line34 Please make it clear in the comment that static refers to static partitions. Also, does it make sense to change the name to StaticPartitionFileRecordWriterContainer? Extremely verbose but it gets the point across and avoids confusion. I have updated the comments. That is a good idea. I have renamed the two classes to StaticPartitionFileRecordWriterContainer and DynamicPartitionFileRecordWriterContainer. - David --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22033/#review45537 --- On May 29, 2014, 7:33 p.m., David Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22033/ --- (Updated May 29, 2014, 7:33 p.m.) Review request for hive. Bugs: HIVE-7094 https://issues.apache.org/jira/browse/HIVE-7094 Repository: hive-git Description --- HIVE-7093: Separate static and dynamic partitioning implementations from FileRecordWriterContainer. 
Diffs - hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicFileRecordWriterContainer.java PRE-CREATION hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java e9ca263abade20b7423ad98695807a60ab957ead hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java b55a05528d5a4eed114b5628697cf5a60f6c6cbc hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticFileRecordWriterContainer.java PRE-CREATION Diff: https://reviews.apache.org/r/22033/diff/ Testing --- Thanks, David Chen
[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline
[ https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030092#comment-14030092 ] Lefty Leverenz commented on HIVE-7224: -- [~vaibhavgumashta], thanks for documenting --incremental in the wiki. We can add the default with version information after this jira commits. Do you have time to deal with other Beeline doc issues that I raised in a comment on HIVE-6173? * [HiveServer2 Clients: Beeline Command Options (--incremental) | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions] * [HIVE-6173 comment: Beeline doc issues | https://issues.apache.org/jira/browse/HIVE-6173?focusedCommentId=13888556&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13888556] Set incremental printing to true by default in Beeline -- Key: HIVE-7224 URL: https://issues.apache.org/jira/browse/HIVE-7224 Project: Hive Issue Type: Bug Components: Clients, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.0 Attachments: HIVE-7224.1.patch See HIVE-7221. By default beeline tries to buffer the entire output relation before printing it on stdout. This can cause OOM when the output relation is large. However, beeline has the option of incremental prints. We should make that the default. -- This message was sent by Atlassian JIRA (v6.2#6252)
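The memory behavior described in the report can be sketched in miniature. This is purely illustrative, not Beeline's actual code: the buffered path materializes every row before "printing", while the incremental path handles one row at a time, so only one row is ever resident.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch (not Beeline's implementation) of why buffered
// output can OOM on large results while incremental output cannot: the
// buffered path holds every row at once, the incremental path holds one.
class OutputModes {
    // Peak number of rows held in memory while "printing" buffered-style.
    static int bufferedPeak(Iterator<String> rows) {
        List<String> all = new ArrayList<>();
        rows.forEachRemaining(all::add); // entire relation buffered first
        return all.size();
    }

    // Peak number of rows held in memory while "printing" incrementally.
    static int incrementalPeak(Iterator<String> rows) {
        int peak = 0;
        while (rows.hasNext()) {
            rows.next();                 // print and immediately discard
            peak = Math.max(peak, 1);
        }
        return peak;
    }
}
```

With a three-row result, the buffered mode's peak grows with the relation size while the incremental mode's peak stays constant, which is the whole argument for flipping the default.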
[jira] [Updated] (HIVE-7212) Use resource re-localization instead of restarting sessions in Tez
[ https://issues.apache.org/jira/browse/HIVE-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7212: - Attachment: HIVE-7212.2.patch .2 lets one user re-use localized files across applications. Use resource re-localization instead of restarting sessions in Tez -- Key: HIVE-7212 URL: https://issues.apache.org/jira/browse/HIVE-7212 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7212.1.patch, HIVE-7212.2.patch scriptfile1.q is failing on Tez because of a recent breakage in localization. On top of that we're currently restarting sessions if the resources have changed. (add file/add jar/etc). Instead of doing this we should just have tez relocalize these new resources. This way no session/AM restart is required. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7212) Use resource re-localization instead of restarting sessions in Tez
[ https://issues.apache.org/jira/browse/HIVE-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030105#comment-14030105 ] Gunther Hagleitner commented on HIVE-7212: -- Yes, [~sershe] - you're right. I'd like to go with this one (the other needs some work/doesn't work anymore.) If you feel there's stuff missing from that one - can you point it out? I'll port it over. Use resource re-localization instead of restarting sessions in Tez -- Key: HIVE-7212 URL: https://issues.apache.org/jira/browse/HIVE-7212 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7212.1.patch, HIVE-7212.2.patch scriptfile1.q is failing on Tez because of a recent breakage in localization. On top of that we're currently restarting sessions if the resources have changed. (add file/add jar/etc). Instead of doing this we should just have tez relocalize these new resources. This way no session/AM restart is required. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7212) Use resource re-localization instead of restarting sessions in Tez
[ https://issues.apache.org/jira/browse/HIVE-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7212: - Status: Open (was: Patch Available) Use resource re-localization instead of restarting sessions in Tez -- Key: HIVE-7212 URL: https://issues.apache.org/jira/browse/HIVE-7212 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7212.1.patch, HIVE-7212.2.patch scriptfile1.q is failing on Tez because of a recent breakage in localization. On top of that we're currently restarting sessions if the resources have changed. (add file/add jar/etc). Instead of doing this we should just have tez relocalize these new resources. This way no session/AM restart is required. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7212) Use resource re-localization instead of restarting sessions in Tez
[ https://issues.apache.org/jira/browse/HIVE-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7212: - Attachment: HIVE-7212.3.patch .3 addresses review comments. Use resource re-localization instead of restarting sessions in Tez -- Key: HIVE-7212 URL: https://issues.apache.org/jira/browse/HIVE-7212 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7212.1.patch, HIVE-7212.2.patch, HIVE-7212.3.patch scriptfile1.q is failing on Tez because of a recent breakage in localization. On top of that we're currently restarting sessions if the resources have changed. (add file/add jar/etc). Instead of doing this we should just have tez relocalize these new resources. This way no session/AM restart is required. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6824) Hive HBase query fails on Tez due to missing jars - part 2
[ https://issues.apache.org/jira/browse/HIVE-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6824: - Resolution: Duplicate Status: Resolved (was: Patch Available) Hive HBase query fails on Tez due to missing jars - part 2 -- Key: HIVE-6824 URL: https://issues.apache.org/jira/browse/HIVE-6824 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-6824.patch Follow-up from HIVE-6739. We cannot wait for Tez 0.4 (or even be sure that it will have TEZ-1004 and TEZ-1005), so I will split the patch into two. Original jira will have the straightforward (but less efficient) fix. This jira will use new relocalize APIs. -Depending on relative timing of Tez 0.4 release and Hive 0.13 release, this will go into 0.13 or 0.14- blocked on Tez 0.5 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7212) Use resource re-localization instead of restarting sessions in Tez
[ https://issues.apache.org/jira/browse/HIVE-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7212: - Status: Patch Available (was: Open) Use resource re-localization instead of restarting sessions in Tez -- Key: HIVE-7212 URL: https://issues.apache.org/jira/browse/HIVE-7212 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7212.1.patch, HIVE-7212.2.patch, HIVE-7212.3.patch scriptfile1.q is failing on Tez because of a recent breakage in localization. On top of that we're currently restarting sessions if the resources have changed. (add file/add jar/etc). Instead of doing this we should just have tez relocalize these new resources. This way no session/AM restart is required. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7195) Improve Metastore performance
[ https://issues.apache.org/jira/browse/HIVE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030119#comment-14030119 ] Sergey Shelukhin commented on HIVE-7195: Are there patches for these in JIRA? I remember there's a jira for cascading drop. Improve Metastore performance - Key: HIVE-7195 URL: https://issues.apache.org/jira/browse/HIVE-7195 Project: Hive Issue Type: Improvement Reporter: Brock Noland Priority: Critical Even with direct SQL, which significantly improves MS performance, some operations take a considerable amount of time when there are many partitions on a table. Specifically, I believe the issues are: * When a client gets all partitions, we do not send them an iterator; we create a collection of all the data and then pass the whole object over the network * Operations which require looking up data on the NN can still be slow since there is no cache of information and it's done in a serial fashion * Perhaps a tangent, but our client timeout is quite dumb. The client will time out and the server has no idea the client is gone. We should use deadlines, i.e. pass the timeout to the server so it can calculate that the client has expired. -- This message was sent by Atlassian JIRA (v6.2#6252)
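The deadline idea in the last bullet of the description can be sketched as follows. The class and method names are hypothetical, not metastore API: the client computes an absolute deadline from its timeout, ships it with the request, and the server consults it before each expensive phase instead of working on behalf of a client that has already given up.

```java
// Hypothetical sketch of the "pass the timeout to the server" idea: the
// client turns its timeout into an absolute deadline and sends it along,
// so the server can abandon work once the client must have timed out.
class ClientDeadline {
    final long expiresAtMillis;

    // Client side: capture "now" plus the configured client timeout.
    ClientDeadline(long nowMillis, long timeoutMillis) {
        this.expiresAtMillis = nowMillis + timeoutMillis;
    }

    // Server side: check before each expensive phase (e.g. fetching the
    // next batch of partitions) and bail out early when true.
    boolean expired(long nowMillis) {
        return nowMillis >= expiresAtMillis;
    }
}
```

In a real RPC the deadline would travel as a request field and clock skew would need handling; this only models the core check.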
Re: Review Request 22033: HIVE-7094: Separate static and dynamic partitioning implementations from FileRecordWriterContainer.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22033/ --- (Updated June 13, 2014, 1:14 a.m.) Review request for hive. Changes --- Address review comments. Bugs: HIVE-7094 https://issues.apache.org/jira/browse/HIVE-7094 Repository: hive-git Description (updated) --- HIVE-7094: Separate static and dynamic partitioning implementations from FileRecordWriterContainer. Diffs (updated) - hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicPartitionFileRecordWriterContainer.java PRE-CREATION hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java e9ca263abade20b7423ad98695807a60ab957ead hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java b55a05528d5a4eed114b5628697cf5a60f6c6cbc hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticPartitionFileRecordWriterContainer.java PRE-CREATION Diff: https://reviews.apache.org/r/22033/diff/ Testing --- Thanks, David Chen
[jira] [Commented] (HIVE-7094) Separate out static/dynamic partitioning code in FileRecordWriterContainer
[ https://issues.apache.org/jira/browse/HIVE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030124#comment-14030124 ] David Chen commented on HIVE-7094: -- Thanks for taking a look, [~cwsteinbach]. I have updated the RB with a new revision. Separate out static/dynamic partitioning code in FileRecordWriterContainer -- Key: HIVE-7094 URL: https://issues.apache.org/jira/browse/HIVE-7094 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7094.1.patch There are two major places in FileRecordWriterContainer that have the {{if (dynamicPartitioning)}} condition: the constructor and write(). This is the approach that I am taking: # Move the DP and SP code into two subclasses: DynamicFileRecordWriterContainer and StaticFileRecordWriterContainer. # Make FileRecordWriterContainer an abstract class that contains the common code for both implementations. For write(), FileRecordWriterContainer will call an abstract method that will provide the local RecordWriter, ObjectInspector, SerDe, and OutputJobInfo. -- This message was sent by Atlassian JIRA (v6.2#6252)
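The refactoring plan quoted above is the classic template-method pattern. A minimal sketch of its shape, with invented class and method names rather than the actual HCatalog API: the abstract container owns the shared write() flow and defers only the partition-specific writer lookup to its two subclasses.

```java
// Illustrative template-method sketch of the refactoring plan: shared
// logic lives in the abstract base; each subclass supplies only the
// partition-specific piece. Names are invented for illustration.
abstract class WriterContainer {
    // Template method: the common write() flow for both implementations.
    final String write(String record) {
        return chooseWriter(record) + ":" + record;
    }

    // Subclasses provide the partition-specific writer lookup.
    abstract String chooseWriter(String record);
}

class StaticPartitionWriterContainer extends WriterContainer {
    @Override
    String chooseWriter(String record) {
        return "static"; // single writer, fixed at job setup
    }
}

class DynamicPartitionWriterContainer extends WriterContainer {
    @Override
    String chooseWriter(String record) {
        // one writer per partition value carried in the record
        return "dyn-" + record.length();
    }
}
```

The payoff is exactly what the comment describes: the {{if (dynamicPartitioning)}} branches disappear, because the branch is resolved once by choosing which subclass to construct.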
[jira] [Commented] (HIVE-7195) Improve Metastore performance
[ https://issues.apache.org/jira/browse/HIVE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030131#comment-14030131 ] Mithun Radhakrishnan commented on HIVE-7195: [~sershe]: I'm sorry, I've not found the time to port my patch to 13 and raise a JIRA. My work was primarily in the PartitionPruner code. It was to ensure that {{listPartitions(db, table, -1)}} isn't called (during plan optimization), if the call is a metadata-only query. I can post the 12-patch in a JIRA, whatever that's worth. Incidentally, I've raised HIVE-7223 to discuss the idea of using {{PartitionSpecs}}. [~alangates] suggested that we explore if a PartitionSpec abstract could also represent lighter Partition-groups that share commonality (StorageDescs, etc.). Still thinking that through. (If only Thrift supported polymorphism. :]) Improve Metastore performance - Key: HIVE-7195 URL: https://issues.apache.org/jira/browse/HIVE-7195 Project: Hive Issue Type: Improvement Reporter: Brock Noland Priority: Critical Even with direct SQL, which significantly improves MS performance, some operations take a considerable amount of time, when there are many partitions on table. Specifically I believe the issue: * When a client gets all partitions we do not send them an iterator, we create a collection of all data and then pass the object over the network in total * Operations which require looking up data on the NN can still be slow since there is no cache of information and it's done in a serial fashion * Perhaps a tangent, but our client timeout is quite dumb. The client will timeout and the server has no idea the client is gone. We should use deadlines, i.e. pass the timeout to the server so it can calculate that the client has expired. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7110) TestHCatPartitionPublish test failure: No FileSystem for scheme: pfile
[ https://issues.apache.org/jira/browse/HIVE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030132#comment-14030132 ] David Chen commented on HIVE-7110: -- It looks like this issue is not OS X-specific after all. I am also hitting it on RHEL 6.4. Interestingly, when I run the test by itself, it passes. However, it fails when I run it with all the other tests. It is possible that one of the previous tests is writing a new hive-site file somewhere in the classpath that does not set this property, and this file is getting picked up by this test instead of the one that the test is supposed to be using. I will dig into this some more. TestHCatPartitionPublish test failure: No FileSystem for scheme: pfile - Key: HIVE-7110 URL: https://issues.apache.org/jira/browse/HIVE-7110 Project: Hive Issue Type: Bug Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7110.1.patch, HIVE-7110.2.patch, HIVE-7110.3.patch, HIVE-7110.4.patch I got the following TestHCatPartitionPublish test failure when running all unit tests against Hadoop 1. This also appears when testing against Hadoop 2.
{code}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.06 sec FAILURE! - in org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish
testPartitionPublish(org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish) Time elapsed: 1.361 sec ERROR!
org.apache.hive.hcatalog.common.HCatException: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. Cause : java.io.IOException: No FileSystem for scheme: pfile
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1443)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:212)
    at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
    at org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.runMRCreateFail(TestHCatPartitionPublish.java:191)
    at org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:155)
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
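Hadoop resolves a path's scheme through the fs.&lt;scheme&gt;.impl property of whichever configuration is in effect, which is consistent with the stray-hive-site theory in the comment. A toy model of that lookup (not Hadoop code) shows how a configuration missing the "pfile" mapping produces exactly the failure shape in the trace:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (not Hadoop's FileSystem code) of scheme-to-implementation
// resolution: if the effective configuration lacks fs.<scheme>.impl --
// e.g. because a different hive-site.xml won on the classpath -- the
// lookup fails with the same "No FileSystem for scheme" shape.
class SchemeTable {
    private final Map<String, String> conf = new HashMap<>();

    void set(String key, String impl) {
        conf.put(key, impl);
    }

    String resolve(String scheme) {
        String impl = conf.get("fs." + scheme + ".impl");
        if (impl == null) {
            throw new IllegalStateException("No FileSystem for scheme: " + scheme);
        }
        return impl;
    }
}
```

Under this model, "runs alone: passes; runs with the suite: fails" is what you would expect when another test swaps in a configuration without the mapping.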
[jira] [Updated] (HIVE-7211) Throws exception if the name of conf var starts with hive. does not exists in HiveConf
[ https://issues.apache.org/jira/browse/HIVE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7211: Attachment: HIVE-7211.3.patch.txt Throws exception if the name of conf var starts with hive. does not exists in HiveConf Key: HIVE-7211 URL: https://issues.apache.org/jira/browse/HIVE-7211 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-7211.1.patch.txt, HIVE-7211.2.patch.txt, HIVE-7211.3.patch.txt Some typos in configurations are very hard to find. -- This message was sent by Atlassian JIRA (v6.2#6252)
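The check HIVE-7211 proposes can be sketched minimally. The behavior is assumed from the summary, and the class name is invented: any property in the reserved "hive." namespace that matches no known HiveConf variable is rejected, so a typo fails fast instead of being silently ignored, while non-Hive properties pass through untouched.

```java
import java.util.Set;

// Minimal sketch (assumed behavior, invented names) of the HIVE-7211
// check: a "hive."-prefixed property that matches no known HiveConf
// variable throws, so typos surface immediately.
class HiveConfNameCheck {
    private final Set<String> knownVars;

    HiveConfNameCheck(Set<String> knownVars) {
        this.knownVars = knownVars;
    }

    // Returns true when the name is acceptable; throws on a hive.* typo.
    boolean validate(String name) {
        if (name.startsWith("hive.") && !knownVars.contains(name)) {
            throw new IllegalArgumentException(
                "hive configuration " + name + " does not exist");
        }
        return true; // known hive.* var, or a non-hive property
    }
}
```

The non-hive passthrough matters because users legitimately set Hadoop and job properties (mapreduce.*, tez.*) through the same interface.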
[jira] [Reopened] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table
[ https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis reopened HIVE-3392: - Assignee: Navis (was: Ajesh Kumar) Hive unnecessarily validates table SerDes when dropping a table --- Key: HIVE-3392 URL: https://issues.apache.org/jira/browse/HIVE-3392 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Jonathan Natkins Assignee: Navis Labels: patch Attachments: HIVE-3392.2.patch.txt, HIVE-3392.Test Case - with_trunk_version.txt
natty@hadoop1:~$ hive
hive> add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
hive> create table test (a int) row format serde 'hive.serde.JSONSerDe';
OK
Time taken: 2.399 seconds
natty@hadoop1:~$ hive
hive> drop table test;
FAILED: Hive Internal Error: java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe hive.serde.JSONSerDe does not exist))
java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe hive.serde.JSONSerDe does not exist)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
    at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943)
    at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700)
    at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:210)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:211)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260)
    ... 20 more
hive> add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
hive> drop table test;
OK
Time taken: 0.658 seconds
hive>
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table
[ https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3392: Status: Patch Available (was: Reopened) Hive unnecessarily validates table SerDes when dropping a table --- Key: HIVE-3392 URL: https://issues.apache.org/jira/browse/HIVE-3392 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Jonathan Natkins Assignee: Navis Labels: patch Attachments: HIVE-3392.2.patch.txt, HIVE-3392.3.patch.txt, HIVE-3392.Test Case - with_trunk_version.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table
[ https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3392: Attachment: HIVE-3392.3.patch.txt Hive unnecessarily validates table SerDes when dropping a table --- Key: HIVE-3392 URL: https://issues.apache.org/jira/browse/HIVE-3392 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Jonathan Natkins Assignee: Navis Labels: patch Attachments: HIVE-3392.2.patch.txt, HIVE-3392.3.patch.txt, HIVE-3392.Test Case - with_trunk_version.txt -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030178#comment-14030178 ] Navis commented on HIVE-7220: - Can we just remove this test? Who makes an external table on the root directory? Empty dir in external table causes issue (root_dir_external_table.q failure) Key: HIVE-7220 URL: https://issues.apache.org/jira/browse/HIVE-7220 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7220.patch While looking at the root_dir_external_table.q failure, which is doing a query on an external table located at root ('/'), I noticed that the latest Hadoop2 CombineFileInputFormat returns splits representing empty directories (like '/Users'), which leads to failure in Hive's CombineFileRecordReader as it tries to open the directory for processing. Tried with an external table in a normal HDFS directory, and it also returns the same error. Looks like a real bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
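One plausible shape for a fix (an assumption, not necessarily what the attached HIVE-7220.patch does) is to filter directory entries out before they become splits, so the record reader never tries to open a directory such as '/Users':

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Assumed sketch, not the attached HIVE-7220 patch: drop directory
// entries before turning listed paths into input splits, so a directory
// path never reaches the record reader.
class SplitCandidates {
    static List<String> filesOnly(List<String> paths,
                                  Predicate<String> isDirectory) {
        List<String> out = new ArrayList<>();
        for (String path : paths) {
            if (!isDirectory.test(path)) {
                out.add(path); // only real files become splits
            }
        }
        return out;
    }
}
```

In the real input format the isDirectory check would come from the FileSystem's status for each path; the predicate here just stands in for that.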
[jira] [Commented] (HIVE-7211) Throws exception if the name of conf var starts with hive. does not exists in HiveConf
[ https://issues.apache.org/jira/browse/HIVE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030193#comment-14030193 ] Hive QA commented on HIVE-7211: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12649986/HIVE-7211.2.patch.txt {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 5610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_compression org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compression org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_overridden_confs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_explode org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats2 org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/450/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/450/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-450/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase 
Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12649986 Throws exception if the name of conf var starts with hive. does not exists in HiveConf Key: HIVE-7211 URL: https://issues.apache.org/jira/browse/HIVE-7211 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-7211.1.patch.txt, HIVE-7211.2.patch.txt, HIVE-7211.3.patch.txt Some typos in configurations are very hard to find. -- This message was sent by Atlassian JIRA (v6.2#6252)