[jira] [Assigned] (HIVE-16343) LLAP: Publish YARN's ProcFs based memory usage to metrics for monitoring

2017-03-30 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-16343:



> LLAP: Publish YARN's ProcFs based memory usage to metrics for monitoring
> 
>
> Key: HIVE-16343
> URL: https://issues.apache.org/jira/browse/HIVE-16343
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Publish MemInfo from ProcfsBasedProcessTree to llap metrics. This will be 
> useful for monitoring and also for setting up triggers via JMC. 
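
A minimal sketch of the idea, assuming Hadoop's ProcfsBasedProcessTree API 
(updateProcessTree, getRssMemorySize, getVirtualMemorySize) and a Linux /proc 
filesystem; this is not the actual LLAP patch, and the metrics publication step 
is only indicated in comments:
{code}
import java.lang.management.ManagementFactory;
import org.apache.hadoop.yarn.util.ProcfsBasedProcessTree;

public class LlapProcMemoryProbe {
  public static void main(String[] args) {
    // Pid of this JVM; RuntimeMXBean names look like "pid@host" on HotSpot.
    String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
    ProcfsBasedProcessTree tree = new ProcfsBasedProcessTree(pid);
    tree.updateProcessTree();                      // re-read /proc for the tree
    long rssBytes  = tree.getRssMemorySize();      // resident set size
    long vmemBytes = tree.getVirtualMemorySize();  // virtual memory size
    System.out.println("rss=" + rssBytes + " vmem=" + vmemBytes);
    // The JIRA's idea: expose these values as gauges in the LLAP daemon
    // metrics so monitoring systems (and trigger setups) can read them.
  }
}
{code}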



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-8554) Hive Server 2 should support multiple authentication types at the same time

2017-03-30 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HIVE-8554.
---
Resolution: Duplicate

> Hive Server 2 should support multiple authentication types at the same time
> ---
>
> Key: HIVE-8554
> URL: https://issues.apache.org/jira/browse/HIVE-8554
> Project: Hive
>  Issue Type: Bug
>Reporter: Joey Echeverria
>
> It's very common for clusters to use LDAP/Active Directory as an identity 
> provider while using Kerberos authentication. It would be 
> useful if users could seamlessly switch between using LDAP username/password 
> authentication and Kerberos authentication without having to run multiple 
> Hive Server 2 instances.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15396) Basic Stats are not collected for managed tables with LOCATION specified

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15396:

Attachment: HIVE-15396.5.patch

> Basic Stats are not collected for managed tables with LOCATION specified
> -
>
> Key: HIVE-15396
> URL: https://issues.apache.org/jira/browse/HIVE-15396
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15396.1.patch, HIVE-15396.2.patch, 
> HIVE-15396.3.patch, HIVE-15396.4.patch, HIVE-15396.5.patch
>
>
> Basic stats are not collected when a managed table is created with a 
> specified {{LOCATION}} clause.
> {code}
> 0: jdbc:hive2://localhost:1> create table hdfs_1 (col int);
> 0: jdbc:hive2://localhost:1> describe formatted hdfs_1;
> +-------------------------------+----------------------------------------------------+----------------------------+
> |           col_name            |                     data_type                      |          comment           |
> +-------------------------------+----------------------------------------------------+----------------------------+
> | # col_name                    | data_type                                          | comment                    |
> |                               | NULL                                               | NULL                       |
> | col                           | int                                                |                            |
> |                               | NULL                                               | NULL                       |
> | # Detailed Table Information  | NULL                                               | NULL                       |
> | Database:                     | default                                            | NULL                       |
> | Owner:                        | anonymous                                          | NULL                       |
> | CreateTime:                   | Wed Mar 22 18:09:19 PDT 2017                       | NULL                       |
> | LastAccessTime:               | UNKNOWN                                            | NULL                       |
> | Retention:                    | 0                                                  | NULL                       |
> | Location:                     | file:/warehouse/hdfs_1                             | NULL                       |
> | Table Type:                   | MANAGED_TABLE                                      | NULL                       |
> | Table Parameters:             | NULL                                               | NULL                       |
> |                               | COLUMN_STATS_ACCURATE                              | {\"BASIC_STATS\":\"true\"} |
> |                               | numFiles                                           | 0                          |
> |                               | numRows                                            | 0                          |
> |                               | rawDataSize                                        | 0                          |
> |                               | totalSize                                          | 0                          |
> |                               | transient_lastDdlTime                              | 1490231359                 |
> |                               | NULL                                               | NULL                       |
> | # Storage Information         | NULL                                               | NULL                       |
> | SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                       |
> | InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                       |
> | OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL               |
> | Compressed:                   | No                                                 | NULL                       |
> | Num Buckets:                  | -1                                                 | NULL                       |
> | Bucket Columns:               | []                                                 | NULL                       |
> | Sort Columns:                 | []                                                 | NULL                       |
> | Storage Desc Params:          | NULL                                               | NULL                       |
> |                               | serialization.format

[jira] [Commented] (HIVE-16249) With column stats, mergejoin.q throws NPE

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950322#comment-15950322
 ] 

Hive QA commented on HIVE-16249:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861335/HIVE-16249.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10541 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=141)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection (batchId=232)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValid (batchId=232)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValidNeg (batchId=232)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeProxyAuth 
(batchId=232)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth 
(batchId=232)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth (batchId=232)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=232)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4479/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4479/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4479/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861335 - PreCommit-HIVE-Build

> With column stats, mergejoin.q throws NPE
> -
>
> Key: HIVE-16249
> URL: https://issues.apache.org/jira/browse/HIVE-16249
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16249.01.patch, HIVE-16249.02.patch
>
>
> stack trace:
> {code}
> 2017-03-17T16:00:26,356 ERROR [3d512d4d-72b5-48fc-92cb-0c72f7c876e5 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.NullPointerException
> at 
> org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:719)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:132)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1866)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1739)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1216)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16299:
---
Attachment: HIVE-16299.04.patch

Update to the patch: removed trailing space.

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16299.01.patch, HIVE-16299.02.patch, 
> HIVE-16299.03.patch, HIVE-16299.04.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1
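
A hypothetical sketch of the ordering check the title asks for (this is not the 
HiveMetaStoreChecker code; the class and method names are illustrative): each 
directory level under the table path must carry the expected partition key, 
which rejects duplicated or out-of-order keys like a=1/a=2/a=3/b=4/c=5 or 
c=3/b=2/a=1 from the examples above.
{code}
import java.util.Arrays;
import java.util.List;

public class PartitionPathOrderCheck {
  // partCols: declared partition columns in order, e.g. [a, b, c]
  // dirNames: directory levels under the table path, e.g. ["c=3", "b=2", "a=1"]
  static boolean matchesDeclaredOrder(List<String> partCols, List<String> dirNames) {
    if (dirNames.size() != partCols.size()) {
      return false;  // extra or missing levels, e.g. a=1/a=2/a=3/b=4/c=5
    }
    for (int i = 0; i < dirNames.size(); i++) {
      String[] kv = dirNames.get(i).split("=", 2);
      if (kv.length != 2 || !kv[0].equals(partCols.get(i))) {
        return false;  // wrong key at this level, e.g. c=3 where a was expected
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> cols = Arrays.asList("a", "b", "c");
    System.out.println(matchesDeclaredOrder(cols, Arrays.asList("a=1", "b=2", "c=3"))); // true
    System.out.println(matchesDeclaredOrder(cols, Arrays.asList("c=3", "b=2", "a=1"))); // false
  }
}
{code}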



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16299:
---
Attachment: HIVE-16299.03.patch

Fixed an issue with skip mode: msck doesn't need to list any further if the 
partition directory is invalid, even in skip mode.

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16299.01.patch, HIVE-16299.02.patch, 
> HIVE-16299.03.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16299:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Vihang!

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16299.01.patch, HIVE-16299.02.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16061) When hive.async.log.enabled is set to true, some output is not printed to the beeline console

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950283#comment-15950283
 ] 

Hive QA commented on HIVE-16061:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861332/HIVE-16061.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=141)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4478/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4478/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4478/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861332 - PreCommit-HIVE-Build

> When hive.async.log.enabled is set to true, some output is not printed to the 
> beeline console
> -
>
> Key: HIVE-16061
> URL: https://issues.apache.org/jira/browse/HIVE-16061
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16061.1.patch, HIVE-16061.2.patch, 
> HIVE-16061.3.patch, HIVE-16061.4.patch
>
>
> Run a HiveServer2 instance: "hive --service hiveserver2".
> Then, from another console, connect to HiveServer2: beeline -u 
> "jdbc:hive2://localhost:1"
> When you run an MR job like "select t1.key from src t1 join src t2 on 
> t1.key=t2.key", some of the console logs, such as the MR job info, are not 
> printed to the beeline console; they are only printed to the HiveServer2 
> console.
> When hive.async.log.enabled is set to false and HiveServer2 is restarted, the 
> output is printed to the beeline console as expected.
> The OperationLog implementation uses a ThreadLocal variable to store the 
> associated log file. When hive.async.log.enabled is set to true, the logs are 
> processed by a thread pool, and the actual threads from the pool that print 
> the messages won't be able to access the log file stored in the original 
> thread. 
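
A minimal, standalone illustration of the failure mode described above (not 
Hive code): a ThreadLocal set in the query thread is empty when read from a 
pool thread, which is why the async appender's worker cannot find the 
operation log file.
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLossDemo {
  private static final ThreadLocal<String> OPERATION_LOG_FILE = new ThreadLocal<>();

  public static void main(String[] args) throws Exception {
    OPERATION_LOG_FILE.set("/tmp/operation_logs/query1"); // set in the "query" thread
    System.out.println("query thread sees: " + OPERATION_LOG_FILE.get());

    ExecutorService pool = Executors.newSingleThreadExecutor();
    // The pool thread has its own (empty) ThreadLocal slot, so it prints null,
    // just as the async logger's worker thread cannot see the OperationLog.
    pool.submit(() ->
        System.out.println("pool thread sees: " + OPERATION_LOG_FILE.get())).get();
    pool.shutdown();
  }
}
{code}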



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16315) Describe table doesn't show num of partitions

2017-03-30 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-16315:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   2.3.0
   Status: Resolved  (was: Patch Available)

Pushed to master and branch-2. Thanks [~ashutoshc] for the review.

> Describe table doesn't show num of partitions
> -
>
> Key: HIVE-16315
> URL: https://issues.apache.org/jira/browse/HIVE-16315
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16315.1.patch, HIVE-16315.2.patch
>
>
> This doesn't comply with our wiki: 
> https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-Examples



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950261#comment-15950261
 ] 

Ashutosh Chauhan commented on HIVE-16299:
-

+1

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch, HIVE-16299.02.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950239#comment-15950239
 ] 

Hive QA commented on HIVE-16299:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861352/HIVE-16299.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10544 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4477/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4477/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4477/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861352 - PreCommit-HIVE-Build

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch, HIVE-16299.02.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16307) add IO memory usage report to LLAP UI

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950183#comment-15950183
 ] 

Hive QA commented on HIVE-16307:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861326/HIVE-16307.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=141)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteChar (batchId=175)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDate2 (batchId=175)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4476/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4476/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4476/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861326 - PreCommit-HIVE-Build

> add IO memory usage report to LLAP UI
> -
>
> Key: HIVE-16307
> URL: https://issues.apache.org/jira/browse/HIVE-16307
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16307.01.patch, HIVE-16307.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15880) Allow insert overwrite and truncate table query to use auto.purge table property

2017-03-30 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950182#comment-15950182
 ] 

Chaoyu Tang commented on HIVE-15880:


The patch looks good to me, +1.

> Allow insert overwrite and truncate table query to use auto.purge table 
> property
> 
>
> Key: HIVE-15880
> URL: https://issues.apache.org/jira/browse/HIVE-15880
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15880.01.patch, HIVE-15880.02.patch, 
> HIVE-15880.03.patch, HIVE-15880.04.patch, HIVE-15880.05.patch, 
> HIVE-15880.06.patch
>
>
> It seems inconsistent that the auto.purge property is not considered when we 
> do an INSERT OVERWRITE, while it is when we do a DROP TABLE.
> DROP TABLE doesn't move table data to Trash when auto.purge is set to true:
> {noformat}
> > create table temp(col1 string, col2 string);
> No rows affected (0.064 seconds)
> > alter table temp set tblproperties('auto.purge'='true');
> No rows affected (0.083 seconds)
> > insert into temp values ('test', 'test'), ('test2', 'test2');
> No rows affected (25.473 seconds)
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 22 2017-02-09 13:03 
> /user/hive/warehouse/temp/00_0
> #
> > drop table temp;
> No rows affected (0.242 seconds)
> # hdfs dfs -ls /user/hive/warehouse/temp
> ls: `/user/hive/warehouse/temp': No such file or directory
> #
> # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse
> #
> {noformat}
> An INSERT OVERWRITE query moves the table data to Trash even when auto.purge 
> is set to true:
> {noformat}
> > create table temp(col1 string, col2 string);
> > alter table temp set tblproperties('auto.purge'='true');
> > insert into temp values ('test', 'test'), ('test2', 'test2');
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 22 2017-02-09 13:07 
> /user/hive/warehouse/temp/00_0
> #
> > insert overwrite table temp select * from dummy;
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 26 2017-02-09 13:08 
> /user/hive/warehouse/temp/00_0
> # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse
> Found 1 items
> drwx--   - hive hive  0 2017-02-09 13:08 
> /user/hive/.Trash/Current/user/hive/warehouse/temp
> #
> {noformat}
> While move operations are not very costly on HDFS, they can add significant 
> overhead on slow FileSystems like S3. Honoring {{auto.purge}} could improve 
> the performance of {{INSERT OVERWRITE TABLE}} queries, especially when there 
> are a large number of partitions on tables located on S3, should the user 
> wish to set the auto.purge property to true.
> Similarly, a {{TRUNCATE TABLE}} query on a table with the {{auto.purge}} 
> property set to true should not move the data to Trash.
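
A hedged sketch of the decision the request implies (not the actual patch; 
only the auto.purge property name is taken from the examples above): consult 
the table properties before choosing between moving existing files to Trash 
and deleting them outright.
{code}
import java.util.Map;

public class AutoPurgeCheck {
  // True if INSERT OVERWRITE / TRUNCATE may skip the move-to-Trash step.
  static boolean skipTrash(Map<String, String> tblProperties) {
    return "true".equalsIgnoreCase(tblProperties.get("auto.purge"));
  }
}
{code}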



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16277) Exchange Partition between filesystems throws "IllegalArgumentException Wrong FS"

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16277:

Attachment: HIVE-16277.3.patch

Initial round of re-factoring done.
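
For context, a minimal sketch of the underlying pitfall behind the "Wrong FS" 
error quoted below (illustrative paths; this assumes the hadoop-aws s3a client 
is on the classpath and is not the patch itself): a FileSystem handle obtained 
from the default configuration is bound to fs.defaultFS (file:/// here), so 
renaming an s3a:// path through it fails, whereas resolving the FileSystem 
from the path itself avoids the mismatch.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WrongFsDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();      // fs.defaultFS = file:///
    Path src = new Path("s3a://bucket/table/country=USA");
    FileSystem defaultFs = FileSystem.get(conf);   // bound to file:///
    // defaultFs.rename(src, dst) would throw IllegalArgumentException:
    //   Wrong FS: s3a://bucket/table/country=USA, expected: file:///
    FileSystem srcFs = src.getFileSystem(conf);    // bound to the s3a filesystem
    System.out.println("resolved scheme: " + srcFs.getUri().getScheme());
  }
}
{code}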

> Exchange Partition between filesystems throws "IllegalArgumentException Wrong 
> FS"
> -
>
> Key: HIVE-16277
> URL: https://issues.apache.org/jira/browse/HIVE-16277
> Project: Hive
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16277.1.patch, HIVE-16277.2.patch, 
> HIVE-16277.3.patch
>
>
> The following query: {{alter table s3_tbl exchange partition (country='USA') 
> with table hdfs_tbl}} fails with the following exception:
> {code}
> Error: org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
> java.lang.IllegalArgumentException Wrong FS: 
> s3a://[bucket]/table/country=USA, expected: file:///)
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:379)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:256)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:347)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:361)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:Got exception: java.lang.IllegalArgumentException Wrong 
> FS: s3a://[bucket]/table/country=USA, expected: file:///)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.exchangeTablePartitions(Hive.java:3553)
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.exchangeTablePartition(DDLTask.java:4691)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:570)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2182)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1838)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1525)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1236)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1231)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:254)
>   ... 11 more
> Caused by: MetaException(message:Got exception: 
> java.lang.IllegalArgumentException Wrong FS: 
> s3a://[bucket]/table/country=USA, expected: file:///)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1387)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.renameDir(Warehouse.java:208)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.renameDir(Warehouse.java:200)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.exchange_partitions(HiveMetaStore.java:2967)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.exchange_partitions(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.exchange_partitions(HiveMetaStoreClient.java:690)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> 

[jira] [Commented] (HIVE-16341) Tez Task Execution Summary has incorrect input record counts on some operators

2017-03-30 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950138#comment-15950138
 ] 

Gopal V commented on HIVE-16341:


[~jdere]: I think the existing codepath assumes 
{{tez.task.generate.counters.per.io=false}}

Fixing this correctly requires per-io counters to be always enabled (check that 
and if/else the counter checks?).

> Tez Task Execution Summary has incorrect input record counts on some operators
> --
>
> Key: HIVE-16341
> URL: https://issues.apache.org/jira/browse/HIVE-16341
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16341.1.patch
>
>
> {noformat}
> Task Execution Summary
> 
>   VERTICES  TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS   DURATION(ms)  
> CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> 
>  Map 1  1670 0   17640.00 
> 2,109,200   23,068150,000,004  11,995,136
> Map 1150 0   10559.00
> 71,960  633  4,023,690 799,900
> Map 1310 02244.00 
> 6,090   29 25   3
>  Map 310 02849.00 
> 7,080   99 25   3
>  Map 5  2710 0   55834.00
> 12,934,890  358,376  1,500,000,001   1,500,000,161
>  Map 7  2410 0   91243.00 
> 5,020,860   71,182  1,827,250,341 652,413,443
> Reducer 1010 01010.00 
> 1,9000  4   0
> Reducer 1210 03854.00 
> 1,3200799,900   1
> Reducer 1410 01420.00 
> 3,790   45  3   1
>  Reducer 210 09720.00 
> 6,220  122 11,995,136   1
>  Reducer 410 0 810.00 
> 2,100  105  3   1
>  Reducer 610 0   24863.00 
> 3,2605  1,500,000,161   1
>  Reducer 8  4120 0   88215.00
> 17,106,440  184,524  2,165,208,640   1,864
>  Reducer 920 0   29752.00 
> 3,9800  1,864   4
> 
> {noformat}
> Seeing this on queries using runtime filtering. Noticed the INPUT_RECORDS 
> look incorrect for the reducers that are responsible for aggregating the 
> min/max/bloomfilter (Reducers 12, 14, 2, 6). For example Reducer 2 shows 12M 
> input records. However looking at the task logs for Reducer 2, there were 
> only 167 input records.
> It looks like Map 1 has 2 different output vertices (Reducer 2 and Reducer 
> 8), but the total output rows for Map 1 (rather than just the rows going to 
> each specific vertex) is being counted in the input rows for both Reducer 2 
> and Reducer 8.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16293) Column pruner should continue to work when SEL has more than 1 child

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16293:
---
Attachment: HIVE-16293.02.patch

address [~ashutoshc]'s comments.

> Column pruner should continue to work when SEL has more than 1 child
> 
>
> Key: HIVE-16293
> URL: https://issues.apache.org/jira/browse/HIVE-16293
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16293.01.patch, HIVE-16293.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16333) remove the redundant symbol "\" that appears red in sublime text 3

2017-03-30 Thread Saijin Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saijin Huang updated HIVE-16333:

Attachment: HIVE-16333.2.patch

> remove the redundant symbol "\" that appears red in sublime text 3
> ---
>
> Key: HIVE-16333
> URL: https://issues.apache.org/jira/browse/HIVE-16333
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Attachments: HIVE-16333.1.patch, HIVE-16333.2.patch
>
>
> In TxnHandler.java, I found a redundant symbol "\" in the function 
> getOpenTxns(), which causes it to appear red in Sublime Text 3.
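
For illustration only (a hypothetical string, not the actual TxnHandler 
query): inside a double-quoted Java string the escape \' is legal but 
redundant, and some editor grammars, Sublime Text 3 among them, flag such 
escapes in red even though javac accepts them.
{code}
public class RedundantEscapeDemo {
  // \' is a valid but unnecessary escape inside double quotes; the compiled
  // string is identical to the unescaped form.
  static final String REDUNDANT = "txn_state = \'o\'";
  static final String CLEAN     = "txn_state = 'o'";

  public static void main(String[] args) {
    System.out.println(REDUNDANT.equals(CLEAN)); // true: same runtime value
  }
}
{code}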



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16293) Column pruner should continue to work when SEL has more than 1 child

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16293:
---
Status: Open  (was: Patch Available)

> Column pruner should continue to work when SEL has more than 1 child
> 
>
> Key: HIVE-16293
> URL: https://issues.apache.org/jira/browse/HIVE-16293
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16293.01.patch, HIVE-16293.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16293) Column pruner should continue to work when SEL has more than 1 child

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16293:
---
Status: Patch Available  (was: Open)

> Column pruner should continue to work when SEL has more than 1 child
> 
>
> Key: HIVE-16293
> URL: https://issues.apache.org/jira/browse/HIVE-16293
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16293.01.patch, HIVE-16293.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16333) remove the redundant symbol "\" that appears red in sublime text 3

2017-03-30 Thread Saijin Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saijin Huang updated HIVE-16333:

Description: 
In TxnHandler.java, I found a redundant symbol "\" in the function 
getOpenTxns(), which causes it to appear red in Sublime Text 3.


  was:
In TxnHandler.java, I found a redundant symbol "\" in the function 
getOpenTxns(), which causes it to appear red in Sublime Text 3.
!1.PNG!


> remove the redundant symbol "\" that appears red in sublime text 3
> ---
>
> Key: HIVE-16333
> URL: https://issues.apache.org/jira/browse/HIVE-16333
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Attachments: HIVE-16333.1.patch
>
>
> In TxnHandler.java, I found a redundant symbol "\" in the function 
> getOpenTxns(), which causes it to appear red in Sublime Text 3.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16333) remove the redundant symbol "\" that appears red in sublime text 3

2017-03-30 Thread Saijin Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saijin Huang updated HIVE-16333:

Attachment: (was: 1.PNG)

> remove the redundant symbol "\" that appears red in sublime text 3
> ---
>
> Key: HIVE-16333
> URL: https://issues.apache.org/jira/browse/HIVE-16333
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Attachments: HIVE-16333.1.patch
>
>
> In TxnHandler.java, I found a redundant symbol "\" in the function 
> getOpenTxns(), which causes it to appear red in Sublime Text 3.
> !1.PNG!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16329) TopN: TopNHash totalFreeMemory computation ignores free memory

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950131#comment-15950131
 ] 

Hive QA commented on HIVE-16329:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861318/HIVE-16329.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=141)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4475/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4475/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4475/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861318 - PreCommit-HIVE-Build

> TopN: TopNHash totalFreeMemory computation ignores free memory
> --
>
> Key: HIVE-16329
> URL: https://issues.apache.org/jira/browse/HIVE-16329
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16329.1.patch, HIVE-16329.2.patch
>
>
> {code}
>   // TODO: For LLAP, assumption is off-heap cache.
>   final long memoryUsedPerExecutor = 
> (memoryMXBean.getHeapMemoryUsage().getUsed() / numExecutors);
>   // this is total free memory available per executor in case of LLAP
>   totalFreeMemory = conf.getMaxMemoryAvailable() - memoryUsedPerExecutor;
> {code}
> {code}
> exec.TopNHash: isTez parameters -615768144 = 5312782848 - 71142611912 / 12
> {code}
> This turns off the TopNHash entirely, causing something trivial like 
> {code}
> select c_custkey, count(1) from customer group by c_custkey limit 10;
> {code}
> to shuffle 30M rows instead of 10.
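
Working through the numbers in the log line above (values copied from the 
description; the final clamp is only an illustrative mitigation, not the 
patch):
{code}
public class TopNMemoryCalc {
  public static void main(String[] args) {
    long maxMemoryAvailable = 5_312_782_848L;  // conf.getMaxMemoryAvailable()
    long heapUsed = 71_142_611_912L;           // JVM-wide heap usage, all executors
    int numExecutors = 12;
    long memoryUsedPerExecutor = heapUsed / numExecutors;          // 5,928,550,992
    long totalFreeMemory = maxMemoryAvailable - memoryUsedPerExecutor;
    System.out.println(totalFreeMemory);       // -615768144, matching the log
    // A negative budget disables TopNHash entirely; clamping would at least
    // avoid treating "no headroom" as "turn the feature off".
    long clamped = Math.max(0L, totalFreeMemory);
    System.out.println(clamped);
  }
}
{code}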



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16341) Tez Task Execution Summary has incorrect input record counts on some operators

2017-03-30 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16341:
--
Status: Patch Available  (was: Open)

> Tez Task Execution Summary has incorrect input record counts on some operators
> --
>
> Key: HIVE-16341
> URL: https://issues.apache.org/jira/browse/HIVE-16341
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16341.1.patch
>
>
> {noformat}
> Task Execution Summary
> 
>   VERTICES  TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS   DURATION(ms)  
> CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> 
>  Map 1  1670 0   17640.00 
> 2,109,200   23,068150,000,004  11,995,136
> Map 1150 0   10559.00
> 71,960  633  4,023,690 799,900
> Map 1310 02244.00 
> 6,090   29 25   3
>  Map 310 02849.00 
> 7,080   99 25   3
>  Map 5  2710 0   55834.00
> 12,934,890  358,376  1,500,000,001   1,500,000,161
>  Map 7  2410 0   91243.00 
> 5,020,860   71,182  1,827,250,341 652,413,443
> Reducer 1010 01010.00 
> 1,9000  4   0
> Reducer 1210 03854.00 
> 1,3200799,900   1
> Reducer 1410 01420.00 
> 3,790   45  3   1
>  Reducer 210 09720.00 
> 6,220  122 11,995,136   1
>  Reducer 410 0 810.00 
> 2,100  105  3   1
>  Reducer 610 0   24863.00 
> 3,2605  1,500,000,161   1
>  Reducer 8  4120 0   88215.00
> 17,106,440  184,524  2,165,208,640   1,864
>  Reducer 920 0   29752.00 
> 3,9800  1,864   4
> 
> {noformat}
> Seeing this on queries using runtime filtering. Noticed the INPUT_RECORDS 
> look incorrect for the reducers that are responsible for aggregating the 
> min/max/bloomfilter (Reducers 12, 14, 2, 6). For example Reducer 2 shows 12M 
> input records. However looking at the task logs for Reducer 2, there were 
> only 167 input records.
> It looks like Map 1 has 2 different output vertices (Reducer 2 and Reducer 
> 8), but the total output rows for Map 1 (rather than just the rows going to 
> each specific vertex) is being counted in the input rows for both Reducer 2 
> and Reducer 8.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16341) Tez Task Execution Summary has incorrect input record counts on some operators

2017-03-30 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16341:
--
Attachment: HIVE-16341.1.patch

It looks like there are Tez counters which show the # of output records going 
from one vertex to another vertex - for example, group 
TaskCounter_Map_1_OUTPUT_Reducer_4, counter OUTPUT_BYTES. Attaching a patch to 
use this rather than the total intermediate rows for the vertex (which may 
include rows going to other vertices).

cc [~sseth] [~prasanth_j]
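
A hedged sketch of that lookup (the group naming follows the comment; treating 
OUTPUT_RECORDS as the per-edge record counter in that group is an assumption, 
and the API usage is illustrative rather than the actual patch):
{code}
import org.apache.tez.common.counters.TezCounters;

public class EdgeCounterLookup {
  // Read the records a source vertex sent specifically to one destination,
  // e.g. group "TaskCounter_Map_1_OUTPUT_Reducer_2", instead of the
  // vertex-wide total that also counts rows bound for other vertices.
  static long edgeOutputRecords(TezCounters counters, String src, String dest) {
    String group = "TaskCounter_" + src.replace(' ', '_')
        + "_OUTPUT_" + dest.replace(' ', '_');
    return counters.findCounter(group, "OUTPUT_RECORDS").getValue();
  }
}
{code}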

> Tez Task Execution Summary has incorrect input record counts on some operators
> --
>
> Key: HIVE-16341
> URL: https://issues.apache.org/jira/browse/HIVE-16341
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16341.1.patch
>
>
> {noformat}
> Task Execution Summary
> 
>   VERTICES  TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS   DURATION(ms)  
> CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> 
>  Map 1  1670 0   17640.00 
> 2,109,200   23,068150,000,004  11,995,136
> Map 1150 0   10559.00
> 71,960  633  4,023,690 799,900
> Map 1310 02244.00 
> 6,090   29 25   3
>  Map 310 02849.00 
> 7,080   99 25   3
>  Map 5  2710 0   55834.00
> 12,934,890  358,376  1,500,000,001   1,500,000,161
>  Map 7  2410 0   91243.00 
> 5,020,860   71,182  1,827,250,341 652,413,443
> Reducer 1010 01010.00 
> 1,9000  4   0
> Reducer 1210 03854.00 
> 1,3200799,900   1
> Reducer 1410 01420.00 
> 3,790   45  3   1
>  Reducer 210 09720.00 
> 6,220  122 11,995,136   1
>  Reducer 410 0 810.00 
> 2,100  105  3   1
>  Reducer 610 0   24863.00 
> 3,2605  1,500,000,161   1
>  Reducer 8  4120 0   88215.00
> 17,106,440  184,524  2,165,208,640   1,864
>  Reducer 920 0   29752.00 
> 3,9800  1,864   4
> 
> {noformat}
> Seeing this on queries using runtime filtering. Noticed the INPUT_RECORDS 
> look incorrect for the reducers that are responsible for aggregating the 
> min/max/bloomfilter (Reducers 12, 14, 2, 6). For example Reducer 2 shows 12M 
> input records. However looking at the task logs for Reducer 2, there were 
> only 167 input records.
> It looks like Map 1 has 2 different output vertices (Reducer 2 and Reducer 
> 8), but the total output rows for Map 1 (rather than just the rows going to 
> each specific vertex) is being counted in the input rows for both Reducer 2 
> and Reducer 8.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16341) Tez Task Execution Summary has incorrect input record counts on some operators

2017-03-30 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-16341:
-


> Tez Task Execution Summary has incorrect input record counts on some operators
> --
>
> Key: HIVE-16341
> URL: https://issues.apache.org/jira/browse/HIVE-16341
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> {noformat}
> Task Execution Summary
> 
>   VERTICES  TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS   DURATION(ms)  
> CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> 
>  Map 1  1670 0   17640.00 
> 2,109,200   23,068150,000,004  11,995,136
> Map 1150 0   10559.00
> 71,960  633  4,023,690 799,900
> Map 1310 02244.00 
> 6,090   29 25   3
>  Map 310 02849.00 
> 7,080   99 25   3
>  Map 5  2710 0   55834.00
> 12,934,890  358,376  1,500,000,001   1,500,000,161
>  Map 7  2410 0   91243.00 
> 5,020,860   71,182  1,827,250,341 652,413,443
> Reducer 1010 01010.00 
> 1,9000  4   0
> Reducer 1210 03854.00 
> 1,3200799,900   1
> Reducer 1410 01420.00 
> 3,790   45  3   1
>  Reducer 210 09720.00 
> 6,220  122 11,995,136   1
>  Reducer 410 0 810.00 
> 2,100  105  3   1
>  Reducer 610 0   24863.00 
> 3,2605  1,500,000,161   1
>  Reducer 8  4120 0   88215.00
> 17,106,440  184,524  2,165,208,640   1,864
>  Reducer 920 0   29752.00 
> 3,9800  1,864   4
> 
> {noformat}
> Seeing this on queries using runtime filtering. Noticed the INPUT_RECORDS 
> look incorrect for the reducers that are responsible for aggregating the 
> min/max/bloomfilter (Reducers 12, 14, 2, 6). For example Reducer 2 shows 12M 
> input records. However looking at the task logs for Reducer 2, there were 
> only 167 input records.
> It looks like Map 1 has 2 different output vertices (Reducer 2 and Reducer 
> 8), but the total output rows for Map 1 (rather than just the rows going to 
> each specific vertex) is being counted in the input rows for both Reducer 2 
> and Reducer 8.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16311) Improve the performance for FastHiveDecimalImpl.fastDivide

2017-03-30 Thread Colin Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Ma updated HIVE-16311:

Attachment: HIVE-16311.004.patch

> Improve the performance for FastHiveDecimalImpl.fastDivide
> --
>
> Key: HIVE-16311
> URL: https://issues.apache.org/jira/browse/HIVE-16311
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
> Fix For: 3.0.0
>
> Attachments: HIVE-16311.001.patch, HIVE-16311.002.patch, 
> HIVE-16311.003.patch, HIVE-16311.004.patch
>
>
> FastHiveDecimalImpl.fastDivide performs poorly when evaluating an 
> expression such as 12345.67/123.45
> There are 2 points that can be improved:
> 1. Don't always use HiveDecimal.MAX_SCALE as the scale when doing the 
> BigDecimal.divide.
> 2. Get the precision for the BigInteger in a fast way if possible.
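
For point 1, a hedged sketch of the idea (illustrative only, not the attached 
patch; the scale heuristic is an assumption):

{code}
import java.math.BigDecimal;
import java.math.RoundingMode;

public class FastDivideSketch {
    // Instead of always dividing at HiveDecimal.MAX_SCALE (38), pick a scale
    // just large enough for the result, capped at 38.
    static BigDecimal divide(BigDecimal dividend, BigDecimal divisor) {
        int scale = Math.min(38, dividend.scale() + divisor.precision() + 4);
        return dividend.divide(divisor, scale, RoundingMode.HALF_UP)
                       .stripTrailingZeros();
    }

    public static void main(String[] args) {
        // 12345.67 / 123.45 is computed at scale 11 here instead of 38
        System.out.println(divide(new BigDecimal("12345.67"),
                                  new BigDecimal("123.45")));
    }
}
{code}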



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16277) Exchange Partition between filesystems throws "IllegalArgumentException Wrong FS"

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16277:

Attachment: HIVE-16277.2.patch

Updated the patch to address test failures. Still need to do some refactoring.

> Exchange Partition between filesystems throws "IllegalArgumentException Wrong 
> FS"
> -
>
> Key: HIVE-16277
> URL: https://issues.apache.org/jira/browse/HIVE-16277
> Project: Hive
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16277.1.patch, HIVE-16277.2.patch
>
>
> The following query: {{alter table s3_tbl exchange partition (country='USA') 
> with table hdfs_tbl}} fails with the following exception:
> {code}
> Error: org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
> java.lang.IllegalArgumentException Wrong FS: 
> s3a://[bucket]/table/country=USA, expected: file:///)
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:379)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:256)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:347)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:361)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:Got exception: java.lang.IllegalArgumentException Wrong 
> FS: s3a://[bucket]/table/country=USA, expected: file:///)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.exchangeTablePartitions(Hive.java:3553)
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.exchangeTablePartition(DDLTask.java:4691)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:570)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2182)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1838)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1525)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1236)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1231)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:254)
>   ... 11 more
> Caused by: MetaException(message:Got exception: 
> java.lang.IllegalArgumentException Wrong FS: 
> s3a://[bucket]/table/country=USA, expected: file:///)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1387)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.renameDir(Warehouse.java:208)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.renameDir(Warehouse.java:200)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.exchange_partitions(HiveMetaStore.java:2967)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.exchange_partitions(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.exchange_partitions(HiveMetaStoreClient.java:690)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> 

[jira] [Commented] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950106#comment-15950106
 ] 

Vihang Karajgaonkar commented on HIVE-16299:


Linked the review URL.

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch, HIVE-16299.02.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16299:
---
Attachment: HIVE-16299.02.patch

Updating the patch with a better implementation. The patch changes the 
parallel file listing algorithm so that directory structures which do not 
follow the partition key specs are not searched. This early-exit strategy will 
also improve query response time on slower filesystems like S3, and when the 
partition directory structure does not conform to the partition definitions, 
MSCK will throw an exception or log a warning based on the value of the 
{{hive.msck.path.validation}} configuration.
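
A hypothetical sketch of the early-exit check (the names below are 
assumptions, not the actual HiveMetaStoreChecker code):

{code}
import java.util.Arrays;
import java.util.List;

public class PartitionPathSketch {
    // At depth N below the table root, only descend into directories whose
    // name uses the Nth partition key; anything else is invalid.
    static boolean matchesLevel(String dirName, List<String> partKeys, int depth) {
        int eq = dirName.indexOf('=');
        if (eq <= 0 || depth >= partKeys.size()) {
            return false; // not col=val, or deeper than the partition spec
        }
        return dirName.substring(0, eq).equalsIgnoreCase(partKeys.get(depth));
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c");
        System.out.println(matchesLevel("a=1", keys, 0)); // true: descend
        System.out.println(matchesLevel("c=3", keys, 0)); // false: skip or throw,
                                                          // per hive.msck.path.validation
    }
}
{code}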

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch, HIVE-16299.02.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16329) TopN: TopNHash totalFreeMemory computation ignores free memory

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950097#comment-15950097
 ] 

Hive QA commented on HIVE-16329:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861318/HIVE-16329.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=141)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4474/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4474/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4474/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861318 - PreCommit-HIVE-Build

> TopN: TopNHash totalFreeMemory computation ignores free memory
> --
>
> Key: HIVE-16329
> URL: https://issues.apache.org/jira/browse/HIVE-16329
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16329.1.patch, HIVE-16329.2.patch
>
>
> {code}
>   // TODO: For LLAP, assumption is off-heap cache.
>   final long memoryUsedPerExecutor = 
> (memoryMXBean.getHeapMemoryUsage().getUsed() / numExecutors);
>   // this is total free memory available per executor in case of LLAP
>   totalFreeMemory = conf.getMaxMemoryAvailable() - memoryUsedPerExecutor;
> {code}
> {code}
> exec.TopNHash: isTez parameters -615768144 = 5312782848 - 71142611912 / 12
> {code}
> This turns off the TopNHash entirely causing something trivial like 
> {code}
> select c_custkey, count(1) from customer group by c_custkey limit 10;
> {code}
> To shuffle 30M rows instead of 10.
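
The arithmetic from the log line reproduces the negative budget (a minimal 
sketch; the variable names mirror the snippet above):

{code}
public class TopNMemorySketch {
    public static void main(String[] args) {
        // From the log line: -615768144 = 5312782848 - 71142611912 / 12
        long maxMemoryAvailable = 5_312_782_848L; // per-executor budget
        long usedHeap = 71_142_611_912L;          // whole-process heap usage
        int numExecutors = 12;
        long totalFreeMemory = maxMemoryAvailable - usedHeap / numExecutors;
        System.out.println(totalFreeMemory); // prints -615768144
    }
}
{code}

Because the computation charges the executor for all used heap (ignoring, per 
the issue title, memory that is actually free), totalFreeMemory goes negative 
and the TopNHash is switched off.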



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16340) Allow Kerberos + SSL connections to HMS

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16340:

Attachment: HIVE-16340.1.patch

Attaching an initial patch to get some feedback from Hive QA. Unit tests are 
coming.

> Allow Kerberos + SSL connections to HMS
> ---
>
> Key: HIVE-16340
> URL: https://issues.apache.org/jira/browse/HIVE-16340
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16340.1.patch
>
>
> It should be possible to connect to HMS with Kerberos authentication and SSL 
> enabled, at the same time.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16340) Allow Kerberos + SSL connections to HMS

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16340:

Status: Patch Available  (was: Open)

> Allow Kerberos + SSL connections to HMS
> ---
>
> Key: HIVE-16340
> URL: https://issues.apache.org/jira/browse/HIVE-16340
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16340.1.patch
>
>
> It should be possible to connect to HMS with Kerberos authentication and SSL 
> enabled, at the same time.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16340) Allow Kerberos + SSL connections to HMS

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-16340:
---


> Allow Kerberos + SSL connections to HMS
> ---
>
> Key: HIVE-16340
> URL: https://issues.apache.org/jira/browse/HIVE-16340
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> It should be possible to connect to HMS with Kerberos authentication and SSL 
> enabled, at the same time.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-12437) SMB join in tez fails when one of the tables is empty

2017-03-30 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950063#comment-15950063
 ] 

Vaibhav Gumashta commented on HIVE-12437:
-

Removing 1.2.2, as this is an improvement on top of HIVE-11356, which is being 
reverted from branch-1.2.

> SMB join in tez fails when one of the tables is empty
> -
>
> Key: HIVE-12437
> URL: https://issues.apache.org/jira/browse/HIVE-12437
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.1, 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>Priority: Critical
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12437.1.patch, HIVE-12437.2.patch
>
>
> It looks like a better check for empty tables is to depend on the existence 
> of the record reader for the input from tez. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11356) SMB join on tez fails when one of the tables is empty

2017-03-30 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950062#comment-15950062
 ] 

Vaibhav Gumashta commented on HIVE-11356:
-

Removing 1.2.2 and reverting this from branch-1.2, as the test this patch adds 
fails on commit itself.

> SMB join on tez fails when one of the tables is empty
> -
>
> Key: HIVE-11356
> URL: https://issues.apache.org/jira/browse/HIVE-11356
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11356.1.patch, HIVE-11356.3.patch, 
> HIVE-11356.4.patch, HIVE-11356.5.patch, HIVE-11356.6.patch
>
>
> {code}
> :java.lang.IllegalStateException: Unexpected event. All physical sources 
> already initialized 
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145) 
> at 
> org.apache.tez.mapreduce.input.MultiMRInput.handleEvents(MultiMRInput.java:142)
>  
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:610)
>  
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:90)
>  
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:673)
>  
> at java.lang.Thread.run(Thread.java:745) 
> ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
> vertex_1437168420060_17787_1_01 [Map 4] killed/failed due to:null] 
> Vertex killed, vertexName=Reducer 5, 
> vertexId=vertex_1437168420060_17787_1_02, diagnostics=[Vertex received Kill 
> while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, 
> Vertex vertex_1437168420060_17787_1_02 [Reducer 5] killed/failed due to:null] 
> DAG failed due to vertex failure. failedVertices:1 killedVertices:1 
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask 
> HQL-FAILED 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12437) SMB join in tez fails when one of the tables is empty

2017-03-30 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12437:

Target Version/s: 2.0.0, 1.3.0  (was: 1.3.0, 2.0.0, 1.2.2)

> SMB join in tez fails when one of the tables is empty
> -
>
> Key: HIVE-12437
> URL: https://issues.apache.org/jira/browse/HIVE-12437
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.1, 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>Priority: Critical
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12437.1.patch, HIVE-12437.2.patch
>
>
> It looks like a better check for empty tables is to depend on the existence 
> of the record reader for the input from tez. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12437) SMB join in tez fails when one of the tables is empty

2017-03-30 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12437:

Fix Version/s: (was: 1.2.2)

> SMB join in tez fails when one of the tables is empty
> -
>
> Key: HIVE-12437
> URL: https://issues.apache.org/jira/browse/HIVE-12437
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.1, 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>Priority: Critical
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12437.1.patch, HIVE-12437.2.patch
>
>
> It looks like a better check for empty tables is to depend on the existence 
> of the record reader for the input from tez. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-11356) SMB join on tez fails when one of the tables is empty

2017-03-30 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-11356:

Fix Version/s: (was: 1.2.2)

> SMB join on tez fails when one of the tables is empty
> -
>
> Key: HIVE-11356
> URL: https://issues.apache.org/jira/browse/HIVE-11356
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11356.1.patch, HIVE-11356.3.patch, 
> HIVE-11356.4.patch, HIVE-11356.5.patch, HIVE-11356.6.patch
>
>
> {code}
> :java.lang.IllegalStateException: Unexpected event. All physical sources 
> already initialized 
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145) 
> at 
> org.apache.tez.mapreduce.input.MultiMRInput.handleEvents(MultiMRInput.java:142)
>  
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:610)
>  
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:90)
>  
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:673)
>  
> at java.lang.Thread.run(Thread.java:745) 
> ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
> vertex_1437168420060_17787_1_01 [Map 4] killed/failed due to:null] 
> Vertex killed, vertexName=Reducer 5, 
> vertexId=vertex_1437168420060_17787_1_02, diagnostics=[Vertex received Kill 
> while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, 
> Vertex vertex_1437168420060_17787_1_02 [Reducer 5] killed/failed due to:null] 
> DAG failed due to vertex failure. failedVertices:1 killedVertices:1 
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask 
> HQL-FAILED 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16339) QTestUtil pattern masking should only partially mask paths

2017-03-30 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950060#comment-15950060
 ] 

Sahil Takiar commented on HIVE-16339:
-

I have a bad feeling there are going to be a lot of qtest failures due to 
failed diffs. I still think this will be a useful feature though.

> QTestUtil pattern masking should only partially mask paths
> --
>
> Key: HIVE-16339
> URL: https://issues.apache.org/jira/browse/HIVE-16339
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16339.1.patch
>
>
> QTestUtil will mask an entire line in .q.out files if it sees any of the 
> target mask patterns. This seems unnecessary for patterns such as "pfile:", 
> "file:", and "hdfs:" which are targeted towards masking file paths.
> Just because a line in .q.out contains a path doesn't mean the entire line 
> should be masked. The line could contain useful information. It would be 
> better if just the file path could be masked.
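
A hedged sketch of partial masking (not the actual QTestUtil change; the 
placeholder text is an assumption):

{code}
import java.util.regex.Pattern;

public class PartialMaskSketch {
    // Replace only the URI itself instead of dropping the whole line.
    private static final Pattern PATH =
        Pattern.compile("(pfile:|file:|hdfs:)\\S+");

    static String mask(String line) {
        return PATH.matcher(line).replaceAll("### PATH ###");
    }

    public static void main(String[] args) {
        System.out.println(mask("Location: file:/warehouse/t numFiles 2"));
        // -> "Location: ### PATH ### numFiles 2"; the counters survive masking
    }
}
{code}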



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16339) QTestUtil pattern masking should only partially mask paths

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16339:

Status: Patch Available  (was: Open)

> QTestUtil pattern masking should only partially mask paths
> --
>
> Key: HIVE-16339
> URL: https://issues.apache.org/jira/browse/HIVE-16339
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16339.1.patch
>
>
> QTestUtil will mask an entire line in .q.out files if it sees any of the 
> target mask patterns. This seems unnecessary for patterns such as "pfile:", 
> "file:", and "hdfs:" which are targeted towards masking file paths.
> Just because a line in .q.out contains a path doesn't mean the entire line 
> should be masked. The line could contain useful information. It would be 
> better if just the file path could be masked.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16339) QTestUtil pattern masking should only partially mask paths

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16339:

Attachment: HIVE-16339.1.patch

> QTestUtil pattern masking should only partially mask paths
> --
>
> Key: HIVE-16339
> URL: https://issues.apache.org/jira/browse/HIVE-16339
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16339.1.patch
>
>
> QTestUtil will mask an entire line in .q.out files if it sees any of the 
> target mask patterns. This seems unnecessary for patterns such as "pfile:", 
> "file:", and "hdfs:" which are targeted towards masking file paths.
> Just because a line in .q.out contains a path doesn't mean the entire line 
> should be masked. The line could contain useful information. It would be 
> better if just the file path could be masked.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16339) QTestUtil pattern masking should only partially mask paths

2017-03-30 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-16339:
---


> QTestUtil pattern masking should only partially mask paths
> --
>
> Key: HIVE-16339
> URL: https://issues.apache.org/jira/browse/HIVE-16339
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> QTestUtil will mask an entire line in .q.out files if it sees any of the 
> target mask patterns. This seems unnecessary for patterns such as "pfile:", 
> "file:", and "hdfs:" which are targeted towards masking file paths.
> Just because a line in .q.out contains a path doesn't mean the entire line 
> should be masked. The line could contain useful information. It would be 
> better if just the file path could be masked.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16296) use LLAP executor count to configure reducer auto-parallelism

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950056#comment-15950056
 ] 

Hive QA commented on HIVE-16296:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861317/HIVE-16296.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=137)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_binary_join_groupby]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets_limit]
 (batchId=147)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4473/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4473/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4473/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861317 - PreCommit-HIVE-Build

> use LLAP executor count to configure reducer auto-parallelism
> -
>
> Key: HIVE-16296
> URL: https://issues.apache.org/jira/browse/HIVE-16296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16296.01.patch, HIVE-16296.03.patch, 
> HIVE-16296.04.patch, HIVE-16296.05.patch, HIVE-16296.2.patch, HIVE-16296.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15923) Hive default partition causes errors in get partitions

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949976#comment-15949976
 ] 

Hive QA commented on HIVE-15923:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861304/HIVE-15923.03.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=141)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4472/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4472/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4472/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861304 - PreCommit-HIVE-Build

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-15923.01.patch, HIVE-15923.02.patch, 
> HIVE-15923.03.patch, HIVE-15923.patch
>
>
> This is the ORM error; direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16228) Support subqueries in complex expression in SELECT clause

2017-03-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16228:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Vineet!

> Support subqueries in complex expression in SELECT clause
> -
>
> Key: HIVE-16228
> URL: https://issues.apache.org/jira/browse/HIVE-16228
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-16091.2.patch, HIVE-16228.1.patch
>
>
> HIVE-16091 added support for subqueries in SELECT clause but restricted 
> subqueries to top level expressions (more detail is at [LINK | 
> https://cwiki.apache.org/confluence/display/Hive/Subqueries+in+SELECT])
> This restriction will be relaxed to allow subqueries in all kinds of 
> expressions except UDAF (including UDAs and UDTFs).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15515) Remove the docs directory

2017-03-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15515:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Akira!

> Remove the docs directory
> -
>
> Key: HIVE-15515
> URL: https://issues.apache.org/jira/browse/HIVE-15515
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Lefty Leverenz
>Assignee: Akira Ajisaka
> Fix For: 3.0.0
>
> Attachments: HIVE-15515.01.patch, HIVE-15515.02.patch
>
>
> Hive xdocs have not been used since 2012.  The docs directory only holds six 
> xml documents, and their contents are in the wiki.
> It's past time to remove the docs directory from the Hive code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16228) Support subqueries in complex expression in SELECT clause

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949953#comment-15949953
 ] 

Ashutosh Chauhan commented on HIVE-16228:
-

+1

> Support subqueries in complex expression in SELECT clause
> -
>
> Key: HIVE-16228
> URL: https://issues.apache.org/jira/browse/HIVE-16228
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16091.2.patch, HIVE-16228.1.patch
>
>
> HIVE-16091 added support for subqueries in SELECT clause but restricted 
> subqueries to top level expressions (more detail is at [LINK | 
> https://cwiki.apache.org/confluence/display/Hive/Subqueries+in+SELECT])
> This restriction will be relaxed to allow subqueries in all kinds of 
> expressions except UDAF (including UDAs and UDTFs).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16335) Beeline user HS2 connection file should use /etc/hive/conf instead of /etc/conf/hive

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949945#comment-15949945
 ] 

Vihang Karajgaonkar commented on HIVE-16335:


[~aihuaxu] Can you please review?

> Beeline user HS2 connection file should use /etc/hive/conf instead of 
> /etc/conf/hive
> 
>
> Key: HIVE-16335
> URL: https://issues.apache.org/jira/browse/HIVE-16335
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Tim Harsch
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16335.01.patch
>
>
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
> says:  
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/conf/hive in that 
> order.
> shouldn't it be?
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/hive/conf in that 
> order?
> Most distributions I've used have a /etc/hive/conf dir.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16174) Update MetricsConstant.WAITING_COMPILE_OPS metric when we fail to acquire the lock in Driver

2017-03-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16174:

   Resolution: Fixed
Fix Version/s: (was: 2.2.0)
   3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Yunfei!

> Update MetricsConstant.WAITING_COMPILE_OPS metric when we fail to acquire 
> the lock in Driver
> --
>
> Key: HIVE-16174
> URL: https://issues.apache.org/jira/browse/HIVE-16174
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.2.0
>Reporter: yunfei liu
>Assignee: yunfei liu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16174.1.patch
>
>
> In the Driver#compileInternal method (see the code snippet below), we need to 
> update the MetricsConstant.WAITING_COMPILE_OPS metric correctly before 
> returning if the lock cannot be acquired.
> {code}
> Metrics metrics = MetricsFactory.getInstance();
> if (metrics != null) {
>   metrics.incrementCounter(MetricsConstant.WAITING_COMPILE_OPS, 1);
> }
> final ReentrantLock compileLock = tryAcquireCompileLock(isParallelEnabled,
>   command);
> if (compileLock == null) {
>   return ErrorMsg.COMPILE_LOCK_TIMED_OUT.getErrorCode();
> }
> {code}
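
A hedged sketch of a balanced version of that path (assuming the Metrics 
interface exposes a decrementCounter mirroring incrementCounter; this is a 
sketch, not necessarily the committed patch):

{code}
Metrics metrics = MetricsFactory.getInstance();
if (metrics != null) {
  metrics.incrementCounter(MetricsConstant.WAITING_COMPILE_OPS, 1);
}
final ReentrantLock compileLock = tryAcquireCompileLock(isParallelEnabled,
  command);
if (compileLock == null) {
  // balance the gauge on the early return so waiting ops are not overcounted
  if (metrics != null) {
    metrics.decrementCounter(MetricsConstant.WAITING_COMPILE_OPS, 1);
  }
  return ErrorMsg.COMPILE_LOCK_TIMED_OUT.getErrorCode();
}
{code}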



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15515) Remove the docs directory

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949916#comment-15949916
 ] 

Ashutosh Chauhan commented on HIVE-15515:
-

+1. The wiki indeed is the authoritative source.

> Remove the docs directory
> -
>
> Key: HIVE-15515
> URL: https://issues.apache.org/jira/browse/HIVE-15515
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Lefty Leverenz
>Assignee: Akira Ajisaka
> Attachments: HIVE-15515.01.patch, HIVE-15515.02.patch
>
>
> Hive xdocs have not been used since 2012.  The docs directory only holds six 
> xml documents, and their contents are in the wiki.
> It's past time to remove the docs directory from the Hive code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-1626) stop using java.util.Stack

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949894#comment-15949894
 ] 

Ashutosh Chauhan commented on HIVE-1626:


Patch needs a rebase.

> stop using java.util.Stack
> --
>
> Key: HIVE-1626
> URL: https://issues.apache.org/jira/browse/HIVE-1626
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.7.0
>Reporter: John Sichi
>Assignee: Teddy Choi
> Attachments: HIVE-1626.2.patch, HIVE-1626.2.patch, HIVE-1626.3.patch, 
> HIVE-1626.3.patch, HIVE-1626.3.patch
>
>
> We currently use Stack as part of the generic node walking library.  Stack 
> should not be used for this since its inheritance from Vector incurs 
> superfluous synchronization overhead.
> Most projects end up adding an ArrayStack implementation and using that 
> instead.
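
For illustration, a minimal unsynchronized LIFO using the JDK's ArrayDeque 
(one alternative to a hand-rolled ArrayStack; not a claim about which 
replacement the patch chooses):

{code}
import java.util.ArrayDeque;
import java.util.Deque;

public class StackSketch {
    public static void main(String[] args) {
        // java.util.Stack extends the synchronized Vector; ArrayDeque offers
        // the same push/pop/peek semantics without the locking overhead.
        Deque<String> stack = new ArrayDeque<>();
        stack.push("TableScanOperator");
        stack.push("FilterOperator");
        System.out.println(stack.pop());  // FilterOperator
        System.out.println(stack.peek()); // TableScanOperator
    }
}
{code}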



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949890#comment-15949890
 ] 

Vihang Karajgaonkar commented on HIVE-16299:


That's a good point. I think we can use this information to exit early during 
the listing phase itself. If there are invalid partition directories, we don't 
need to list them, and we can throw an error or skip them based on the value of 
{{hive.msck.path.validation}}.

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16335) Beeline user HS2 connection file should use /etc/hive/conf instead of /etc/conf/hive

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949893#comment-15949893
 ] 

Hive QA commented on HIVE-16335:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861299/HIVE-16335.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4471/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4471/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4471/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861299 - PreCommit-HIVE-Build

> Beeline user HS2 connection file should use /etc/hive/conf instead of 
> /etc/conf/hive
> 
>
> Key: HIVE-16335
> URL: https://issues.apache.org/jira/browse/HIVE-16335
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Tim Harsch
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16335.01.patch
>
>
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
> says:  
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/conf/hive in that 
> order.
> shouldn't it be?
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/hive/conf in that 
> order?
> Most distributions I've used have a /etc/hive/conf dir.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16174) Update MetricsConstant.WAITING_COMPILE_OPS metric when we fail to acquire the lock in Driver

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949886#comment-15949886
 ] 

Ashutosh Chauhan commented on HIVE-16174:
-

+1

> Update MetricsConstant.WAITING_COMPILE_OPS metric when we fail to acquire 
> the lock in Driver
> --
>
> Key: HIVE-16174
> URL: https://issues.apache.org/jira/browse/HIVE-16174
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.2.0
>Reporter: yunfei liu
>Assignee: yunfei liu
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16174.1.patch
>
>
> In the Driver#compileInternal method (see the code snippet below), we need to 
> update the MetricsConstant.WAITING_COMPILE_OPS metric correctly before 
> returning if the lock cannot be acquired.
> {code}
> Metrics metrics = MetricsFactory.getInstance();
> if (metrics != null) {
>   metrics.incrementCounter(MetricsConstant.WAITING_COMPILE_OPS, 1);
> }
> final ReentrantLock compileLock = tryAcquireCompileLock(isParallelEnabled,
>   command);
> if (compileLock == null) {
>   return ErrorMsg.COMPILE_LOCK_TIMED_OUT.getErrorCode();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16249) With column stats, mergejoin.q throws NPE

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949868#comment-15949868
 ] 

Ashutosh Chauhan commented on HIVE-16249:
-

+1

> With column stats, mergejoin.q throws NPE
> -
>
> Key: HIVE-16249
> URL: https://issues.apache.org/jira/browse/HIVE-16249
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16249.01.patch, HIVE-16249.02.patch
>
>
> stack trace:
> {code}
> 2017-03-17T16:00:26,356 ERROR [3d512d4d-72b5-48fc-92cb-0c72f7c876e5 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.NullPointerException
> at 
> org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:719)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:132)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1866)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1739)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1216)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949862#comment-15949862
 ] 

Ashutosh Chauhan commented on HIVE-16299:
-

It should use {{hive.msck.path.validation}} to throw in case the user wants 
that behavior.

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of the partition column's position in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions.
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16249) With column stats, mergejoin.q throws NPE

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16249:
---
Attachment: HIVE-16249.02.patch

> With column stats, mergejoin.q throws NPE
> -
>
> Key: HIVE-16249
> URL: https://issues.apache.org/jira/browse/HIVE-16249
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16249.01.patch, HIVE-16249.02.patch
>
>
> stack trace:
> {code}
> 2017-03-17T16:00:26,356 ERROR [3d512d4d-72b5-48fc-92cb-0c72f7c876e5 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.NullPointerException
> at 
> org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:719)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:132)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1866)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1739)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1216)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16249) With column stats, mergejoin.q throws NPE

2017-03-30 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949856#comment-15949856
 ] 

Pengcheng Xiong commented on HIVE-16249:


A new patch is submitted after the comments from [~ashutoshc] were addressed.

> With column stats, mergejoin.q throws NPE
> -
>
> Key: HIVE-16249
> URL: https://issues.apache.org/jira/browse/HIVE-16249
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16249.01.patch, HIVE-16249.02.patch
>
>
> stack trace:
> {code}
> 2017-03-17T16:00:26,356 ERROR [3d512d4d-72b5-48fc-92cb-0c72f7c876e5 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.NullPointerException
> at 
> org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:719)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:132)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1866)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1739)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1216)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16249) With column stats, mergejoin.q throws NPE

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16249:
---
Status: Open  (was: Patch Available)

> With column stats, mergejoin.q throws NPE
> -
>
> Key: HIVE-16249
> URL: https://issues.apache.org/jira/browse/HIVE-16249
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16249.01.patch, HIVE-16249.02.patch
>
>
> stack trace:
> {code}
> 2017-03-17T16:00:26,356 ERROR [3d512d4d-72b5-48fc-92cb-0c72f7c876e5 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.NullPointerException
> at 
> org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:719)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:132)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1866)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1739)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1216)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16249) With column stats, mergejoin.q throws NPE

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16249:
---
Status: Patch Available  (was: Open)

> With column stats, mergejoin.q throws NPE
> -
>
> Key: HIVE-16249
> URL: https://issues.apache.org/jira/browse/HIVE-16249
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16249.01.patch, HIVE-16249.02.patch
>
>
> stack trace:
> {code}
> 2017-03-17T16:00:26,356 ERROR [3d512d4d-72b5-48fc-92cb-0c72f7c876e5 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.NullPointerException
> at 
> org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:719)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:132)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1866)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1739)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1216)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16301) Preparing for 2.3 development.

2017-03-30 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949835#comment-15949835
 ] 

Pengcheng Xiong commented on HIVE-16301:


True, that is why I asked you to create HIVE-16316. We need to push HIVE-16316 to 
master as well, but not to 2.3.

> Preparing for 2.3 development.
> --
>
> Key: HIVE-16301
> URL: https://issues.apache.org/jira/browse/HIVE-16301
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Naveen Gangam
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-16301.2.patch, HIVE-16301.3.patch, HIVE-16301.patch
>
>
> branch-2 is now being used for 2.3.0 development. The build files will need 
> to reflect this change.
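> As an illustration (not the actual patch), the kind of change involved is the 
> Maven version bump across the poms, e.g.:
> {code}
> <version>2.3.0-SNAPSHOT</version>
> {code}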



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16061) When hive.async.log.enabled is set to true, some output is not printed to the beeline console

2017-03-30 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949831#comment-15949831
 ] 

Aihua Xu commented on HIVE-16061:
-

patch-4: Addresses the comments and fixes the unit test failures.

From the test, I noticed that previously, switching the operation log level 
(execution, verbose, ...) at runtime would also change the layout format, not 
only the output content. We can't support that now, since the layout is 
initialized during appender initialization; but the layout is the same for both 
kinds of logging (console or file), so I guess it's OK.




> When hive.async.log.enabled is set to true, some output is not printed to the 
> beeline console
> -
>
> Key: HIVE-16061
> URL: https://issues.apache.org/jira/browse/HIVE-16061
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16061.1.patch, HIVE-16061.2.patch, 
> HIVE-16061.3.patch, HIVE-16061.4.patch
>
>
> Run a hiveserver2 instance: "hive --service hiveserver2".
> Then, from another console, connect to hiveserver2: "beeline -u 
> jdbc:hive2://localhost:1".
> When you run an MR job like "select t1.key from src t1 join src t2 on 
> t1.key=t2.key", some of the console logs, such as the MR job info, are not 
> printed to the beeline console; they are printed only on the hiveserver2 side.
> When hive.async.log.enabled is set to false and HiveServer2 is restarted, 
> the output is printed to the beeline console as expected.
> The OperationLog implementation uses a ThreadLocal variable to store the 
> associated log file. When hive.async.log.enabled is set to true, the log 
> events are processed by a thread pool, and the pool threads that actually 
> print the messages cannot access the log file stored in the 
> original thread. 
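> A minimal sketch of the failure mode (illustrative code, not Hive's): a value 
> stored in a ThreadLocal by the request thread is not visible from the pool 
> thread that actually writes the event.
> {code}
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> 
> public class ThreadLocalDemo {
>   static final ThreadLocal<String> OPERATION_LOG_FILE = new ThreadLocal<>();
> 
>   public static void main(String[] args) throws Exception {
>     OPERATION_LOG_FILE.set("/tmp/operation.log"); // set on the caller thread
>     ExecutorService pool = Executors.newSingleThreadExecutor();
>     // the pool thread has its own ThreadLocal slot, so this prints "null"
>     pool.submit(() -> System.out.println(OPERATION_LOG_FILE.get())).get();
>     pool.shutdown();
>   }
> }
> {code}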



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16301) Preparing for 2.3 development.

2017-03-30 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949830#comment-15949830
 ] 

Naveen Gangam commented on HIVE-16301:
--

[~pxiong] Thanks for the commit. Quick question though: it's my understanding 
that master should be the 3.0.0 release, not 2.3.0. Am I wrong? Thanks

> Preparing for 2.3 development.
> --
>
> Key: HIVE-16301
> URL: https://issues.apache.org/jira/browse/HIVE-16301
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Naveen Gangam
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-16301.2.patch, HIVE-16301.3.patch, HIVE-16301.patch
>
>
> branch-2 is now being used for 2.3.0 development. The build files will need 
> to reflect this change.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16299:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: storage-2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of where a partition column appears in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions. 
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16299:
---
Affects Version/s: (was: storage-2.2.0)
   2.2.0

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of where a partition column appears in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions. 
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16299:
---
Attachment: HIVE-16299.01.patch

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16299.01.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of where a partition column appears in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions. 
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16061) When hive.async.log.enabled is set to true, some output is not printed to the beeline console

2017-03-30 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16061:

Attachment: HIVE-16061.4.patch

> When hive.async.log.enabled is set to true, some output is not printed to the 
> beeline console
> -
>
> Key: HIVE-16061
> URL: https://issues.apache.org/jira/browse/HIVE-16061
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16061.1.patch, HIVE-16061.2.patch, 
> HIVE-16061.3.patch, HIVE-16061.4.patch
>
>
> Run a hiveserver2 instance: "hive --service hiveserver2".
> Then, from another console, connect to hiveserver2: "beeline -u 
> jdbc:hive2://localhost:1".
> When you run an MR job like "select t1.key from src t1 join src t2 on 
> t1.key=t2.key", some of the console logs, such as the MR job info, are not 
> printed to the beeline console; they are printed only on the hiveserver2 side.
> When hive.async.log.enabled is set to false and HiveServer2 is restarted, 
> the output is printed to the beeline console as expected.
> The OperationLog implementation uses a ThreadLocal variable to store the 
> associated log file. When hive.async.log.enabled is set to true, the log 
> events are processed by a thread pool, and the pool threads that actually 
> print the messages cannot access the log file stored in the 
> original thread. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16130) Remove jackson classes from hive-jdbc standalone jar

2017-03-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16130:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Tao!

> Remove jackson classes from hive-jdbc standalone jar
> 
>
> Key: HIVE-16130
> URL: https://issues.apache.org/jira/browse/HIVE-16130
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
> Attachments: HIVE-16130.1.patch, HIVE-16130.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15829) LLAP text cache: disable memory tracking on the writer

2017-03-30 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949765#comment-15949765
 ] 

Sergey Shelukhin commented on HIVE-15829:
-

Looks like orc is still a module in 2.2, so this is not needed for 2.2. 
[~owen.omalley] please correct me if it's going to change...

> LLAP text cache: disable memory tracking on the writer
> --
>
> Key: HIVE-15829
> URL: https://issues.apache.org/jira/browse/HIVE-15829
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-15829.patch
>
>
> See ORC-141 and HIVE-15672 for context



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16336) Rename hive.spark.use.file.size.for.mapjoin to hive.spark.use.ts.stats.for.mapjoin

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949763#comment-15949763
 ] 

Hive QA commented on HIVE-16336:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861291/HIVE-16336.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=141)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4470/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4470/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4470/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861291 - PreCommit-HIVE-Build

> Rename hive.spark.use.file.size.for.mapjoin to 
> hive.spark.use.ts.stats.for.mapjoin
> --
>
> Key: HIVE-16336
> URL: https://issues.apache.org/jira/browse/HIVE-16336
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-16336.0.patch
>
>
> The name {{hive.spark.use.file.size.for.mapjoin}} is confusing. It indicates 
> that HoS uses file size for mapjoin but in fact it still uses (in-memory) 
> data size. We should change it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15829) LLAP text cache: disable memory tracking on the writer

2017-03-30 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15829:

Target Version/s: 2.1.2, 2.3.0  (was: 2.1.2, 2.2.0, 2.3.0)

> LLAP text cache: disable memory tracking on the writer
> --
>
> Key: HIVE-15829
> URL: https://issues.apache.org/jira/browse/HIVE-15829
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-15829.patch
>
>
> See ORC-141 and HIVE-15672 for context



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16130) Remove jackson classes from hive-jdbc standalone jar

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949761#comment-15949761
 ] 

Ashutosh Chauhan commented on HIVE-16130:
-

Is this ready for commit?

> Remove jackson classes from hive-jdbc standalone jar
> 
>
> Key: HIVE-16130
> URL: https://issues.apache.org/jira/browse/HIVE-16130
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-16130.1.patch, HIVE-16130.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16214) Explore the possibility of introducing a service-client module

2017-03-30 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16214:

Target Version/s: 3.0.0
 Component/s: Metastore
  JDBC

[~kgyrtkirk] Is this ready for review?

> Explore the possibility of introducing a service-client module
> ---
>
> Key: HIVE-16214
> URL: https://issues.apache.org/jira/browse/HIVE-16214
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Metastore
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16214.1.patch, HIVE-16214.2.patch
>
>
> The jdbc driver pulls in a lot of things from hive... and that may affect the 
> jdbc driver user.
> In this ticket I experiment with extracting the relevant parts of 
> service (w.r.t. the jdbc driver) into a service-client module.
> I've opened a PR... to enable a commit-by-commit view:
> https://github.com/apache/hive/pull/158



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16307) add IO memory usage report to LLAP UI

2017-03-30 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16307:

Attachment: HIVE-16307.01.patch

Updated the patch based on my testing.
[~sseth] [~prasanth_j] [~gopalv] low stakes patch, simple to review :)

> add IO memory usage report to LLAP UI
> -
>
> Key: HIVE-16307
> URL: https://issues.apache.org/jira/browse/HIVE-16307
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16307.01.patch, HIVE-16307.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16206) Make Codahale metrics reporters pluggable

2017-03-30 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949727#comment-15949727
 ] 

Carl Steinbach commented on HIVE-16206:
---

+1.

> Make Codahale metrics reporters pluggable
> -
>
> Key: HIVE-16206
> URL: https://issues.apache.org/jira/browse/HIVE-16206
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.2
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
> Attachments: HIVE-16206.2.patch, HIVE-16206.3.patch, 
> HIVE-16206.4.patch, HIVE-16206.5.patch, HIVE-16206.6.patch, HIVE-16206.patch
>
>
> Hive metrics code currently allows pluggable metrics handlers - i.e., handlers 
> that take care of providing interfaces for metrics collection as well as 
> reporting; one of the 'handlers' is CodahaleMetrics. Codahale can work with 
> different reporters - currently supported ones are Console, JMX, JSON file 
> and the hadoop2 sink. However, adding a new reporter involves changing that 
> class. We would like to make this conf-driven, just the way MetricsFactory 
> handles configurable Metrics classes.
> Scope of work:
> - Provide a new configuration option, HIVE_CODAHALE_REPORTER_CLASSES that 
> enumerates classes (like HIVE_METRICS_CLASS and unlike HIVE_METRICS_REPORTER).
> - Move JsonFileReporter into its own class.
> - Update CodahaleMetrics.java to read the new config option (and, if the new 
> option is not present, look for the old option and instantiate accordingly) - 
> i.e., make the code backward compatible; see the sketch after this list.
> - Update and add new tests.
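> A rough sketch of the intended conf-driven instantiation ({{registry}}, 
> {{conf}}, the option name, and a common {{CodahaleReporter}} interface are 
> assumptions for illustration, not the actual patch):
> {code}
> try {
>   // e.g. hive.service.metrics.codahale.reporter.classes=com.acme.MyReporter
>   String classNames =
>       conf.get("hive.service.metrics.codahale.reporter.classes", "");
>   for (String name : classNames.split(",")) {
>     if (name.trim().isEmpty()) {
>       continue;
>     }
>     // each reporter class is expected to expose a (MetricRegistry, HiveConf) constructor
>     CodahaleReporter reporter = (CodahaleReporter) Class.forName(name.trim())
>         .getConstructor(MetricRegistry.class, HiveConf.class)
>         .newInstance(registry, conf);
>     reporter.start();
>   }
> } catch (ReflectiveOperationException e) {
>   throw new IllegalStateException("Could not initialize metrics reporters", e);
> }
> {code}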



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16299) MSCK REPAIR TABLE should enforce partition key order when adding unknown partitions

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16299:
---
Summary: MSCK REPAIR TABLE should enforce partition key order when adding 
unknown partitions  (was: In case of partitioned table, MSCK REPAIR TABLE does 
not do a full validation of a FS paths and in result create false partitions 
and directories)

> MSCK REPAIR TABLE should enforce partition key order when adding unknown 
> partitions
> ---
>
> Key: HIVE-16299
> URL: https://issues.apache.org/jira/browse/HIVE-16299
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: storage-2.2.0
>Reporter: Dudu Markovitz
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
> static String getPartitionName(Path tablePath, Path partitionPath, 
> Set<String> partCols)
> 
> MSCK REPAIR validates that any sub-directory is in the format col=val and 
> that there is indeed a partition column named "col".
> However, there is no validation of where a partition column appears in the 
> path, and as a result false partitions are created, along with directories 
> that match those partitions. 
> e.g. 1
> hive> dfs -mkdir -p /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5;
> hive> create external table t (i int) partitioned by (a int,b int,c int) ;
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:a=1/a=2/a=3/b=4/c=5
> Repair: Added partition to metastore t:a=1/a=2/a=3/b=4/c=5
> Time taken: 0.563 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=3/b=4/c=5
> hive> dfs -ls -R /user/hive/warehouse/t;
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=1/a=2/a=3/b=4/c=5
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:07 
> /user/hive/warehouse/t/a=3/b=4/c=5
> e.g. 2
> hive> dfs -mkdir -p /user/hive/warehouse/t/c=3/b=2/a=1;
> hive> create external table t (i int) partitioned by (a int,b int,c int);
> OK
> hive> msck repair table t;
> OK
> Partitions not in metastore:  t:c=3/b=2/a=1
> Repair: Added partition to metastore t:c=3/b=2/a=1
> Time taken: 0.512 seconds, Fetched: 2 row(s)
> hive> show partitions t;
> OK
> a=1/b=2/c=3
> hive> dfs -ls -R  /user/hive/warehouse/t;
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2
> drwxrwxrwx   - cloudera supergroup  0 2017-03-26 13:13 
> /user/hive/warehouse/t/a=1/b=2/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2
> drwxr-xr-x   - cloudera supergroup  0 2017-03-26 13:12 
> /user/hive/warehouse/t/c=3/b=2/a=1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16329) TopN: TopNHash totalFreeMemory computation ignores free memory

2017-03-30 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16329:
---
Attachment: HIVE-16329.2.patch

> TopN: TopNHash totalFreeMemory computation ignores free memory
> --
>
> Key: HIVE-16329
> URL: https://issues.apache.org/jira/browse/HIVE-16329
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16329.1.patch, HIVE-16329.2.patch
>
>
> {code}
>   // TODO: For LLAP, assumption is off-heap cache.
>   final long memoryUsedPerExecutor = 
> (memoryMXBean.getHeapMemoryUsage().getUsed() / numExecutors);
>   // this is total free memory available per executor in case of LLAP
>   totalFreeMemory = conf.getMaxMemoryAvailable() - memoryUsedPerExecutor;
> {code}
> {code}
> exec.TopNHash: isTez parameters -615768144 = 5312782848 - 71142611912 / 12
> {code}
> This turns off the TopNHash entirely causing something trivial like 
> {code}
> select c_custkey, count(1) from customer group by c_custkey limit 10;
> {code}
> To shuffle 30M rows instead of 10.
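> Plugging the logged values into the formula: 71142611912 / 12 = 5928550992, and 
> 5312782848 - 5928550992 = -615768144. So totalFreeMemory goes negative as soon 
> as the used heap per executor exceeds conf.getMaxMemoryAvailable(), which 
> silently disables the hash.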



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16329) TopN: TopNHash totalFreeMemory computation ignores free memory

2017-03-30 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16329:
---
Attachment: (was: HIVE-16329.2.patch)

> TopN: TopNHash totalFreeMemory computation ignores free memory
> --
>
> Key: HIVE-16329
> URL: https://issues.apache.org/jira/browse/HIVE-16329
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16329.1.patch, HIVE-16329.2.patch
>
>
> {code}
>   // TODO: For LLAP, assumption is off-heap cache.
>   final long memoryUsedPerExecutor = 
> (memoryMXBean.getHeapMemoryUsage().getUsed() / numExecutors);
>   // this is total free memory available per executor in case of LLAP
>   totalFreeMemory = conf.getMaxMemoryAvailable() - memoryUsedPerExecutor;
> {code}
> {code}
> exec.TopNHash: isTez parameters -615768144 = 5312782848 - 71142611912 / 12
> {code}
> This turns off the TopNHash entirely causing something trivial like 
> {code}
> select c_custkey, count(1) from customer group by c_custkey limit 10;
> {code}
> To shuffle 30M rows instead of 10.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16296) use LLAP executor count to configure reducer auto-parallelism

2017-03-30 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16296:

Attachment: HIVE-16296.05.patch

A couple more updates

> use LLAP executor count to configure reducer auto-parallelism
> -
>
> Key: HIVE-16296
> URL: https://issues.apache.org/jira/browse/HIVE-16296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16296.01.patch, HIVE-16296.03.patch, 
> HIVE-16296.04.patch, HIVE-16296.05.patch, HIVE-16296.2.patch, HIVE-16296.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16329) TopN: TopNHash totalFreeMemory computation ignores free memory

2017-03-30 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16329:
---
Attachment: HIVE-16329.2.patch

> TopN: TopNHash totalFreeMemory computation ignores free memory
> --
>
> Key: HIVE-16329
> URL: https://issues.apache.org/jira/browse/HIVE-16329
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16329.1.patch, HIVE-16329.2.patch
>
>
> {code}
>   // TODO: For LLAP, assumption is off-heap cache.
>   final long memoryUsedPerExecutor = 
> (memoryMXBean.getHeapMemoryUsage().getUsed() / numExecutors);
>   // this is total free memory available per executor in case of LLAP
>   totalFreeMemory = conf.getMaxMemoryAvailable() - memoryUsedPerExecutor;
> {code}
> {code}
> exec.TopNHash: isTez parameters -615768144 = 5312782848 - 71142611912 / 12
> {code}
> This turns off the TopNHash entirely causing something trivial like 
> {code}
> select c_custkey, count(1) from customer group by c_custkey limit 10;
> {code}
> To shuffle 30M rows instead of 10.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16297) Improving hive logging configuration variables

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949676#comment-15949676
 ] 

Hive QA commented on HIVE-16297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861292/HIVE-16297.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10541 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=141)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4469/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4469/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4469/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861292 - PreCommit-HIVE-Build

> Improving hive logging configuration variables
> --
>
> Key: HIVE-16297
> URL: https://issues.apache.org/jira/browse/HIVE-16297
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16297.01.patch, HIVE-16297.02.patch
>
>
> There are a few places in the source code where we use 
> {{Configuration.dumpConfiguration()}}. We should preprocess the configuration 
> properties before dumping them in the logs.
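> A minimal sketch of such preprocessing (the helper name and the redaction rule 
> are assumptions, not the actual patch):
> {code}
> // redact likely credentials before handing the copy to dumpConfiguration()
> static org.apache.hadoop.conf.Configuration redactedCopy(
>     org.apache.hadoop.conf.Configuration conf) {
>   org.apache.hadoop.conf.Configuration copy =
>       new org.apache.hadoop.conf.Configuration(conf);
>   for (java.util.Map.Entry<String, String> e : conf) {
>     if (e.getKey().toLowerCase().contains("password")) {
>       copy.set(e.getKey(), "****");
>     }
>   }
>   return copy;
> }
> {code}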



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16334) Query lock contains the query string, which can cause OOM on ZooKeeper

2017-03-30 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949655#comment-15949655
 ] 

Eugene Koifman commented on HIVE-16334:
---

+1 
https://issues.apache.org/jira/browse/HIVE-16334?focusedCommentId=15949585&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15949585

> Query lock contains the query string, which can cause OOM on ZooKeeper
> --
>
> Key: HIVE-16334
> URL: https://issues.apache.org/jira/browse/HIVE-16334
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16334.patch
>
>
> When there is a big number of partitions in a query, this will result in a huge 
> number of locks on ZooKeeper. Since each lock object contains the whole query 
> string, this might cause serious memory pressure on the ZooKeeper services.
> It would be good to have the possibility to truncate the query strings that 
> are written into the locks.
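> A minimal sketch of the truncation (the length limit and the helper name are 
> assumptions for illustration):
> {code}
> private static final int MAX_QUERY_LENGTH = 1000; // e.g. from a new conf knob
> 
> static String truncateQueryForLock(String query) {
>   if (query == null || query.length() <= MAX_QUERY_LENGTH) {
>     return query;
>   }
>   return query.substring(0, MAX_QUERY_LENGTH) + "...";
> }
> {code}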



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15923) Hive default partition causes errors in get partitions

2017-03-30 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15923:

Attachment: HIVE-15923.03.patch

Got rid of it altogether; it's possible to just use isnull/isnotnull instead.

> Hive default partition causes errors in get partitions
> --
>
> Key: HIVE-15923
> URL: https://issues.apache.org/jira/browse/HIVE-15923
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-15923.01.patch, HIVE-15923.02.patch, 
> HIVE-15923.03.patch, HIVE-15923.patch
>
>
> This is the ORM error, direct SQL fails too before that, with a similar error.
> {noformat}
> 2017-02-14T17:45:11,158 ERROR [09fdd887-0164-4f55-97e9-4ba147d962be main] 
> metastore.ObjectStore:java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.plan.ExprNodeConstantDefaultDesc cannot be cast to 
> java.lang.Long
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:801)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:240)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan.evaluate(GenericUDFOPEqualOrGreaterThan.java:145)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBetween.evaluate(GenericUDFBetween.java:57)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:63)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.evaluateExprOnPart(PartExprEvalUtils.java:126)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16249) With column stats, mergejoin.q throws NPE

2017-03-30 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949640#comment-15949640
 ] 

Ashutosh Chauhan commented on HIVE-16249:
-

I see. true or false is indeed valid, but not any other literals. Can you 
modify your code to do:
{code}
if (literal.isAlwaysFalse()) {
  return 0.0;
} else if (literal.isAlwaysTrue()) {
  return 1.0;
} else {
  assert false;
}
{code}
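(For reference: {{isAlwaysTrue()}}/{{isAlwaysFalse()}} come from Calcite's 
{{RexNode}} API, so the two branches cover the boolean literals and the 
assertion catches any other literal that reaches this code path.)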

> With column stats, mergejoin.q throws NPE
> -
>
> Key: HIVE-16249
> URL: https://issues.apache.org/jira/browse/HIVE-16249
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16249.01.patch
>
>
> stack trace:
> {code}
> 2017-03-17T16:00:26,356 ERROR [3d512d4d-72b5-48fc-92cb-0c72f7c876e5 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.NullPointerException
> at 
> org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:719)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:132)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1866)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1739)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> at 
> org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1216)
>  ~[calcite-core-1.10.0.jar:1.10.0]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16335) Beeline user HS2 connection file should use /etc/hive/conf instead of /etc/conf/hive

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16335:
---
Affects Version/s: 2.2.0
   Status: Patch Available  (was: Open)

> Beeline user HS2 connection file should use /etc/hive/conf instead of 
> /etc/conf/hive
> 
>
> Key: HIVE-16335
> URL: https://issues.apache.org/jira/browse/HIVE-16335
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Tim Harsch
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16335.01.patch
>
>
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
> says:  
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/conf/hive in that 
> order.
> shouldn't it be?
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/hive/conf in that 
> order?
> Most distributions I've used have a /etc/hive/conf dir.
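> For context, the user connection file (beeline-hs2-connection.xml) placed under 
> ${HIVE_CONF_DIR} or /etc/hive/conf might look like this (property values are 
> illustrative):
> {code}
> <configuration>
>   <property>
>     <name>beeline.hs2.connection.user</name>
>     <value>hive</value>
>   </property>
>   <property>
>     <name>beeline.hs2.connection.hosts</name>
>     <value>localhost:10000</value>
>   </property>
> </configuration>
> {code}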



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16335) Beeline user HS2 connection file should use /etc/hive/conf instead of /etc/conf/hive

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16335:
---
Attachment: HIVE-16335.01.patch

Attaching a simple patch for this fix.

> Beeline user HS2 connection file should use /etc/hive/conf instead of 
> /etc/conf/hive
> 
>
> Key: HIVE-16335
> URL: https://issues.apache.org/jira/browse/HIVE-16335
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Tim Harsch
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16335.01.patch
>
>
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
> says:  
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/conf/hive in that 
> order.
> shouldn't it be?
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/hive/conf in that 
> order?
> Most distributions I've used have a /etc/hive/conf dir.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16335) Beeline user HS2 connection file should be use /etc/hive/conf instead of /etc/conf/hive

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16335:
---
Summary: Beeline user HS2 connection file should be use /etc/hive/conf 
instead of /etc/conf/hive  (was: documentation error in )

> Beeline user HS2 connection file should be use /etc/hive/conf instead of 
> /etc/conf/hive
> ---
>
> Key: HIVE-16335
> URL: https://issues.apache.org/jira/browse/HIVE-16335
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1
>Reporter: Tim Harsch
>Assignee: Vihang Karajgaonkar
>
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
> says:  
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/conf/hive in that 
> order.
> shouldn't it be?
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/hive/conf in that 
> order?
> Most distributions I've used have a /etc/hive/conf dir.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16335) Beeline user HS2 connection file should use /etc/hive/conf instead of /etc/conf/hive

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16335:
---
Summary: Beeline user HS2 connection file should use /etc/hive/conf instead 
of /etc/conf/hive  (was: Beeline user HS2 connection file should be use 
/etc/hive/conf instead of /etc/conf/hive)

> Beeline user HS2 connection file should use /etc/hive/conf instead of 
> /etc/conf/hive
> 
>
> Key: HIVE-16335
> URL: https://issues.apache.org/jira/browse/HIVE-16335
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1
>Reporter: Tim Harsch
>Assignee: Vihang Karajgaonkar
>
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
> says:  
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/conf/hive in that 
> order.
> shouldn't it be?
> BeeLine looks for it in ${HIVE_CONF_DIR} location and /etc/hive/conf in that 
> order?
> Most distributions I've used have a /etc/hive/conf dir.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16334) Query lock contains the query string, which can cause OOM on ZooKeeper

2017-03-30 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949585#comment-15949585
 ] 

Sahil Takiar commented on HIVE-16334:
-

Why does the query string need to be written to ZK in the first place? Wouldn't 
it be sufficient to just use some query-id?

> Query lock contains the query string, which can cause OOM on ZooKeeper
> --
>
> Key: HIVE-16334
> URL: https://issues.apache.org/jira/browse/HIVE-16334
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16334.patch
>
>
> When there is a big number of partitions in a query, this will result in a 
> huge number of locks on ZooKeeper. Since the query object contains the whole 
> query string, this might cause serious memory pressure on the ZooKeeper 
> services.
> It would be good to have the possibility to truncate the query strings that 
> are written into the locks.
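
A minimal sketch of the proposed mitigation, truncating the string before it is 
embedded in the lock data; the helper and the length limit are hypothetical, not 
the attached patch's API:

{code}
// Illustrative sketch only: cap the query string stored in each ZooKeeper
// lock node so thousands of per-partition locks stay small.
public final class LockDataUtil {
  private LockDataUtil() {}

  public static String truncateForLock(String query, int maxChars) {
    if (query == null || query.length() <= maxChars) {
      return query;
    }
    return query.substring(0, maxChars) + "...";
  }
}
{code}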



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16334) Query lock contains the query string, which can cause OOM on ZooKeeper

2017-03-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949586#comment-15949586
 ] 

Hive QA commented on HIVE-16334:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12861288/HIVE-16334.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10541 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=172)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4468/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4468/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4468/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12861288 - PreCommit-HIVE-Build

> Query lock contains the query string, which can cause OOM on ZooKeeper
> --
>
> Key: HIVE-16334
> URL: https://issues.apache.org/jira/browse/HIVE-16334
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16334.patch
>
>
> When there is a big number of partitions in a query, this will result in a 
> huge number of locks on ZooKeeper. Since the query object contains the whole 
> query string, this might cause serious memory pressure on the ZooKeeper 
> services.
> It would be good to have the possibility to truncate the query strings that 
> are written into the locks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16329) TopN: TopNHash totalFreeMemory computation ignores free memory

2017-03-30 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949568#comment-15949568
 ] 

Gopal V commented on HIVE-16329:


[~sershe]: That's a good idea, because this currently breaks in hybrid mode.

> TopN: TopNHash totalFreeMemory computation ignores free memory
> --
>
> Key: HIVE-16329
> URL: https://issues.apache.org/jira/browse/HIVE-16329
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16329.1.patch
>
>
> {code}
>   // TODO: For LLAP, assumption is off-heap cache.
>   final long memoryUsedPerExecutor = (memoryMXBean.getHeapMemoryUsage().getUsed() / numExecutors);
>   // this is total free memory available per executor in case of LLAP
>   totalFreeMemory = conf.getMaxMemoryAvailable() - memoryUsedPerExecutor;
> {code}
> {code}
> exec.TopNHash: isTez parameters -615768144 = 5312782848 - 71142611912 / 12
> {code}
> This turns off the TopNHash entirely causing something trivial like 
> {code}
> select c_custkey, count(1) from customer group by c_custkey limit 10;
> {code}
> To shuffle 30M rows instead of 10.
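
Plugging the logged numbers into the quoted formula shows why the hash is 
disabled: the used heap of the whole daemon is divided by the executor count and 
subtracted from the per-executor budget, which can easily go negative. A minimal 
sketch reproducing the logged arithmetic (the numbers come from the log line 
above):

{code}
// Reproduces the logged computation: -615768144 = 5312782848 - 71142611912 / 12
public class TopNMemoryRepro {
  public static void main(String[] args) {
    long maxMemoryAvailable = 5312782848L;  // per-executor budget from the log
    long heapUsed = 71142611912L;           // daemon-wide used heap from the log
    int numExecutors = 12;

    long memoryUsedPerExecutor = heapUsed / numExecutors;             // 5928550992
    long totalFreeMemory = maxMemoryAvailable - memoryUsedPerExecutor;
    System.out.println(totalFreeMemory);    // -615768144, so TopNHash is disabled
  }
}
{code}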



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16329) TopN: TopNHash totalFreeMemory computation ignores free memory

2017-03-30 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949551#comment-15949551
 ] 

Sergey Shelukhin commented on HIVE-16329:
-

On the patch itself - where the isLlap flag is used, should it access it via 
the LlapDaemonInfo static, and not from wherever it gets it now?

> TopN: TopNHash totalFreeMemory computation ignores free memory
> --
>
> Key: HIVE-16329
> URL: https://issues.apache.org/jira/browse/HIVE-16329
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16329.1.patch
>
>
> {code}
>   // TODO: For LLAP, assumption is off-heap cache.
>   final long memoryUsedPerExecutor = (memoryMXBean.getHeapMemoryUsage().getUsed() / numExecutors);
>   // this is total free memory available per executor in case of LLAP
>   totalFreeMemory = conf.getMaxMemoryAvailable() - memoryUsedPerExecutor;
> {code}
> {code}
> exec.TopNHash: isTez parameters -615768144 = 5312782848 - 71142611912 / 12
> {code}
> This turns off the TopNHash entirely causing something trivial like 
> {code}
> select c_custkey, count(1) from customer group by c_custkey limit 10;
> {code}
> To shuffle 30M rows instead of 10.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16329) TopN: TopNHash totalFreeMemory computation ignores free memory

2017-03-30 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949545#comment-15949545
 ] 

Sergey Shelukhin commented on HIVE-16329:
-

Seems related:
{noformat}
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:354)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:184)
... 15 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.llap.LlapDaemonInfo.getNumExecutors(LlapDaemonInfo.java:59)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:407)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:366)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:556)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:508)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:556)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:508)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:501)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:317)

> TopN: TopNHash totalFreeMemory computation ignores free memory
> --
>
> Key: HIVE-16329
> URL: https://issues.apache.org/jira/browse/HIVE-16329
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16329.1.patch
>
>
> {code}
>   // TODO: For LLAP, assumption is off-heap cache.
>   final long memoryUsedPerExecutor = (memoryMXBean.getHeapMemoryUsage().getUsed() / numExecutors);
>   // this is total free memory available per executor in case of LLAP
>   totalFreeMemory = conf.getMaxMemoryAvailable() - memoryUsedPerExecutor;
> {code}
> {code}
> exec.TopNHash: isTez parameters -615768144 = 5312782848 - 71142611912 / 12
> {code}
> This turns off the TopNHash entirely causing something trivial like 
> {code}
> select c_custkey, count(1) from customer group by c_custkey limit 10;
> {code}
> To shuffle 30M rows instead of 10.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16301) Preparing for 2.3 development.

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16301:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Preparing for 2.3 development.
> --
>
> Key: HIVE-16301
> URL: https://issues.apache.org/jira/browse/HIVE-16301
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Naveen Gangam
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-16301.2.patch, HIVE-16301.3.patch, HIVE-16301.patch
>
>
> branch-2 is now being used for 2.3.0 development. The build files will need 
> to reflect this change.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16301) Preparing for 2.3 development.

2017-03-30 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949546#comment-15949546
 ] 

Pengcheng Xiong commented on HIVE-16301:


Pushed to master and 2.3. Thanks [~ngangam] for the work!

> Preparing for 2.3 development.
> --
>
> Key: HIVE-16301
> URL: https://issues.apache.org/jira/browse/HIVE-16301
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Naveen Gangam
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-16301.2.patch, HIVE-16301.3.patch, HIVE-16301.patch
>
>
> branch-2 is now being used for 2.3.0 development. The build files will need 
> to reflect this change.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16301) Preparing for 2.3 development.

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16301:
---
Fix Version/s: 2.3.0

> Preparing for 2.3 development.
> --
>
> Key: HIVE-16301
> URL: https://issues.apache.org/jira/browse/HIVE-16301
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Naveen Gangam
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-16301.2.patch, HIVE-16301.3.patch, HIVE-16301.patch
>
>
> branch-2 is now being used for 2.3.0 development. The build files will need 
> to reflect this change.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16297) Improving hive logging configuration variables

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16297:
---
Attachment: HIVE-16297.02.patch

All good suggestions, [~pvary]. Incorporated most of them, except that we 
cannot re-use HiveConfUtil.dumpConfiguration() in RemoteHiveSparkClient and 
FileSinkOperator, because they dump a JobConf object, not a HiveConf object.

> Improving hive logging configuration variables
> --
>
> Key: HIVE-16297
> URL: https://issues.apache.org/jira/browse/HIVE-16297
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16297.01.patch, HIVE-16297.02.patch
>
>
> There are a few places in the source code where we use 
> {{Configuration.dumpConfiguration()}}. We should preprocess the configuration 
> properties before dumping them into the logs.
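
A minimal sketch of such preprocessing, redacting likely-sensitive entries on a 
copy of the Configuration before dumping it; the key pattern is illustrative, 
not the patch's actual filter:

{code}
import java.io.IOException;
import java.io.StringWriter;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;

// Illustrative sketch only: mask credential-like values, then dump the copy.
public class SafeConfDump {
  public static String dump(Configuration conf) throws IOException {
    Configuration copy = new Configuration(conf);
    for (Map.Entry<String, String> entry : conf) {
      if (entry.getKey().toLowerCase().contains("password")) {
        copy.set(entry.getKey(), "******");
      }
    }
    StringWriter out = new StringWriter();
    Configuration.dumpConfiguration(copy, out);
    return out.toString();
  }
}
{code}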



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16301) Preparing for 2.3 development.

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16301:
---
Summary: Preparing for 2.3 development.  (was: Prepare branch-2 for 2.3 
development.)

> Preparing for 2.3 development.
> --
>
> Key: HIVE-16301
> URL: https://issues.apache.org/jira/browse/HIVE-16301
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Naveen Gangam
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Attachments: HIVE-16301.2.patch, HIVE-16301.3.patch, HIVE-16301.patch
>
>
> branch-2 is now being used for 2.3.0 development. The build files will need 
> to reflect this change.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16301) Prepare branch-2 for 2.3 development.

2017-03-30 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-16301:
--

Assignee: Pengcheng Xiong  (was: Naveen Gangam)

> Prepare branch-2 for 2.3 development.
> -
>
> Key: HIVE-16301
> URL: https://issues.apache.org/jira/browse/HIVE-16301
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
>Reporter: Naveen Gangam
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Attachments: HIVE-16301.2.patch, HIVE-16301.3.patch, HIVE-16301.patch
>
>
> branch-2 is now being used for 2.3.0 development. The build files will need 
> to reflect this change.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15880) Allow insert overwrite and truncate table query to use auto.purge table property

2017-03-30 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949540#comment-15949540
 ] 

Vihang Karajgaonkar commented on HIVE-15880:


Hi [~ctang.ma], can you please review? Thanks!

> Allow insert overwrite and truncate table query to use auto.purge table 
> property
> 
>
> Key: HIVE-15880
> URL: https://issues.apache.org/jira/browse/HIVE-15880
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15880.01.patch, HIVE-15880.02.patch, 
> HIVE-15880.03.patch, HIVE-15880.04.patch, HIVE-15880.05.patch, 
> HIVE-15880.06.patch
>
>
> It seems inconsistent that the auto.purge property is not considered when we 
> do an INSERT OVERWRITE, while it is when we do a DROP TABLE.
> Drop table doesn't move table data to Trash when auto.purge is set to true
> {noformat}
> > create table temp(col1 string, col2 string);
> No rows affected (0.064 seconds)
> > alter table temp set tblproperties('auto.purge'='true');
> No rows affected (0.083 seconds)
> > insert into temp values ('test', 'test'), ('test2', 'test2');
> No rows affected (25.473 seconds)
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 22 2017-02-09 13:03 
> /user/hive/warehouse/temp/00_0
> #
> > drop table temp;
> No rows affected (0.242 seconds)
> # hdfs dfs -ls /user/hive/warehouse/temp
> ls: `/user/hive/warehouse/temp': No such file or directory
> #
> # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse
> #
> {noformat}
> INSERT OVERWRITE query moves the table data to Trash even when auto.purge is 
> set to true
> {noformat}
> > create table temp(col1 string, col2 string);
> > alter table temp set tblproperties('auto.purge'='true');
> > insert into temp values ('test', 'test'), ('test2', 'test2');
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 22 2017-02-09 13:07 
> /user/hive/warehouse/temp/00_0
> #
> > insert overwrite table temp select * from dummy;
> # hdfs dfs -ls /user/hive/warehouse/temp
> Found 1 items
> -rwxrwxrwt   3 hive hive 26 2017-02-09 13:08 
> /user/hive/warehouse/temp/00_0
> # sudo -u hive hdfs dfs -ls /user/hive/.Trash/Current/user/hive/warehouse
> Found 1 items
> drwx--   - hive hive  0 2017-02-09 13:08 
> /user/hive/.Trash/Current/user/hive/warehouse/temp
> #
> {noformat}
> While move operations are not very costly on HDFS, they can add significant 
> overhead on slow FileSystems like S3. Honoring the property could improve the 
> performance of {{INSERT OVERWRITE TABLE}} queries, especially when there is a 
> large number of partitions on tables located on S3, should the user wish to 
> set the auto.purge property to true.
> Similarly, a {{TRUNCATE TABLE}} query on a table with the {{auto.purge}} 
> property set to true should not move the data to Trash.
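
A minimal sketch of the decision this implies: consult the table's auto.purge 
property before choosing between a Trash move and an outright delete (the 
helper below is hypothetical, not the patch's API):

{code}
import java.util.Map;

// Illustrative sketch only: overwrite/truncate should skip the Trash move
// when the table opts in via auto.purge=true.
public class PurgeDecision {
  public static boolean shouldSkipTrash(Map<String, String> tableProps) {
    return "true".equalsIgnoreCase(tableProps.get("auto.purge"));
  }
}
{code}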



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

