[jira] [Commented] (HIVE-16518) Insert override for druid does not replace all existing segments

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988276#comment-15988276
 ] 

Hive QA commented on HIVE-16518:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865141/HIVE-16518.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10636 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4909/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4909/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4909/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865141 - PreCommit-HIVE-Build

> Insert override for druid does not replace all existing segments
> 
>
> Key: HIVE-16518
> URL: https://issues.apache.org/jira/browse/HIVE-16518
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
> Fix For: 3.0.0
>
> Attachments: HIVE-16518.patch
>
>
> Insert override for Druid does not replace segments for all intervals. 
> It just replaces segments for the intervals which are newly ingested. 
> INSERT OVERRIDE TABLE statement on DruidStorageHandler should override all 
> existing segments for the table. 
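
The difference between the reported and the expected behaviour can be sketched as follows. This is a toy model, not code from the patch: the `Segment` class and both function names are hypothetical, and segments are reduced to ISO-date intervals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a Druid segment, reduced to its time interval."""
    start: str  # ISO date, inclusive
    end: str    # ISO date, exclusive


def overlaps(a, b):
    # ISO dates compare correctly as plain strings.
    return a.start < b.end and b.start < a.end


def replaced_as_reported(existing, ingested):
    """Reported behaviour: only segments whose interval is re-ingested are replaced."""
    return [s for s in existing if any(overlaps(s, n) for n in ingested)]


def replaced_as_expected(existing, ingested):
    """Expected behaviour: an overwrite replaces every existing segment of the table."""
    return list(existing)


existing = [Segment("2017-01-01", "2017-01-02"), Segment("2017-01-02", "2017-01-03")]
ingested = [Segment("2017-01-02", "2017-01-03")]

# Segments that survive the overwrite under the reported behaviour but should not.
stale = [s for s in replaced_as_expected(existing, ingested)
         if s not in replaced_as_reported(existing, ingested)]
```

In this model, `stale` holds the segment for 2017-01-01, which is the data the issue says is left behind.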



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2017-04-27 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16484:

Attachment: HIVE-16484.6.patch

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch, HIVE-16484.2.patch, 
> HIVE-16484.3.patch, HIVE-16484.4.patch, HIVE-16484.5.patch, HIVE-16484.6.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS --> lower startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners





[jira] [Commented] (HIVE-16513) width_bucket issues

2017-04-27 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988247#comment-15988247
 ] 

Sahil Takiar commented on HIVE-16513:
-

Updated patch, addressed comments on the RB, some significant changes to the 
approach:

* Using {{FunctionRegistry.getCommonClassForComparison}} to find a common 
{{TypeInfo}} for {{minValue}}, {{maxValue}}, {{expr}} so that they can all be 
compared properly
** Once the proper {{TypeInfo}} is determined, 
{{ObjectInspectorConverters.getConverter}} is used to convert each argument to 
the correct type
** A {{switch...case}} statement is then used based on the common {{TypeInfo}}
* I changed the approach of the core algorithm for calculating the correct 
bucket. The old algorithm relied on the modulus operator, which had issues when 
run against doubles; the new algorithm is based on the approach taken by 
Postgres
* Added a lot more tests to the qfile
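
For readers unfamiliar with the Postgres approach mentioned above, here is a minimal sketch of that bucket calculation. It is not the patch itself: it follows the PostgreSQL `width_bucket` semantics as documented, written in Python for brevity, and raising an error on equal bounds is an assumption about how a divide-by-zero would be avoided.

```python
import math


def width_bucket(expr, min_value, max_value, num_buckets):
    """Return the 1-based bucket of expr in an equi-width histogram.

    Values below the range map to 0, values at or above the upper bound
    map to num_buckets + 1 (PostgreSQL semantics).
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    if min_value == max_value:
        # Assumption: reject equal bounds instead of dividing by zero.
        raise ValueError("min_value and max_value must differ")

    if min_value < max_value:
        if expr < min_value:
            return 0
        if expr >= max_value:
            return num_buckets + 1
        fraction = (expr - min_value) / (max_value - min_value)
    else:
        # Descending range: buckets are counted from min_value downwards.
        if expr > min_value:
            return 0
        if expr <= max_value:
            return num_buckets + 1
        fraction = (min_value - expr) / (min_value - max_value)

    # No modulus operator involved, so doubles behave the same as integers.
    return int(math.floor(num_buckets * fraction)) + 1
```

For example, `width_bucket(5.35, 0.024, 10.06, 5)` falls into bucket 3.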

> width_bucket issues
> ---
>
> Key: HIVE-16513
> URL: https://issues.apache.org/jira/browse/HIVE-16513
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-16513.1.patch, HIVE-16513.2.patch
>
>
> width_bucket was recently added with HIVE-15982. This ticket notes a few 
> issues.
> Usability issue:
> Currently only accepts integral numeric types. Decimals, floats and doubles 
> are not supported.
> Runtime failures: This query will cause a runtime divide-by-zero in the 
> reduce stage.
> select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1;
> The divide-by-zero seems to trigger any time I use a group-by. Here's another 
> example (that actually requires the group-by):
> select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1;
> Advanced Usage Issues:
> Suppose you have a table e011_01 as follows:
> create table e011_01 (c1 integer, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> Compile-time problems:
> You cannot use simple case expressions, searched case expressions or grouping 
> sets. These queries fail:
> select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as 
> integer)) from e011_02 group by cube(c1, c2);
> I'll admit the grouping one is pretty contrived but the case ones seem 
> straightforward, valid, and it's strange that they don't work. Similar 
> queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe 
> [~ashutoshc] can lend some perspective on that?
> Interestingly, you can use window functions in width_bucket, example:
> select width_bucket(rank() over (order by c2), 0, 10, 10) from e011_01;
> works just fine. Hopefully we can get to a place where people implementing 
> functions like this don't need to think about value expression support but we 
> don't seem to be there yet.





[jira] [Updated] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16488:

Status: Patch Available  (was: In Progress)

> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
> Attachments: HIVE-16488.01.patch, HIVE-16488.02.patch
>
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.





[jira] [Updated] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16488:

Attachment: HIVE-16488.02.patch

Added 02.patch with updates in test code to expect failure of REPL LOAD command 
execution if the DB is not empty.
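
The check this implies can be sketched as below. The function name and arguments are hypothetical, and treating "empty" as "no tables and no functions" is an assumption; the patch may define it differently.

```python
def can_replicate_into(db_exists, table_names, function_names):
    """Hypothetical pre-check before REPL LOAD into a target database."""
    if not db_exists:
        # Nothing on the destination yet, so the load can create the db.
        return True
    # An existing db is acceptable only if it is empty (assumed definition).
    return not table_names and not function_names
```

With this rule, loading into a freshly created or auto-created `default` db succeeds, while a db that already holds a table is rejected.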

> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
> Attachments: HIVE-16488.01.patch, HIVE-16488.02.patch
>
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.





[jira] [Updated] (HIVE-16513) width_bucket issues

2017-04-27 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16513:

Attachment: HIVE-16513.2.patch

> width_bucket issues
> ---
>
> Key: HIVE-16513
> URL: https://issues.apache.org/jira/browse/HIVE-16513
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-16513.1.patch, HIVE-16513.2.patch
>
>
> width_bucket was recently added with HIVE-15982. This ticket notes a few 
> issues.
> Usability issue:
> Currently only accepts integral numeric types. Decimals, floats and doubles 
> are not supported.
> Runtime failures: This query will cause a runtime divide-by-zero in the 
> reduce stage.
> select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1;
> The divide-by-zero seems to trigger any time I use a group-by. Here's another 
> example (that actually requires the group-by):
> select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1;
> Advanced Usage Issues:
> Suppose you have a table e011_01 as follows:
> create table e011_01 (c1 integer, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> Compile-time problems:
> You cannot use simple case expressions, searched case expressions or grouping 
> sets. These queries fail:
> select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as 
> integer)) from e011_02 group by cube(c1, c2);
> I'll admit the grouping one is pretty contrived but the case ones seem 
> straightforward, valid, and it's strange that they don't work. Similar 
> queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe 
> [~ashutoshc] can lend some perspective on that?
> Interestingly, you can use window functions in width_bucket, example:
> select width_bucket(rank() over (order by c2), 0, 10, 10) from e011_01;
> works just fine. Hopefully we can get to a place where people implementing 
> functions like this don't need to think about value expression support but we 
> don't seem to be there yet.





[jira] [Updated] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16488:

Affects Version/s: (was: 2.2.0)
   2.1.0

> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
> Attachments: HIVE-16488.01.patch
>
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.





[jira] [Work started] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16488 started by Sankar Hariappan.
---
> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
> Attachments: HIVE-16488.01.patch
>
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.





[jira] [Updated] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16488:

Status: Open  (was: Patch Available)

> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
> Attachments: HIVE-16488.01.patch
>
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.





[jira] [Commented] (HIVE-16518) Insert override for druid does not replace all existing segments

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988223#comment-15988223
 ] 

Hive QA commented on HIVE-16518:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865141/HIVE-16518.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10633 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[char_2] (batchId=11)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning (batchId=284)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4908/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4908/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4908/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865141 - PreCommit-HIVE-Build

> Insert override for druid does not replace all existing segments
> 
>
> Key: HIVE-16518
> URL: https://issues.apache.org/jira/browse/HIVE-16518
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
> Fix For: 3.0.0
>
> Attachments: HIVE-16518.patch
>
>
> Insert override for Druid does not replace segments for all intervals. 
> It just replaces segments for the intervals which are newly ingested. 
> INSERT OVERRIDE TABLE statement on DruidStorageHandler should override all 
> existing segments for the table. 





[jira] [Updated] (HIVE-15642) Replicate Insert Overwrites, Dynamic Partition Inserts and Loads

2017-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-15642:

Status: Open  (was: Patch Available)

> Replicate Insert Overwrites, Dynamic Partition Inserts and Loads
> 
>
> Key: HIVE-15642
> URL: https://issues.apache.org/jira/browse/HIVE-15642
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Vaibhav Gumashta
>Assignee: Sankar Hariappan
> Attachments: HIVE-15642.1.patch
>
>
> 1. Insert Overwrites to a new partition should not capture new files as part 
> of insert event but instead use the subsequent add partition event to capture 
> the files + checksums.
> 2. Insert Overwrites to an existing partition should capture new files as 
> part of the insert event. 
> Similar behaviour for DP inserts and loads.
> This will need changes from HIVE-15478
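
The two rules above amount to a single decision on whether the target partition already existed. A minimal sketch, with hypothetical event labels that are not taken from the Hive code:

```python
def event_capturing_files(partition_existed):
    """Return which notification event should carry the new files and checksums.

    The labels "INSERT" and "ADD_PARTITION" are illustrative names only.
    """
    if partition_existed:
        # Rule 2: overwrite of an existing partition, files go with the insert event.
        return "INSERT"
    # Rule 1: new partition, defer to the subsequent add partition event.
    return "ADD_PARTITION"
```

The same decision would apply to dynamic partition inserts and loads, per the description.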





[jira] [Assigned] (HIVE-15642) Replicate Insert Overwrites, Dynamic Partition Inserts and Loads

2017-04-27 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-15642:
---

Assignee: Sankar Hariappan  (was: Vaibhav Gumashta)

> Replicate Insert Overwrites, Dynamic Partition Inserts and Loads
> 
>
> Key: HIVE-15642
> URL: https://issues.apache.org/jira/browse/HIVE-15642
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Vaibhav Gumashta
>Assignee: Sankar Hariappan
> Attachments: HIVE-15642.1.patch
>
>
> 1. Insert Overwrites to a new partition should not capture new files as part 
> of insert event but instead use the subsequent add partition event to capture 
> the files + checksums.
> 2. Insert Overwrites to an existing partition should capture new files as 
> part of the insert event. 
> Similar behaviour for DP inserts and loads.
> This will need changes from HIVE-15478





[jira] [Commented] (HIVE-15642) Replicate Insert Overwrites, Dynamic Partition Inserts and Loads

2017-04-27 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988217#comment-15988217
 ] 

Sankar Hariappan commented on HIVE-15642:
-

[~vgumashta], I'll take up this JIRA to rebase the patch against the master 
code and will add test cases.
cc [~sushanth], [~thejas]

> Replicate Insert Overwrites, Dynamic Partition Inserts and Loads
> 
>
> Key: HIVE-15642
> URL: https://issues.apache.org/jira/browse/HIVE-15642
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-15642.1.patch
>
>
> 1. Insert Overwrites to a new partition should not capture new files as part 
> of insert event but instead use the subsequent add partition event to capture 
> the files + checksums.
> 2. Insert Overwrites to an existing partition should capture new files as 
> part of the insert event. 
> Similar behaviour for DP inserts and loads.
> This will need changes from HIVE-15478





[jira] [Updated] (HIVE-16546) LLAP: Fail map join tasks if hash table memory exceeds threshold

2017-04-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16546:
-
Attachment: (was: HIVE-16546.2.patch)

> LLAP: Fail map join tasks if hash table memory exceeds threshold
> 
>
> Key: HIVE-16546
> URL: https://issues.apache.org/jira/browse/HIVE-16546
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16546.1.patch, HIVE-16546.2.patch, 
> HIVE-16546.WIP.patch
>
>
> When a map join task is running in LLAP, it can potentially use a lot more 
> memory than its limit, which could be the memory per executor or the 
> no-conditional-task size. If it uses more memory, it can adversely affect the 
> performance of other queries or even bring down the daemon. In such cases, it 
> is better to fail the query than to bring down the daemon. 
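
The idea can be sketched as a check performed while the hash table is being loaded. This is an illustration only: the exception class, function names, and the per-row size estimates are all hypothetical, and the actual patch may measure memory differently.

```python
class MapJoinMemoryExhaustionError(RuntimeError):
    """Hypothetical error used to fail the task rather than the daemon."""


def check_hash_table_memory(used_bytes, limit_bytes):
    # limit_bytes stands for whichever limit applies (for example the memory
    # per executor); how it is chosen is outside this sketch.
    if used_bytes > limit_bytes:
        raise MapJoinMemoryExhaustionError(
            f"hash table uses {used_bytes} bytes, limit is {limit_bytes} bytes"
        )


def load_hash_table(row_sizes, limit_bytes):
    """Accumulate estimated row sizes, checking the limit after every row."""
    used = 0
    for size in row_sizes:
        used += size
        check_hash_table_memory(used, limit_bytes)
    return used
```

Failing inside the loop means the task stops as soon as the limit is crossed, before the whole table has been materialised.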





[jira] [Updated] (HIVE-16546) LLAP: Fail map join tasks if hash table memory exceeds threshold

2017-04-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16546:
-
Attachment: HIVE-16546.2.patch

> LLAP: Fail map join tasks if hash table memory exceeds threshold
> 
>
> Key: HIVE-16546
> URL: https://issues.apache.org/jira/browse/HIVE-16546
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16546.1.patch, HIVE-16546.2.patch, 
> HIVE-16546.WIP.patch
>
>
> When a map join task is running in LLAP, it can potentially use a lot more 
> memory than its limit, which could be the memory per executor or the 
> no-conditional-task size. If it uses more memory, it can adversely affect the 
> performance of other queries or even bring down the daemon. In such cases, it 
> is better to fail the query than to bring down the daemon. 





[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled

2017-04-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988193#comment-15988193
 ] 

Xuefu Zhang commented on HIVE-15997:


[~ychena], thanks for the explanation. I wasn't questioning the code change. 
Rather, I was trying to understand whether the solution is complete. That is, 
is it still possible to leak resources with the patch when a query is 
cancelled? You seemed to suggest that the second check is nice to have, but I 
would really like to know which part of the code change fixes the root cause of 
the resource leak problem. Any further thoughts? Thanks.

> Resource leaks when query is cancelled 
> ---
>
> Key: HIVE-15997
> URL: https://issues.apache.org/jira/browse/HIVE-15997
> Project: Hive
>  Issue Type: Bug
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 2.2.0
>
> Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible files and folder leak: 
> {noformat} 
> 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: "ychencdh511t-1.vpc.cloudera.com":8020; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1476) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1409) 
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) 
> at com.sun.proxy.$Proxy25.delete(Unknown Source) 
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) 
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) 
> at com.sun.proxy.$Proxy26.delete(Unknown Source) 
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) 
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675) 
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671) 
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
> at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671) 
> at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405) 
> at org.apache.hadoop.hive.ql.Context.clear(Context.java:541) 
> at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109) 
> at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) 
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237) 
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) 
> at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796) 
> at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306) 
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
> at java.lang.Thread.run(Thread.java:745) 
> Caused by: java.nio.channels.ClosedByInterruptException 
> at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) 
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681) 
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) 
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) 
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) 
> at org.apache.hadoop.ipc.Client$Connectio

[jira] [Commented] (HIVE-16548) LLAP: EncodedReaderImpl.addOneCompressionBuffer throws NPE

2017-04-27 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988194#comment-15988194
 ] 

Rajesh Balamohan commented on HIVE-16548:
-

This happened as part of a regular data population (create table blah ..select) 
script. Only one query was executed and this happened in a couple of queries. 
The log file is too large to be uploaded here; I will check on trimming it.

> LLAP: EncodedReaderImpl.addOneCompressionBuffer throws NPE
> --
>
> Key: HIVE-16548
> URL: https://issues.apache.org/jira/browse/HIVE-16548
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Rajesh Balamohan
>
> Env: Based on apr-25 apache master codebase.
> {noformat}
> Caused by: java.io.IOException: java.lang.IllegalArgumentException: Buffer size too small. size = 65536 needed = 3762509
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:695)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:454)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:420)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:242)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:239)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:239)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> Caused by: java.lang.IllegalArgumentException: Buffer size too small. size = 65536 needed = 3762509
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.addOneCompressionBuffer(EncodedReaderImpl.java:1223)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.prepareRangesForCompressedRead(EncodedReaderImpl.java:813)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:685)
> ... 15 more
> Caused by: java.io.IOException: java.io.IOException: java.lang.NullPointerException
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62)
> ... 17 more
> Caused by: java.io.IOException: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:695)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:454)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:420)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:242)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:239)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:239)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> Caused by: java.lang.NullPointerException
> at 

[jira] [Updated] (HIVE-16171) Support replication of truncate table

2017-04-27 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-16171:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master.

> Support replication of truncate table
> -
>
> Key: HIVE-16171
> URL: https://issues.apache.org/jira/browse/HIVE-16171
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR
> Fix For: 3.0.0
>
> Attachments: HIVE-16171.01.patch, HIVE-16171.02.patch, 
> HIVE-16171.03.patch, HIVE-16171.04.patch, HIVE-16171.05.patch, 
> HIVE-16171.06.patch, HIVE-16171.07.patch
>
>
> Need to support truncate table for replication. Key points to note:
> 1. For a non-partitioned table, truncate table removes all rows from the 
> table.
> 2. For partitioned tables, need to consider how truncate behaves when it 
> targets a single partition versus the whole table.
> 3. Bootstrap load with truncate table must work, as it is just 
> loadTable/loadPartition with an empty dataset.
> 4. It is suggested to re-use the alter table/alter partition events to handle 
> truncate.
> 5. Need to consider the case where an insert event happens before truncate 
> table and still needs to see its data files through change management. The 
> data files should be recycled to the cmroot path before they are trashed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16171) Support replication of truncate table

2017-04-27 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988182#comment-15988182
 ] 

Sushanth Sowmyan commented on HIVE-16171:
-

Ignoring the test failures as irrelevant.

+1, Patch looks good to me. Thanks for the many updates, Sankar.

> Support replication of truncate table
> -
>
> Key: HIVE-16171
> URL: https://issues.apache.org/jira/browse/HIVE-16171
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR
> Attachments: HIVE-16171.01.patch, HIVE-16171.02.patch, 
> HIVE-16171.03.patch, HIVE-16171.04.patch, HIVE-16171.05.patch, 
> HIVE-16171.06.patch, HIVE-16171.07.patch
>
>
> Need to support truncate table for replication. Key points to note:
> 1. For a non-partitioned table, truncate table removes all rows from the 
> table.
> 2. For partitioned tables, need to consider how truncate behaves when it 
> targets a single partition versus the whole table.
> 3. Bootstrap load with truncate table must work, as it is just 
> loadTable/loadPartition with an empty dataset.
> 4. It is suggested to re-use the alter table/alter partition events to handle 
> truncate.
> 5. Need to consider the case where an insert event happens before truncate 
> table and still needs to see its data files through change management. The 
> data files should be recycled to the cmroot path before they are trashed.





[jira] [Commented] (HIVE-16546) LLAP: Fail map join tasks if hash table memory exceeds threshold

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988180#comment-15988180
 ] 

Hive QA commented on HIVE-16546:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865303/HIVE-16546.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 94 failed/errored test(s), 10632 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_partition_pruning_2]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join0] 
(batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join1] 
(batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join29]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join30]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join_filters]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join_nulls]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_10]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_11]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_12]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_13]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_14]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_16]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_1]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_3]
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_4]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_5]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_7]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_8]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_9]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_6]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_7]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer1]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer3]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer4]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer6]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_partition_pruning]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[empty_join] 
(batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_1]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[identity_project_remove_skip]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join32_lessSize]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_partitioned]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.tes

[jira] [Commented] (HIVE-16553) Change default value for hive.tez.bigtable.minsize.semijoin.reduction

2017-04-27 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988144#comment-15988144
 ] 

Gopal V commented on HIVE-16553:


Heh, +1

> Change default value for hive.tez.bigtable.minsize.semijoin.reduction
> -
>
> Key: HIVE-16553
> URL: https://issues.apache.org/jira/browse/HIVE-16553
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16553.1.patch
>
>
> The current value is 1M rows. We would like to bump this up to make sure we 
> are not creating semijoin optimizations on dimension tables, since having too 
> many semijoin optimizations can cause serialized execution of tasks if lots of 
> tasks are waiting for semijoin optimizations to be computed.
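
The threshold described above can be illustrated with a small sketch. This is not Hive's planner code: the class and method names, the inclusive comparison, and every row count other than the 1M value quoted in the issue are assumptions made for illustration.

```java
// Hypothetical sketch: names and the inclusive comparison are assumptions,
// not Hive's actual implementation.
final class SemijoinThresholdSketch {

  /**
   * Returns true only when the table that would be filtered (the "big table"
   * in the setting's name) has at least the configured minimum number of rows.
   */
  static boolean shouldCreateSemijoin(long filteredTableRows, long minBigTableRows) {
    return filteredTableRows >= minBigTableRows;
  }

  public static void main(String[] args) {
    long currentMin = 1_000_000L;         // the 1M-row value mentioned in the issue
    long raisedMin = 50_000_000L;         // made-up larger value, for illustration only
    long dimensionTableRows = 2_000_000L; // made-up dimension table size

    System.out.println(shouldCreateSemijoin(dimensionTableRows, currentMin));
    System.out.println(shouldCreateSemijoin(dimensionTableRows, raisedMin));
  }
}
```

With the 1M threshold the made-up 2M-row dimension table still qualifies for a semijoin optimization; with the larger made-up threshold it does not, which is the effect the issue asks for.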





[jira] [Updated] (HIVE-16553) Change default value for hive.tez.bigtable.minsize.semijoin.reduction

2017-04-27 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16553:
--
Status: Patch Available  (was: Open)

> Change default value for hive.tez.bigtable.minsize.semijoin.reduction
> -
>
> Key: HIVE-16553
> URL: https://issues.apache.org/jira/browse/HIVE-16553
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16553.1.patch
>
>
> The current value is 1M rows. We would like to bump this up to make sure we 
> are not creating semijoin optimizations on dimension tables, since having too 
> many semijoin optimizations can cause serialized execution of tasks if lots of 
> tasks are waiting for semijoin optimizations to be computed.





[jira] [Updated] (HIVE-16553) Change default value for hive.tez.bigtable.minsize.semijoin.reduction

2017-04-27 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16553:
--
Attachment: HIVE-16553.1.patch

Whoops, didn't post the patch! posting now

> Change default value for hive.tez.bigtable.minsize.semijoin.reduction
> -
>
> Key: HIVE-16553
> URL: https://issues.apache.org/jira/browse/HIVE-16553
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16553.1.patch
>
>
> The current value is 1M rows. We would like to bump this up to make sure we 
> are not creating semijoin optimizations on dimension tables, since having too 
> many semijoin optimizations can cause serialized execution of tasks if lots of 
> tasks are waiting for semijoin optimizations to be computed.





[jira] [Updated] (HIVE-16546) LLAP: Fail map join tasks if hash table memory exceeds threshold

2017-04-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16546:
-
Attachment: HIVE-16546.2.patch

> LLAP: Fail map join tasks if hash table memory exceeds threshold
> 
>
> Key: HIVE-16546
> URL: https://issues.apache.org/jira/browse/HIVE-16546
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16546.1.patch, HIVE-16546.2.patch, 
> HIVE-16546.WIP.patch
>
>
> When a map join task is running in LLAP, it can potentially use a lot more 
> memory than its limit, which could be the memory per executor or the no 
> conditional task size. If it uses more memory, it can adversely affect the 
> performance of other queries or even bring down the daemon. In such cases, it 
> is better to fail the query than to bring down the daemon.
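
The fail-fast behaviour described above could look roughly like the following sketch. It is not taken from the attached patches: the class, the exception type, and the per-entry size estimate are all hypothetical, and the real limit would come from the executor memory or the no conditional task size rather than a constructor argument.

```java
// Hypothetical sketch: class and method names are illustrative, not Hive's API.
final class HashTableMemoryGuardSketch {

  /** Signals that the task should fail rather than keep growing its hash table. */
  static final class MemoryLimitExceededException extends RuntimeException {
    MemoryLimitExceededException(String message) {
      super(message);
    }
  }

  private final long limitBytes;
  private long usedBytes;

  HashTableMemoryGuardSketch(long limitBytes) {
    this.limitBytes = limitBytes;
  }

  /** Adds the estimated size of one loaded entry and fails once the limit is exceeded. */
  void recordEntry(long estimatedEntryBytes) {
    usedBytes += estimatedEntryBytes;
    if (usedBytes > limitBytes) {
      throw new MemoryLimitExceededException(
          "Hash table uses " + usedBytes + " bytes, limit is " + limitBytes + " bytes");
    }
  }

  long usedBytes() {
    return usedBytes;
  }

  public static void main(String[] args) {
    HashTableMemoryGuardSketch guard = new HashTableMemoryGuardSketch(100);
    guard.recordEntry(60); // within the limit
    try {
      guard.recordEntry(50); // 110 > 100, so the task fails here
    } catch (MemoryLimitExceededException e) {
      System.out.println("Failing task: " + e.getMessage());
    }
  }
}
```

Throwing from the loading path fails only the offending task, and with it the query, which is the trade-off the description prefers over letting the daemon run out of memory.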





[jira] [Commented] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988116#comment-15988116
 ] 

Hive QA commented on HIVE-16465:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865302/HIVE-16465-branch-2.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10571 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=142)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=174)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4906/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4906/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4906/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865302 - PreCommit-HIVE-Build

> NullPointer Exception when enable vectorization for Parquet file format
> ---
>
> Key: HIVE-16465
> URL: https://issues.apache.org/jira/browse/HIVE-16465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16465.001.patch, HIVE-16465-branch-2.3.001.patch, 
> HIVE-16465-branch-2.3.patch
>
>
> NullPointerException when vectorization is enabled for the Parquet file 
> format. It is caused by a null InputSplit.





[jira] [Commented] (HIVE-16553) Change default value for hive.tez.bigtable.minsize.semijoin.reduction

2017-04-27 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988099#comment-15988099
 ] 

Gunther Hagleitner commented on HIVE-16553:
---

[~jdere] I don't see a patch - am I missing something?

> Change default value for hive.tez.bigtable.minsize.semijoin.reduction
> -
>
> Key: HIVE-16553
> URL: https://issues.apache.org/jira/browse/HIVE-16553
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> The current value is 1M rows. We would like to bump this up to make sure we 
> are not creating semijoin optimizations on dimension tables, since having too 
> many semijoin optimizations can cause serialized execution of tasks if lots of 
> tasks are waiting for semijoin optimizations to be computed.





[jira] [Commented] (HIVE-15642) Replicate Insert Overwrites, Dynamic Partition Inserts and Loads

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988081#comment-15988081
 ] 

Hive QA commented on HIVE-15642:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847859/HIVE-15642.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4904/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4904/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4904/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-04-28 02:34:12.709
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-4904/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-04-28 02:34:12.712
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at fefeb2a HIVE-16542 make merge that targets acid 2.0 table 
fail-fast (Eugene Koifman, reviewed by Wei Zheng)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at fefeb2a HIVE-16542 make merge that targets acid 2.0 table 
fail-fast (Eugene Koifman, reviewed by Wei Zheng)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-04-28 02:34:13.385
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java: No such file 
or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: No such file 
or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12847859 - PreCommit-HIVE-Build

> Replicate Insert Overwrites, Dynamic Partition Inserts and Loads
> 
>
> Key: HIVE-15642
> URL: https://issues.apache.org/jira/browse/HIVE-15642
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-15642.1.patch
>
>
> 1. Insert Overwrites to a new partition should not capture new files as part 
> of insert event but instead use the subsequent add partition event to capture 
> the files + checksums.
> 2. Insert Overwrites to an existing partition should capture new files as 
> part of the insert event. 
> Similar behaviour for DP inserts and loads.
> This will need changes from HIVE-15478





[jira] [Commented] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988078#comment-15988078
 ] 

Hive QA commented on HIVE-16523:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865273/HIVE-16523.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10628 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=236)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4903/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4903/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4903/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865273 - PreCommit-HIVE-Build

> VectorHashKeyWrapper hash code for strings is not so good
> -
>
> Key: HIVE-16523
> URL: https://issues.apache.org/jira/browse/HIVE-16523
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16523.01.patch, HIVE-16523.02.patch, 
> HIVE-16523.patch
>
>
> Perf issues in vectorized gby on some string keys





[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled

2017-04-27 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988020#comment-15988020
 ] 

Yongzhi Chen commented on HIVE-15997:
-

Shutdown checks the running list and shuts down each task in it; shutting down 
a task stops the query because the task fails. But there is some chance that a 
task has already been removed from the list (because it finished, for example) 
when the shutdown happens. Adding the if (driverContext.isShutdown()) check may 
help in this scenario. Even if we can catch the cancel later, this check is the 
most prompt one. The if (driverContext.isShutdown()) check is not an expensive 
call, so I think it is OK to have it.
{noformat}
  /**
   * Cleans up remaining tasks in case of failure
   */
  public synchronized void shutdown() {
    LOG.debug("Shutting down query " + ctx.getCmd());
    shutdown = true;
    for (TaskRunner runner : running) {
      if (runner.isRunning()) {
        Task task = runner.getTask();
        LOG.warn("Shutting down task : " + task);
        try {
          task.shutdown();
        } catch (Exception e) {
          console.printError("Exception on shutting down task " + task.getId()
              + ": " + e);
        }
        Thread thread = runner.getRunner();
        if (thread != null) {
          thread.interrupt();
        }
      }
    }
    running.clear();
  }
{noformat}
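
To make the scenario above concrete, here is a minimal sketch of the extra check. It uses simplified, hypothetical stand-ins (Context, runTasks) rather than Hive's real Driver and DriverContext, and it assumes tasks are launched one after another; the point is only that re-checking the flag before each launch catches a shutdown that arrives after a task has already left the running list.

```java
// Hypothetical, simplified stand-ins; not Hive's actual Driver or DriverContext classes.
final class ShutdownCheckSketch {

  /** Minimal stand-in for the driver context: it tracks only the shutdown flag. */
  static final class Context {
    private boolean shutdown;

    synchronized void shutdown() {
      shutdown = true;
    }

    synchronized boolean isShutdown() {
      return shutdown;
    }
  }

  /**
   * Runs tasks in order and returns how many were started. The flag is
   * re-checked before every launch, so a shutdown that arrives after a task
   * has already finished still stops the remaining tasks.
   */
  static int runTasks(Context ctx, Runnable[] tasks) {
    int started = 0;
    for (Runnable task : tasks) {
      if (ctx.isShutdown()) {
        break; // cancelled: do not launch anything further
      }
      started++;
      task.run();
    }
    return started;
  }

  public static void main(String[] args) {
    Context ctx = new Context();
    Runnable[] tasks = {
      () -> System.out.println("task 1"),
      ctx::shutdown, // simulates a cancel arriving while the query runs
      () -> System.out.println("task 3 (should not run)")
    };
    System.out.println("started " + runTasks(ctx, tasks) + " of " + tasks.length + " tasks");
  }
}
```

In this sketch the second task plays the role of the cancel request, so the third task is never started even though no task was in the running list when the flag was set.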

> Resource leaks when query is cancelled 
> ---
>
> Key: HIVE-15997
> URL: https://issues.apache.org/jira/browse/HIVE-15997
> Project: Hive
>  Issue Type: Bug
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 2.2.0
>
> Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible files and folder leak: 
> {noformat} 
> 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: 
> Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local 
> exception: java.nio.channels.ClosedByInterruptException; Host Details : local 
> host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: 
> "ychencdh511t-1.vpc.cloudera.com":8020; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1476) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1409) 
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>  
> at com.sun.proxy.$Proxy25.delete(Unknown Source) 
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>  
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>  
> at com.sun.proxy.$Proxy26.delete(Unknown Source) 
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) 
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
>  
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
>  
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
>  
> at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405) 
> at org.apache.hadoop.hive.ql.Context.clear(Context.java:541) 
> at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109) 
> at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) 
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
>  
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
>  
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> at java.util.concurrent.Futur

[jira] [Commented] (HIVE-16546) LLAP: Fail map join tasks if hash table memory exceeds threshold

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988015#comment-15988015
 ] 

Hive QA commented on HIVE-16546:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865303/HIVE-16546.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 95 failed/errored test(s), 10628 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[char_udf1] (batchId=84)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_partition_pruning_2]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join0] 
(batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join1] 
(batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join29]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join30]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join_filters]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join_nulls]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_10]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_11]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_12]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_13]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_14]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_16]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_1]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_3]
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_4]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_5]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_7]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_8]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_9]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_6]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_7]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer1]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer3]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer4]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer6]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_partition_pruning]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[empty_join] 
(batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_1]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[identity_project_remove_skip]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join32_lessSize]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_partit

[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Attachment: HIVE-16485.03.patch

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485-disableMasking
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Status: Open  (was: Patch Available)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485-disableMasking
>
>






[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Status: Patch Available  (was: Open)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485.02.patch, 
> HIVE-16485.03.patch, HIVE-16485-disableMasking
>
>






[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987952#comment-15987952
 ] 

Hive QA commented on HIVE-14412:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865256/HIVE-14412.10.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4900/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4900/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4900/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-04-28 00:38:18.965
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-4900/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-04-28 00:38:18.968
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at fefeb2a HIVE-16542 make merge that targets acid 2.0 table 
fail-fast (Eugene Koifman, reviewed by Wei Zheng)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at fefeb2a HIVE-16542 make merge that targets acid 2.0 table 
fail-fast (Eugene Koifman, reviewed by Wei Zheng)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-04-28 00:38:19.565
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/contrib/src/test/queries/clientnegative/serde_regex.q: No such file or 
directory
error: a/contrib/src/test/queries/clientpositive/serde_regex.q: No such file or 
directory
error: a/contrib/src/test/results/clientnegative/serde_regex.q.out: No such 
file or directory
error: a/contrib/src/test/results/clientpositive/serde_regex.q.out: No such 
file or directory
error: a/hbase-handler/src/test/queries/positive/hbase_timestamp.q: No such 
file or directory
error: a/hbase-handler/src/test/results/positive/hbase_timestamp.q.out: No such 
file or directory
error: 
a/itests/hive-blobstore/src/test/queries/clientpositive/orc_format_part.q: No 
such file or directory
error: 
a/itests/hive-blobstore/src/test/queries/clientpositive/orc_nonstd_partitions_loc.q:
 No such file or directory
error: 
a/itests/hive-blobstore/src/test/queries/clientpositive/rcfile_format_part.q: 
No such file or directory
error: 
a/itests/hive-blobstore/src/test/queries/clientpositive/rcfile_nonstd_partitions_loc.q:
 No such file or directory
error: 
a/itests/hive-blobstore/src/test/results/clientpositive/orc_format_part.q.out: 
No such file or directory
error: 
a/itests/hive-blobstore/src/test/results/clientpositive/orc_nonstd_partitions_loc.q.out:
 No such file or directory
error: 
a/itests/hive-blobstore/src/test/results/clientpositive/rcfile_format_part.q.out:
 No such file or directory
error: 
a/itests/hive-blobstore/src/test/results/clientpositive/rcfile_nonstd_partitions_loc.q.out:
 No such file or directory
error: a/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java: No such 
file or directory
error: a/jdbc/src/java/org/apache/hive/jdbc/JdbcColumn.java: No such file or 
directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java: No 
such file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java: No 
such file or directory
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java: No 
such file or directory
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/TypeConverter.java:
 No such file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemantic

[jira] [Commented] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987951#comment-15987951
 ] 

Hive QA commented on HIVE-16485:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865246/HIVE-16485.02.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4898/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4898/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4898/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-04-28 00:36:42.110
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-4898/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-04-28 00:36:42.113
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at fefeb2a HIVE-16542 make merge that targets acid 2.0 table 
fail-fast (Eugene Koifman, reviewed by Wei Zheng)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at fefeb2a HIVE-16542 make merge that targets acid 2.0 table 
fail-fast (Eugene Koifman, reviewed by Wei Zheng)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-04-28 00:36:43.065
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p1
patching file itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java
Hunk #1 succeeded at 1728 (offset 6 lines).
Hunk #2 succeeded at 1745 (offset 6 lines).
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/AnnotateReduceSinkOutputOperator.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java
patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java
patching file ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java
patching file ql/src/test/queries/clientpositive/explain_formatted_oid.q
patching file ql/src/test/results/clientpositive/explain_formatted_oid.q.out
patching file ql/src/test/results/clientpositive/input4.q.out
patching file ql/src/test/results/clientpositive/join0.q.out
patching file ql/src/test/results/clientpositive/parallel_join0.q.out
patching file ql/src/test/results/clientpositive/plan_json.q.out
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
ANTLR Parser Generator  Version 3.5.2
Output file 
/data/hiveptest/working/apache-github-source-source/metastore/target/generated-sources/antlr3/org/apache/hadoop/hive/metastore/parser/FilterParser.java
 does not exist: must build 
/data/hiveptest/working/apache-github-source-source/metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hive/metastore/parser/Filter.g
DataNucleus Enhancer (version 4.1.17) for API "JDO"
DataNucleus Enhancer : Classpath
>>  /usr/share/maven/boot/plexus-classworlds-2.x.jar
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDatabase
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MFieldSchema
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MType
ENHANCED (Persistable) : org.apache.hadoop.hive.

[jira] [Commented] (HIVE-16541) PTF: Avoid shuffling constant keys for empty OVER()

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987950#comment-15987950
 ] 

Hive QA commented on HIVE-16541:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865216/HIVE-16541.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10632 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ptf_matchpath] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[quotedid_basic] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_gby2] 
(batchId=34)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_windowing]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[groupby_resolution]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ptf_matchpath]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[special_character_in_tabnames_1]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[windowing] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[windowing_gby]
 (batchId=150)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_resolution] 
(batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ptf_matchpath] 
(batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[windowing] 
(batchId=120)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4897/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4897/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4897/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865216 - PreCommit-HIVE-Build

> PTF: Avoid shuffling constant keys for empty OVER()
> ---
>
> Key: HIVE-16541
> URL: https://issues.apache.org/jira/browse/HIVE-16541
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16541.1.patch
>
>
> Generating surrogate keys with 
> {code}
> select row_number() over() as p_key, * from table; 
> {code}
> uses a sorted edge with "0 ASC NULLS FIRST" as the sort order.





[jira] [Updated] (HIVE-16213) ObjectStore can leak Queries when rollbackTransaction throws an exception

2017-04-27 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16213:
---
Attachment: HIVE-16213.07.patch

Fixed the test failure. 

> ObjectStore can leak Queries when rollbackTransaction throws an exception
> -
>
> Key: HIVE-16213
> URL: https://issues.apache.org/jira/browse/HIVE-16213
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Alexander Kolbasov
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16213.01.patch, HIVE-16213.02.patch, 
> HIVE-16213.03.patch, HIVE-16213.04.patch, HIVE-16213.05.patch, 
> HIVE-16213.06.patch, HIVE-16213.07.patch
>
>
> In ObjectStore.java there are a few places with the code similar to:
> {code}
> Query query = null;
> try {
>   openTransaction();
>   query = pm.newQuery(Something.class);
>   ...
>   commited = commitTransaction();
> } finally {
>   if (!commited) {
>     rollbackTransaction();
>   }
>   if (query != null) {
>     query.closeAll();
>   }
> }
> {code}
> The problem is that rollbackTransaction() may throw an exception in which 
> case query.closeAll() wouldn't be executed. 
> The fix would be to wrap rollbackTransaction in its own try-catch block.
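
The fix proposed in the description can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the attached patch: `FakeStore`, `QueryHandle`, `runQuery`, and `demo` are hypothetical stand-ins for the ObjectStore transaction methods and the JDO query object, and the exception is recorded in a list instead of being logged.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: a failing rollback must not prevent the query from being closed. */
class RollbackSafeCloseSketch {

  /** Hypothetical stand-in for the query object that needs closeAll(). */
  interface QueryHandle {
    void closeAll();
  }

  /** Hypothetical store that records every call so the order can be inspected. */
  static final class FakeStore {
    final List<String> events = new ArrayList<>();

    void openTransaction() { events.add("open"); }

    QueryHandle newQuery() { return () -> events.add("closeAll"); }

    // Simulates a commit that does not succeed.
    boolean commitTransaction() { events.add("commit failed"); return false; }

    // Simulates the problematic case: rollback itself throws.
    void rollbackTransaction() {
      events.add("rollback");
      throw new IllegalStateException("rollback failed");
    }
  }

  static void runQuery(FakeStore store) {
    boolean committed = false;
    QueryHandle query = null;
    try {
      store.openTransaction();
      query = store.newQuery();
      committed = store.commitTransaction();
    } finally {
      try {
        if (!committed) {
          store.rollbackTransaction();
        }
      } catch (RuntimeException e) {
        // Rollback gets its own try-catch; the failure is only recorded here.
        store.events.add("logged: " + e.getMessage());
      } finally {
        // Runs whether or not the rollback threw.
        if (query != null) {
          query.closeAll();
        }
      }
    }
  }

  static List<String> demo() {
    FakeStore store = new FakeStore();
    runQuery(store);
    return store.events;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Placing `closeAll()` in an inner `finally` is one possible design choice; it also covers the case where the rollback fails with something other than a `RuntimeException`.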





[jira] [Commented] (HIVE-16213) ObjectStore can leak Queries when rollbackTransaction throws an exception

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987898#comment-15987898
 ] 

Hive QA commented on HIVE-16213:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865202/HIVE-16213.06.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10633 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError 
(batchId=194)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4896/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4896/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4896/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865202 - PreCommit-HIVE-Build

> ObjectStore can leak Queries when rollbackTransaction throws an exception
> -
>
> Key: HIVE-16213
> URL: https://issues.apache.org/jira/browse/HIVE-16213
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Alexander Kolbasov
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16213.01.patch, HIVE-16213.02.patch, 
> HIVE-16213.03.patch, HIVE-16213.04.patch, HIVE-16213.05.patch, 
> HIVE-16213.06.patch
>
>
> In ObjectStore.java there are a few places with the code similar to:
> {code}
> Query query = null;
> try {
>   openTransaction();
>   query = pm.newQuery(Something.class);
>   ...
>   commited = commitTransaction();
> } finally {
>   if (!commited) {
>     rollbackTransaction();
>   }
>   if (query != null) {
>     query.closeAll();
>   }
> }
> {code}
> The problem is that rollbackTransaction() may throw an exception in which 
> case query.closeAll() wouldn't be executed. 
> The fix would be to wrap rollbackTransaction in its own try-catch block.





[jira] [Assigned] (HIVE-16556) Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES table

2017-04-27 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-16556:
--


> Modify schematool scripts to initialize and create METASTORE_DB_PROPERTIES 
> table
> 
>
> Key: HIVE-16556
> URL: https://issues.apache.org/jira/browse/HIVE-16556
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Sub-task to modify schematool and make the related changes so that the new 
> table is added to the schema when schematool initializes or upgrades the schema.





[jira] [Updated] (HIVE-16555) Add a new thrift API call for get_metastore_uuid

2017-04-27 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16555:
---
Summary: Add a new thrift API call for get_metastore_uuid  (was: Added a 
new thrift API call for get_metastore_uuid)

> Add a new thrift API call for get_metastore_uuid
> 
>
> Key: HIVE-16555
> URL: https://issues.apache.org/jira/browse/HIVE-16555
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Sub-task of the main JIRA to add the new thrift API





[jira] [Assigned] (HIVE-16555) Added a new thrift API call for get_metastore_uuid

2017-04-27 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-16555:
--


> Added a new thrift API call for get_metastore_uuid
> --
>
> Key: HIVE-16555
> URL: https://issues.apache.org/jira/browse/HIVE-16555
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Sub-task of the main JIRA to add the new thrift API





[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled

2017-04-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987880#comment-15987880
 ] 

Xuefu Zhang commented on HIVE-15997:


Hi [~ychena]/[~ctang.ma], Thanks for looking into this. I'm trying to 
understand the patch, as I'm reviewing HIVE-16552, which refers to this JIRA. 
Specifically, I don't quite understand the following code addition:
{code}
   rj = jc.submitJob(job);
+
+  if (driverContext.isShutdown()) {
+    LOG.warn("Task was cancelled");
+    if (rj != null) {
+      rj.killJob();
+      rj = null;
+    }
+    return 5;
+  }
+
   this.jobID = rj.getJobID();
{code}
I understand that we are checking whether the query is cancelled right after 
submission. However, my question is whether this check is necessary or 
complete, because the query can be cancelled right after this check, which 
the check will not capture. If such a cancellation is captured later in 
another code path, then the check here seems unnecessary. Otherwise, this 
check will not capture all scenarios.

Did I miss anything? Thanks.
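
To make the timing window in the question concrete, here is a small sketch of one general way to close it. It is an illustration under stated assumptions and not what the HIVE-15997 patch does: `Task`, `Job`, `register`, `cancel`, and `demo` are hypothetical names. The submitter publishes the job handle before checking the flag, and the canceller sets the flag before looking for a published job, so whichever side runs second sees the other's write and kills the job.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Sketch: publish-then-check on submit, flag-then-kill on cancel. */
class CancelAfterSubmitSketch {

  /** Hypothetical job handle; kill() may safely be called more than once. */
  static final class Job {
    private final AtomicBoolean killed = new AtomicBoolean(false);

    void kill() { killed.set(true); }

    boolean isKilled() { return killed.get(); }
  }

  /** Hypothetical task that owns at most one running job. */
  static final class Task {
    private final AtomicBoolean shutdown = new AtomicBoolean(false);
    private final AtomicReference<Job> running = new AtomicReference<>();

    /** Publishes the job first, then checks the flag. Returns false if cancelled. */
    boolean register(Job job) {
      running.set(job);
      if (shutdown.get()) {
        job.kill();
        running.set(null);
        return false;
      }
      return true;
    }

    /** Sets the flag first, then kills whatever job is already published. */
    void cancel() {
      shutdown.set(true);
      Job job = running.get();
      if (job != null) {
        job.kill();
      }
    }
  }

  /** Returns {accepted, killed} for "cancel before submit" and "cancel after the check". */
  static boolean[] demo() {
    Task early = new Task();
    Job first = new Job();
    early.cancel();                               // cancelled before the job is registered
    boolean acceptedEarly = early.register(first);

    Task late = new Task();
    Job second = new Job();
    boolean acceptedLate = late.register(second); // the check passes
    late.cancel();                                // cancelled right after the check

    return new boolean[] {acceptedEarly, first.isKilled(), acceptedLate, second.isKilled()};
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo()));
  }
}
```

The demo exercises the two orderings sequentially rather than from separate threads; in both cases the job ends up killed, which is the property the single post-submission check alone does not guarantee.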

> Resource leaks when query is cancelled 
> ---
>
> Key: HIVE-15997
> URL: https://issues.apache.org/jira/browse/HIVE-15997
> Project: Hive
>  Issue Type: Bug
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 2.2.0
>
> Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible files and folder leak: 
> {noformat} 
> 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: 
> Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local 
> exception: java.nio.channels.ClosedByInterruptException; Host Details : local 
> host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: 
> "ychencdh511t-1.vpc.cloudera.com":8020; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1476) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1409) 
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>  
> at com.sun.proxy.$Proxy25.delete(Unknown Source) 
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:606) 
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>  
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>  
> at com.sun.proxy.$Proxy26.delete(Unknown Source) 
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) 
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
>  
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
>  
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
>  
> at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405) 
> at org.apache.hadoop.hive.ql.Context.clear(Context.java:541) 
> at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109) 
> at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150) 
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) 
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
>  
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>  
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
>  
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  
> at java.lang.Thread.run(Thread.java:745) 
> 

[jira] [Updated] (HIVE-16143) Improve msck repair batching

2017-04-27 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16143:
---
Status: Patch Available  (was: In Progress)

> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16143.01.patch
>
>
> Currently, the {{msck repair table}} command batches the number of partitions 
> created in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> The following snippet shows the batching logic. There can be a couple of 
> improvements to this batching logic:
> {noformat} 
>   int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>   if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>     int counter = 0;
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       counter++;
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>       if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>         db.createPartitions(apd);
>         apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>       }
>     }
>   } else {
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>     }
>     db.createPartitions(apd);
>   }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. Users can easily 
> increase the batch size to make the command run faster, yet end up with 
> worse performance because the code falls back to adding partitions one by 
> one. Users are then expected to find a tuned batch size that works well for 
> their environment. I think the code could handle this situation better by 
> exponentially decaying the batch size instead of falling back to one by one.
> 2. The other issue with this implementation is that if, say, the first batch 
> succeeds and the second one fails, the code tries to add all the partitions 
> one by one, irrespective of whether some of them were already added 
> successfully. If we need to fall back to one by one, we should at least 
> remove the ones which we know for sure were already added successfully.
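
The two improvements listed above can be sketched together as follows. This is a hypothetical illustration, not the attached patch: `addInBatches`, `demo`, and the `bulkAdd` callback are made-up names standing in for the `db.createPartitions` call, and the sketch ignores the "no batching" case that the original code handles when the configured batch size is not positive. On failure the batch size is halved, and the `next` index ensures that batches which already succeeded are never submitted again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch: halve the batch size on failure and never resubmit added items. */
class DecayingBatchSketch {

  /**
   * Adds all items through bulkAdd, starting with initialBatchSize and halving
   * it whenever a batch fails. Returns the sizes of the batches that succeeded.
   */
  static <T> List<Integer> addInBatches(
      List<T> items, int initialBatchSize, Consumer<List<T>> bulkAdd) {
    List<Integer> succeeded = new ArrayList<>();
    int batchSize = Math.max(1, initialBatchSize);
    int next = 0; // index of the first item that has not been added yet
    while (next < items.size()) {
      int end = Math.min(next + batchSize, items.size());
      try {
        bulkAdd.accept(items.subList(next, end));
        succeeded.add(end - next);
        next = end; // items before this index are known to be added
      } catch (RuntimeException e) {
        if (batchSize == 1) {
          throw e; // a single item failed; shrinking further cannot help
        }
        batchSize = batchSize / 2; // exponential decay instead of one-by-one
      }
    }
    return succeeded;
  }

  static List<Integer> demo() {
    List<Integer> partitions = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      partitions.add(i);
    }
    // Hypothetical backend that rejects any batch larger than three items.
    Consumer<List<Integer>> backend = batch -> {
      if (batch.size() > 3) {
        throw new IllegalStateException("batch too large: " + batch.size());
      }
    };
    return addInBatches(partitions, 8, backend);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

With ten items, an initial batch size of 8, and a backend limit of 3, the sketch tries 8, then 4, then settles on batches of 2. A real implementation might also want a bounded number of retries, which is left out here.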






[jira] [Updated] (HIVE-16143) Improve msck repair batching

2017-04-27 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16143:
---
Attachment: HIVE-16143.01.patch

> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16143.01.patch
>
>
> Currently, the {{msck repair table}} command batches the number of partitions 
> created in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> The following snippet shows the batching logic. There can be a couple of 
> improvements to this batching logic:
> {noformat} 
>   int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>   if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>     int counter = 0;
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       counter++;
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>       if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>         db.createPartitions(apd);
>         apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>       }
>     }
>   } else {
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>     }
>     db.createPartitions(apd);
>   }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. Users can easily 
> increase the batch size to a higher value to make the command run 
> faster but end up with worse performance because the code falls back to adding 
> partitions one by one. Users are then expected to determine the tuned value of 
> batch size which works well for their environment. I think the code could handle 
> this situation better by exponentially decaying the batch size instead of 
> falling back to one by one.
> 2. The other issue with this implementation is that if, let's say, the first 
> batch succeeds and the second one fails, the code tries to add all the partitions 
> one by one irrespective of whether some of them were successfully added or 
> not. If we need to fall back to one by one, we should at least remove the ones 
> which we know for sure were already added successfully.
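
The exponential decay proposed in point 1 could look roughly like the sketch below. It is not Hive code: the class `DecayingBatchAdder`, the method `addAll`, and the `addBatch` callback (standing in for a call such as `db.createPartitions`) are hypothetical names, and the sketch assumes a failed batch call adds nothing. Batches that succeeded are never retried, because the `next` index only moves forward.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of batch adds with a decaying batch size; not Hive code. */
class DecayingBatchAdder {

  /**
   * Adds all items in batches, halving the batch size after each failed batch.
   * Returns the number of items added. Rethrows if a batch of size 1 fails.
   * Assumes a failed addBatch call adds none of the items in that batch.
   */
  static <T> int addAll(List<T> items, int initialBatchSize, Consumer<List<T>> addBatch) {
    int batchSize = Math.max(1, initialBatchSize);
    int next = 0; // index of the first item not yet added
    while (next < items.size()) {
      int end = Math.min(next + batchSize, items.size());
      try {
        addBatch.accept(items.subList(next, end));
        next = end; // batch succeeded; these items are never retried
      } catch (RuntimeException e) {
        if (batchSize == 1) {
          throw e; // cannot shrink any further
        }
        batchSize = batchSize / 2; // exponential decay of the batch size
      }
    }
    return next;
  }

  public static void main(String[] args) {
    List<Integer> parts = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      parts.add(i);
    }
    List<Integer> added = new ArrayList<>();
    // Stand-in for the metastore call: rejects batches larger than 3 items.
    int n = addAll(parts, 8, batch -> {
      if (batch.size() > 3) {
        throw new IllegalStateException("batch too large");
      }
      added.addAll(batch);
    });
    System.out.println(n + " added: " + added);
  }
}
```

In this sketch the batch size only shrinks; whether it should grow back after a run of successes is a separate design choice that the issue does not discuss.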





[jira] [Work started] (HIVE-16143) Improve msck repair batching

2017-04-27 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16143 started by Vihang Karajgaonkar.
--
> Improve msck repair batching
> 
>
> Key: HIVE-16143
> URL: https://issues.apache.org/jira/browse/HIVE-16143
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> Currently, the {{msck repair table}} command batches the number of partitions 
> created in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. 
> The following snippet shows the batching logic. There can be a couple of 
> improvements to this batching logic:
> {noformat} 
>   int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE);
>   if (batch_size > 0 && partsNotInMs.size() > batch_size) {
>     int counter = 0;
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       counter++;
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>       if (counter % batch_size == 0 || counter == partsNotInMs.size()) {
>         db.createPartitions(apd);
>         apd = new AddPartitionDesc(table.getDbName(), table.getTableName(), false);
>       }
>     }
>   } else {
>     for (CheckResult.PartitionResult part : partsNotInMs) {
>       apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null);
>       repairOutput.add("Repair: Added partition to metastore " + msckDesc.getTableName()
>           + ':' + part.getPartitionName());
>     }
>     db.createPartitions(apd);
>   }
> } catch (Exception e) {
>   LOG.info("Could not bulk-add partitions to metastore; trying one by one", e);
>   repairOutput.clear();
>   msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput);
> }
> {noformat}
> 1. If the batch size is too aggressive, the code falls back to adding 
> partitions one by one, which is almost always very slow. Users can easily 
> increase the batch size to a higher value to make the command run 
> faster but end up with worse performance because the code falls back to adding 
> partitions one by one. Users are then expected to determine the tuned value of 
> batch size which works well for their environment. I think the code could handle 
> this situation better by exponentially decaying the batch size instead of 
> falling back to one by one.
> 2. The other issue with this implementation is that if, let's say, the first 
> batch succeeds and the second one fails, the code tries to add all the partitions 
> one by one irrespective of whether some of them were successfully added or 
> not. If we need to fall back to one by one, we should at least remove the ones 
> which we know for sure were already added successfully.
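
Point 2 can be illustrated with a small sketch that records which items were committed by successful batches and skips them in the one-by-one fallback. This is not the Hive implementation: `FallbackSkippingAdded`, `addWithFallback`, `addBatch`, and `addOne` are hypothetical names, and the sketch assumes a batch call either adds every item in the batch or none of them.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical sketch: one-by-one fallback that skips items already added. */
class FallbackSkippingAdded {

  /**
   * Tries to add items in fixed-size batches. If a batch fails, falls back to
   * adding one by one, but only for items not covered by a successful batch.
   * Assumes a failed addBatch call adds none of the items in that batch.
   */
  static <T> Set<T> addWithFallback(List<T> items, int batchSize,
      Consumer<List<T>> addBatch, Consumer<T> addOne) {
    Set<T> added = new LinkedHashSet<>();
    int size = Math.max(1, batchSize);
    try {
      for (int start = 0; start < items.size(); start += size) {
        List<T> batch = items.subList(start, Math.min(start + size, items.size()));
        addBatch.accept(batch);
        added.addAll(batch); // recorded only after the batch call returned
      }
    } catch (RuntimeException e) {
      for (T item : items) {
        if (!added.contains(item)) { // skip items from successful batches
          addOne.accept(item);
          added.add(item);
        }
      }
    }
    return added;
  }
}
```

With six items, a batch size of 2, and a failure on the second batch, only the last four items go through `addOne`; the first two are left alone.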





[jira] [Updated] (HIVE-16207) Add support for Complex Types in Fast SerDe

2017-04-27 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-16207:
--
Attachment: HIVE-16207.1.patch.zip

Here's a new partial patch. I could not upload the HIVE-16207.1.patch file 
and don't know why, so I compressed it with zip and uploaded that instead.

This patch passes all tests except 
TestVectorSerDe.testVectorBinarySortableDeserializeRow and 
TestVectorSerDe.testVectorLazySimpleDeserializeRow. I need help with these 
failures.

> Add support for Complex Types in Fast SerDe
> ---
>
> Key: HIVE-16207
> URL: https://issues.apache.org/jira/browse/HIVE-16207
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: HIVE-16207.1.patch.zip, partial.patch
>
>
> Add complex type support to Fast SerDe classes.  This is needed for fully 
> supporting complex types in Vectorization





[jira] [Work stopped] (HIVE-16207) Add support for Complex Types in Fast SerDe

2017-04-27 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16207 stopped by Teddy Choi.
-
> Add support for Complex Types in Fast SerDe
> ---
>
> Key: HIVE-16207
> URL: https://issues.apache.org/jira/browse/HIVE-16207
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Teddy Choi
>Priority: Critical
> Attachments: partial.patch
>
>
> Add complex type support to Fast SerDe classes.  This is needed for fully 
> supporting complex types in Vectorization





[jira] [Assigned] (HIVE-16554) ACID: Make HouseKeeperService threads daemon

2017-04-27 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-16554:
---


> ACID: Make HouseKeeperService threads daemon
> 
>
> Key: HIVE-16554
> URL: https://issues.apache.org/jira/browse/HIVE-16554
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.1, 2.0.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>






[jira] [Updated] (HIVE-16542) make merge that targets acid 2.0 table fail-fast

2017-04-27 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16542:
--
Attachment: HIVE-16542.01-branch-2.patch

> make merge that targets acid 2.0 table fail-fast 
> -
>
> Key: HIVE-16542
> URL: https://issues.apache.org/jira/browse/HIVE-16542
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16542.01-branch-2.patch, HIVE-16542.01.patch, 
> HIVE-16542.02.patch
>
>
> Until HIVE-14947 is fixed, need to add a check so that acid 2.0 tables are 
> not written to by Merge stmt that has both Insert and Update clauses





[jira] [Commented] (HIVE-15160) Can't order by an unselected column

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987829#comment-15987829
 ] 

Hive QA commented on HIVE-15160:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865182/HIVE-15160.11.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 53 failed/errored test(s), 10642 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=225)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] 
(batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_3] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_4] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_without_localtask]
 (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_vc] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[regex_col] (batchId=15)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_queries] 
(batchId=92)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_groupby]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_gby] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_gby] 
(batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_semijoin]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_semijoin]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_subq_not_in]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown3]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[selectDistinctStar]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[special_character_in_tabnames_1]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_date_1]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_2]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_round]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets_grouping]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets_limit]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_interval_1]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_interval_arithmetic]
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_null_projection]
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_short_regress]
 (batchId=151)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_view_6]
 (batchId=88)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_view_7]
 (batchId=88)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query66] 
(batchId=229)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join8] 
(batchId=134)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join_without_localtask]
 (batchId=98)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[cbo_gby] 
(batchId=117)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[cbo_limit] 
(batchId=136)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[cbo_semijoin] 
(batchId=114)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[cbo_subq_not_in] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkC

[jira] [Commented] (HIVE-13566) Auto-gather column stats - phase 1

2017-04-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987824#comment-15987824
 ] 

Prasanth Jayachandran commented on HIVE-13566:
--

This patch removed the changes from HIVE-13628 as well. Not sure what else it 
might have removed. 

> Auto-gather column stats - phase 1
> --
>
> Key: HIVE-13566
> URL: https://issues.apache.org/jira/browse/HIVE-13566
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-13566.01.patch, HIVE-13566.02.patch, 
> HIVE-13566.03.patch
>
>
> This jira adds code and tests for auto-gather column stats. Golden file 
> update will be done in phase 2 - HIVE-11160





[jira] [Updated] (HIVE-16541) PTF: Avoid shuffling constant keys for empty OVER()

2017-04-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16541:
---
Status: Open  (was: Patch Available)

> PTF: Avoid shuffling constant keys for empty OVER()
> ---
>
> Key: HIVE-16541
> URL: https://issues.apache.org/jira/browse/HIVE-16541
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16541.1.patch
>
>
> Generating surrogate keys with 
> {code}
> select row_number() over() as p_key, * from table; 
> {code}
> uses a sorted edge with "0 ASC NULLS FIRST" as the sort order.





[jira] [Assigned] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-16523:
--

Assignee: Sergey Shelukhin  (was: Gopal V)

> VectorHashKeyWrapper hash code for strings is not so good
> -
>
> Key: HIVE-16523
> URL: https://issues.apache.org/jira/browse/HIVE-16523
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16523.01.patch, HIVE-16523.02.patch, 
> HIVE-16523.patch
>
>
> Perf issues in vectorized gby on some string keys





[jira] [Updated] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16523:
---
Status: Patch Available  (was: Open)

> VectorHashKeyWrapper hash code for strings is not so good
> -
>
> Key: HIVE-16523
> URL: https://issues.apache.org/jira/browse/HIVE-16523
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Gopal V
> Attachments: HIVE-16523.01.patch, HIVE-16523.02.patch, 
> HIVE-16523.patch
>
>
> Perf issues in vectorized gby on some string keys





[jira] [Assigned] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-16523:
--

Assignee: Gopal V  (was: Sergey Shelukhin)

> VectorHashKeyWrapper hash code for strings is not so good
> -
>
> Key: HIVE-16523
> URL: https://issues.apache.org/jira/browse/HIVE-16523
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Gopal V
> Attachments: HIVE-16523.01.patch, HIVE-16523.02.patch, 
> HIVE-16523.patch
>
>
> Perf issues in vectorized gby on some string keys





[jira] [Updated] (HIVE-16541) PTF: Avoid shuffling constant keys for empty OVER()

2017-04-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16541:
---
Status: Patch Available  (was: Open)

> PTF: Avoid shuffling constant keys for empty OVER()
> ---
>
> Key: HIVE-16541
> URL: https://issues.apache.org/jira/browse/HIVE-16541
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16541.1.patch
>
>
> Generating surrogate keys with 
> {code}
> select row_number() over() as p_key, * from table; 
> {code}
> uses a sorted edge with "0 ASC NULLS FIRST" as the sort order.





[jira] [Updated] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-16523:
---
Status: Open  (was: Patch Available)

> VectorHashKeyWrapper hash code for strings is not so good
> -
>
> Key: HIVE-16523
> URL: https://issues.apache.org/jira/browse/HIVE-16523
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16523.01.patch, HIVE-16523.02.patch, 
> HIVE-16523.patch
>
>
> Perf issues in vectorized gby on some string keys





[jira] [Commented] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987807#comment-15987807
 ] 

Sergey Shelukhin commented on HIVE-16523:
-

+1 on the updates over the patch

> VectorHashKeyWrapper hash code for strings is not so good
> -
>
> Key: HIVE-16523
> URL: https://issues.apache.org/jira/browse/HIVE-16523
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16523.01.patch, HIVE-16523.02.patch, 
> HIVE-16523.patch
>
>
> Perf issues in vectorized gby on some string keys





[jira] [Commented] (HIVE-16553) Change default value for hive.tez.bigtable.minsize.semijoin.reduction

2017-04-27 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987794#comment-15987794
 ] 

Jason Dere commented on HIVE-16553:
---

[~hagleitn] [~gopalv] can you review? Just a simple default config change.

> Change default value for hive.tez.bigtable.minsize.semijoin.reduction
> -
>
> Key: HIVE-16553
> URL: https://issues.apache.org/jira/browse/HIVE-16553
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> The current value is 1M rows. I would like to bump this up to make sure we are 
> not creating semijoin optimizations on dimension tables, since having too many 
> semijoin optimizations can cause serialized execution of tasks if lots of 
> tasks are waiting for semijoin optimizations to be computed.





[jira] [Assigned] (HIVE-16553) Change default value for hive.tez.bigtable.minsize.semijoin.reduction

2017-04-27 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-16553:
-


> Change default value for hive.tez.bigtable.minsize.semijoin.reduction
> -
>
> Key: HIVE-16553
> URL: https://issues.apache.org/jira/browse/HIVE-16553
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> The current value is 1M rows. I would like to bump this up to make sure we are 
> not creating semijoin optimizations on dimension tables, since having too many 
> semijoin optimizations can cause serialized execution of tasks if lots of 
> tasks are waiting for semijoin optimizations to be computed.





[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-27 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987783#comment-15987783
 ] 

Chaoyu Tang commented on HIVE-16147:


The test failures are not related to the patch. [~pxiong], could you help to 
review it again? Thanks

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information
> Partition Value:       [3]
> Database:              default
> Table:                 sample_pt
> CreateTime:            Fri Jan 20 15:42:30 EST 2017
> LastAccessTime:        UNKNOWN
> Location:              file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:
>   COLUMN_STATS_ACCURATE  {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_by       ctang
>   last_modified_time     1485217063
>   numFiles               1
>   numRows                100
>   rawDataSize            5143
>   totalSize              5243
>   transient_lastDdlTime  1488842358
> ... 
> {code}
> 3. describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_name  data_type  min  max     num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment
>
> salary      int        1    151370  0          94                                                               from deserializer
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy = 3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information
> Partition Value:       [3]
> Database:              default
> Table:                 sample_pt_rename
> CreateTime:            Fri Jan 20 15:42:30 EST 2017
> LastAccessTime:        UNKNOWN
> Location:              file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:
>   COLUMN_STATS_ACCURATE  {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_by       ctang
>   last_modified_time     1485217063
>   numFiles               1
>   numRows                100
>   rawDataSize            5143
>   totalSize              5243
>   transient_lastDdlTime  1488842358
> {code}
> describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
> column stats have been dropped.
> {code}
> # col_name  data_type  comment
>
> salary      int        from deserializer
>
> Time taken: 0.131 seconds, Fetched: 3 row(s)
> {code}





[jira] [Commented] (HIVE-16534) Add capability to tell aborted transactions apart from open transactions in ValidTxnList

2017-04-27 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987768#comment-15987768
 ] 

Wei Zheng commented on HIVE-16534:
--

[~ekoifman] Can you take a look at patch 2?

> Add capability to tell aborted transactions apart from open transactions in 
> ValidTxnList
> 
>
> Key: HIVE-16534
> URL: https://issues.apache.org/jira/browse/HIVE-16534
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-16534.1.patch, HIVE-16534.2.patch
>
>
> Currently in ValidReadTxnList, open transactions and aborted transactions are 
> stored together in one array. That makes it impossible to extract just 
> aborted transactions or open transactions.
> For ValidCompactorTxnList this is fine, since we only store aborted 
> transactions but no open transactions.
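
One way to make the two kinds of exceptions distinguishable, sketched here purely as an illustration, is to keep the single sorted array and add a parallel bit set that marks which entries are aborted. The class `TxnExceptionsSketch` and its methods are hypothetical and are not the API of `ValidReadTxnList`; the attached patches may take a different approach.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch: one sorted exceptions array plus a bit per entry. */
class TxnExceptionsSketch {
  private final long[] exceptions; // sorted txn ids that are open or aborted
  private final BitSet aborted;    // bit i set means exceptions[i] is aborted

  TxnExceptionsSketch(long[] sortedExceptions, BitSet aborted) {
    this.exceptions = sortedExceptions.clone();
    this.aborted = (BitSet) aborted.clone();
  }

  /** True if txnId is listed as an exception and marked aborted. */
  boolean isAborted(long txnId) {
    int i = Arrays.binarySearch(exceptions, txnId);
    return i >= 0 && aborted.get(i);
  }

  /** True if txnId is listed as an exception and not marked aborted. */
  boolean isOpen(long txnId) {
    int i = Arrays.binarySearch(exceptions, txnId);
    return i >= 0 && !aborted.get(i);
  }
}
```

A lookup stays a single binary search, and the extra cost is one bit per listed transaction.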





[jira] [Updated] (HIVE-16534) Add capability to tell aborted transactions apart from open transactions in ValidTxnList

2017-04-27 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-16534:
-
Attachment: HIVE-16534.2.patch

> Add capability to tell aborted transactions apart from open transactions in 
> ValidTxnList
> 
>
> Key: HIVE-16534
> URL: https://issues.apache.org/jira/browse/HIVE-16534
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-16534.1.patch, HIVE-16534.2.patch
>
>
> Currently in ValidReadTxnList, open transactions and aborted transactions are 
> stored together in one array. That makes it impossible to extract just 
> aborted transactions or open transactions.
> For ValidCompactorTxnList this is fine, since we only store aborted 
> transactions but no open transactions.





[jira] [Commented] (HIVE-16542) make merge that targets acid 2.0 table fail-fast

2017-04-27 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987735#comment-15987735
 ] 

Wei Zheng commented on HIVE-16542:
--

LGTM. +1

> make merge that targets acid 2.0 table fail-fast 
> -
>
> Key: HIVE-16542
> URL: https://issues.apache.org/jira/browse/HIVE-16542
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16542.01.patch, HIVE-16542.02.patch
>
>
> Until HIVE-14947 is fixed, need to add a check so that acid 2.0 tables are 
> not written to by Merge stmt that has both Insert and Update clauses





[jira] [Commented] (HIVE-16147) Rename a partitioned table should not drop its partition columns stats

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987696#comment-15987696
 ] 

Hive QA commented on HIVE-16147:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865390/HIVE-16147.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10640 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
org.apache.hadoop.hive.ql.parse.TestParseNegativeDriver.testCliDriver[wrong_distinct2]
 (batchId=233)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4894/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4894/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4894/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865390 - PreCommit-HIVE-Build

> Rename a partitioned table should not drop its partition columns stats
> --
>
> Key: HIVE-16147
> URL: https://issues.apache.org/jira/browse/HIVE-16147
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16147.1.patch, HIVE-16147.patch, HIVE-16147.patch
>
>
> When a partitioned table (e.g. sample_pt) is renamed (e.g. to 
> sample_pt_rename), describing its partition shows that the partition column 
> stats are still accurate, but actually they have all been dropped.
> It can be reproduced as follows:
> 1. analyze table sample_pt compute statistics for columns;
> 2. describe formatted default.sample_pt partition (dummy = 3):  COLUMN_STATS 
> for all columns are true
> {code}
> ...
> # Detailed Partition Information
> Partition Value:       [3]
> Database:              default
> Table:                 sample_pt
> CreateTime:            Fri Jan 20 15:42:30 EST 2017
> LastAccessTime:        UNKNOWN
> Location:              file:/user/hive/warehouse/apache/sample_pt/dummy=3
> Partition Parameters:
>   COLUMN_STATS_ACCURATE  {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_by       ctang
>   last_modified_time     1485217063
>   numFiles               1
>   numRows                100
>   rawDataSize            5143
>   totalSize              5243
>   transient_lastDdlTime  1488842358
> ... 
> {code}
> 3. describe formatted default.sample_pt partition (dummy = 3) salary: column 
> stats exist
> {code}
> # col_name  data_type  min  max     num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment
>
> salary      int        1    151370  0          94                                                               from deserializer
> {code}
> 4. alter table sample_pt rename to sample_pt_rename;
> 5. describe formatted default.sample_pt_rename partition (dummy = 3): 
> describing the renamed table partition (dummy = 3) shows that COLUMN_STATS for 
> columns are still true.
> {code}
> # Detailed Partition Information
> Partition Value:       [3]
> Database:              default
> Table:                 sample_pt_rename
> CreateTime:            Fri Jan 20 15:42:30 EST 2017
> LastAccessTime:        UNKNOWN
> Location:              file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
> Partition Parameters:
>   COLUMN_STATS_ACCURATE  {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>   last_modified_by       ctang
>

[jira] [Commented] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-27 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987688#comment-15987688
 ] 

Gopal V commented on HIVE-16523:


bq. is it a good idea to store hash ctx in every wrapper?

That was an assumption - the current object has a 4 byte slack.

{code}
 44     4   org.apache.hadoop.hive.common.type.HiveIntervalDayTime[] VectorHashKeyWrapper.intervalDayTimeValues   null
 48     4   boolean[] VectorHashKeyWrapper.isNull   null
 52     4   (loss due to the next object alignment)
{code}


> VectorHashKeyWrapper hash code for strings is not so good
> -
>
> Key: HIVE-16523
> URL: https://issues.apache.org/jira/browse/HIVE-16523
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16523.01.patch, HIVE-16523.02.patch, 
> HIVE-16523.patch
>
>
> Perf issues in vectorized gby on some string keys



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Patch Available  (was: Open)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Attachment: HIVE-16366-branch-2.3.patch

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Open  (was: Patch Available)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Attachment: (was: HIVE-16366-branch-2.3.patch)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
>






[jira] [Updated] (HIVE-16538) TestExecDriver fails if run after TestOperators#testScriptOperator

2017-04-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16538:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Yussuf!

> TestExecDriver fails if run after TestOperators#testScriptOperator
> --
>
> Key: HIVE-16538
> URL: https://issues.apache.org/jira/browse/HIVE-16538
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 3.0.0
> Environment: # cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"
> # uname -a
> Linux 3b9700711ca1 3.19.0-37-generic #42-Ubuntu SMP Fri Nov 20 18:22:05 UTC 
> 2015 x86_64 x86_64 x86_64 GNU/Linux
>Reporter: Yussuf Shaikh
>Assignee: Yussuf Shaikh
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16538.patch
>
>
> Failed tests:
>   TestExecDriver.testMapPlan1:498->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapPlan2:506->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan1:515->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan2:524->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan3:533->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan4:542->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan5:551->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan6:560->fileDiff:182 expected: but 
> was:





[jira] [Updated] (HIVE-16536) Various improvements in TestPerfCliDriver

2017-04-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16536:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Various improvements in TestPerfCliDriver
> -
>
> Key: HIVE-16536
> URL: https://issues.apache.org/jira/browse/HIVE-16536
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 3.0.0
>
> Attachments: HIVE-16536.2.patch, HIVE-16536.3.patch, 
> HIVE-16536.4.patch, HIVE-16536.patch
>
>
> The goal is to reduce the size of the stats file used to import stats into 
> the metastore. This will help run these tests on partitioned tables.





[jira] [Commented] (HIVE-16456) Kill spark job when InterruptedException happens or driverContext.isShutdown is true.

2017-04-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987607#comment-15987607
 ] 

Xuefu Zhang commented on HIVE-16456:


Hi [~zxu], thanks for working on this. Could you please provide an RB link for 
your patch? Thanks.

[~lirui], please review as well. Thanks.

> Kill spark job when InterruptedException happens or driverContext.isShutdown 
> is true.
> -
>
> Key: HIVE-16456
> URL: https://issues.apache.org/jira/browse/HIVE-16456
> Project: Hive
>  Issue Type: Improvement
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Minor
> Attachments: HIVE-16456.000.patch
>
>
> Kill the Spark job when an InterruptedException happens or 
> driverContext.isShutdown is true. If the InterruptedException happens in 
> RemoteSparkJobMonitor or LocalSparkJobMonitor, it is better to kill the job. 
> There is also a race condition between submitting the Spark job and 
> query/operation cancellation, so it is better to check 
> driverContext.isShutdown right after submitting the Spark job. This 
> guarantees that the job is killed no matter when shutdown is called. This is 
> similar to HIVE-15997.
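
The two cases in the description (a shutdown that races with submission, and an interrupt while monitoring) can be sketched as follows. This is only an illustration under stated assumptions: `JobHandle`, `SubmitThenCheck` and the `AtomicBoolean` flag are hypothetical stand-ins, not Hive's real classes, and the attached patch may be structured differently.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a handle to an already submitted job.
interface JobHandle {
    void cancel();
    void waitForCompletion() throws InterruptedException;
}

final class SubmitThenCheck {
    // Returns true if the job ran to completion, false if it was killed.
    static boolean run(JobHandle job, AtomicBoolean shutdown) {
        // Check right after submission: a shutdown that raced with the
        // submit call is caught here and the job is killed.
        if (shutdown.get()) {
            job.cancel();
            return false;
        }
        try {
            job.waitForCompletion();
            return true;
        } catch (InterruptedException e) {
            // Kill the job instead of leaving it running, then restore
            // the interrupt flag for the caller.
            job.cancel();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A shutdown that arrives later, while the job is being monitored, is assumed here to surface as an interrupt of the monitoring thread.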





[jira] [Updated] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-04-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-16552:
---
Attachment: HIVE-16552.patch

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.patch
>
>
> It's commonly desirable to block bad and big queries that take a lot of YARN 
> resources. One approach, similar to mapreduce.job.max.map in MapReduce, is to 
> stop a query that invokes a Spark job that contains too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, though it's 
> possible that one query can trigger multiple Spark jobs (such as in the case 
> of a map-join). Nevertheless, the proposed approach is still helpful.
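
The proposed check could look roughly like the sketch below. Only the property name hive.spark.job.max.tasks and its default of -1 come from the proposal; the class and method names are hypothetical, and treating every negative value as "no limit" is an assumption.

```java
// Hypothetical sketch of the limit check; not the code from the patch.
final class SparkTaskLimit {
    // Default proposed for hive.spark.job.max.tasks: no limit.
    static final int NO_LIMIT = -1;

    // True when a job with totalTasks tasks should be stopped.
    // Assumption: any negative maxTasks means "no limit".
    static boolean exceeds(int totalTasks, int maxTasks) {
        return maxTasks >= 0 && totalTasks > maxTasks;
    }

    public static void main(String[] args) {
        System.out.println(exceeds(5000, NO_LIMIT)); // unlimited by default
        System.out.println(exceeds(5000, 1000));     // over an admin-set limit
    }
}
```

Because the check is per job, a query that triggers several Spark jobs would be evaluated once per job rather than on its combined task count.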





[jira] [Updated] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-04-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-16552:
---
Status: Patch Available  (was: Open)

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.0.0, 1.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-16552.patch
>
>
> It's commonly desirable to block bad and big queries that take a lot of YARN 
> resources. One approach, similar to mapreduce.job.max.map in MapReduce, is to 
> stop a query that invokes a Spark job that contains too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, though it's 
> possible that one query can trigger multiple Spark jobs (such as in the case 
> of a map-join). Nevertheless, the proposed approach is still helpful.





[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Patch Available  (was: Open)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Open  (was: Patch Available)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-16545) LLAP: bug in arena size determination logic

2017-04-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16545:

Fix Version/s: 3.0.0
   2.3.0
   2.2.0

> LLAP: bug in arena size determination logic
> ---
>
> Key: HIVE-16545
> URL: https://issues.apache.org/jira/browse/HIVE-16545
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0, 2.3.0, 3.0.0
>
> Attachments: HIVE-16545.patch
>
>






[jira] [Updated] (HIVE-16545) LLAP: bug in arena size determination logic

2017-04-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16545:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed everywhere. Thanks for the review!

> LLAP: bug in arena size determination logic
> ---
>
> Key: HIVE-16545
> URL: https://issues.apache.org/jira/browse/HIVE-16545
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16545.patch
>
>






[jira] [Updated] (HIVE-16547) LLAP: may not unlock buffers in some cases

2017-04-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16547:

   Resolution: Fixed
Fix Version/s: 3.0.0
   2.3.0
   2.2.0
   Status: Resolved  (was: Patch Available)

Committed everywhere. Thanks for the review!

> LLAP: may not unlock buffers in some cases
> --
>
> Key: HIVE-16547
> URL: https://issues.apache.org/jira/browse/HIVE-16547
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0, 2.3.0, 3.0.0
>
> Attachments: HIVE-16547.patch
>
>
> Actually this is a pretty major bug, no idea how it slipped before.
> If last RG is not selected, dictionary buffers will not be unlocked because 
> of bad assumptions about what isLastRg means (last in processing vs last in 
> the stripe).
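
The two meanings of "last row group" mentioned in the description can diverge whenever the final row group of a stripe is filtered out. The sketch below only illustrates that difference; the class and method names are hypothetical, and it is not the code from the patch.

```java
// Hypothetical illustration: "last in the stripe" versus "last in processing".
final class LastRowGroup {
    // Index of the last row group in the stripe, selected or not.
    static int lastInStripe(boolean[] selected) {
        return selected.length - 1;
    }

    // Index of the last row group that will actually be processed,
    // or -1 if no row group is selected.
    static int lastInProcessing(boolean[] selected) {
        for (int i = selected.length - 1; i >= 0; i--) {
            if (selected[i]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        boolean[] selected = {true, true, false};
        System.out.println(lastInStripe(selected));
        System.out.println(lastInProcessing(selected));
    }
}
```

Cleanup that is tied to the stripe's last index would never run for the selection in `main`, because that row group is never processed.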





[jira] [Updated] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-04-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-16552:
---
Affects Version/s: 1.0.0
   2.0.0

> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> It's commonly desirable to block bad and big queries that take a lot of YARN 
> resources. One approach, similar to mapreduce.job.max.map in MapReduce, is to 
> stop a query that invokes a Spark job that contains too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, though it's 
> possible that one query can trigger multiple Spark jobs (such as in the case 
> of a map-join). Nevertheless, the proposed approach is still helpful.





[jira] [Assigned] (HIVE-16552) Limit the number of tasks a Spark job may contain

2017-04-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-16552:
--


> Limit the number of tasks a Spark job may contain
> -
>
> Key: HIVE-16552
> URL: https://issues.apache.org/jira/browse/HIVE-16552
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> It's commonly desirable to block bad and big queries that take a lot of YARN 
> resources. One approach, similar to mapreduce.job.max.map in MapReduce, is to 
> stop a query that invokes a Spark job that contains too many tasks. The 
> proposal here is to introduce hive.spark.job.max.tasks with a default value 
> of -1 (no limit), which an admin can set to block queries that trigger too 
> many Spark tasks.
> Please note that this control knob applies to a single Spark job, though it's 
> possible that one query can trigger multiple Spark jobs (such as in the case 
> of a map-join). Nevertheless, the proposed approach is still helpful.





[jira] [Commented] (HIVE-16536) Various improvements in TestPerfCliDriver

2017-04-27 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987451#comment-15987451
 ] 

Jesus Camacho Rodriguez commented on HIVE-16536:


+1

> Various improvements in TestPerfCliDriver
> -
>
> Key: HIVE-16536
> URL: https://issues.apache.org/jira/browse/HIVE-16536
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-16536.2.patch, HIVE-16536.3.patch, 
> HIVE-16536.4.patch, HIVE-16536.patch
>
>
> The goal is to reduce the size of the stats file used to import stats into 
> the metastore. This will help run these tests on partitioned tables.





[jira] [Commented] (HIVE-16542) make merge that targets acid 2.0 table fail-fast

2017-04-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987382#comment-15987382
 ] 

Eugene Koifman commented on HIVE-16542:
---

[~wzheng] could you review please

> make merge that targets acid 2.0 table fail-fast 
> -
>
> Key: HIVE-16542
> URL: https://issues.apache.org/jira/browse/HIVE-16542
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16542.01.patch, HIVE-16542.02.patch
>
>
> Until HIVE-14947 is fixed, we need to add a check so that acid 2.0 tables are 
> not written to by a MERGE statement that has both INSERT and UPDATE clauses.
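
A fail-fast check of the kind described above might be shaped like the following sketch. The class name, method name, parameters and exception type are all hypothetical; the real check presumably works on Hive's parsed statement rather than on three booleans.

```java
// Hypothetical sketch of the fail-fast rule; not the code from the patch.
final class MergeClauseCheck {
    // Rejects a MERGE that targets an acid 2.0 table and carries both an
    // INSERT and an UPDATE clause; every other combination is accepted.
    static void check(boolean targetIsAcid2, boolean hasInsert, boolean hasUpdate) {
        if (targetIsAcid2 && hasInsert && hasUpdate) {
            throw new IllegalStateException(
                "MERGE with both INSERT and UPDATE clauses is not supported "
                + "on acid 2.0 tables until HIVE-14947 is fixed");
        }
    }
}
```

Rejecting the statement before any data is written is what makes the failure fast rather than a problem discovered after a partial write.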





[jira] [Commented] (HIVE-16542) make merge that targets acid 2.0 table fail-fast

2017-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987369#comment-15987369
 ] 

Hive QA commented on HIVE-16542:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865407/HIVE-16542.02.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10632 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] 
(batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4893/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4893/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4893/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865407 - PreCommit-HIVE-Build

> make merge that targets acid 2.0 table fail-fast 
> -
>
> Key: HIVE-16542
> URL: https://issues.apache.org/jira/browse/HIVE-16542
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16542.01.patch, HIVE-16542.02.patch
>
>
> Until HIVE-14947 is fixed, we need to add a check so that acid 2.0 tables are 
> not written to by a MERGE statement that has both INSERT and UPDATE clauses.





[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects

2017-04-27 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987241#comment-15987241
 ] 

Misha Dmitriev commented on HIVE-16079:
---

Thank you [~spena]. Yes, this patch contains changes that are not backward 
compatible with JDK versions earlier than 8. So I guess it can only be 
committed to master.

> HS2: high memory pressure due to duplicate Properties objects
> -
>
> Key: HIVE-16079
> URL: https://issues.apache.org/jira/browse/HIVE-16079
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 3.0.0
>
> Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, 
> HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882. In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicated, since for each 
> partition each concurrently running query creates its own copy of Partition, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.
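
The copy-on-first-write idea described above can be illustrated with a much smaller class than the one in the patch. This is a simplified sketch under stated assumptions: `CopyOnFirstWriteSketch` and its methods are hypothetical, it exposes only one read and one write method, it does not extend java.util.Properties, and it is not thread-safe.

```java
import java.util.Properties;

// Simplified illustration of copy-on-first-write; not the class from the patch.
final class CopyOnFirstWriteSketch {
    private Properties table;      // shared canonical table until the first write
    private boolean owned = false; // true once this object holds a private copy

    CopyOnFirstWriteSketch(Properties canonical) {
        this.table = canonical;
    }

    // Reads never trigger a copy.
    String getProperty(String key) {
        return table.getProperty(key);
    }

    // The first write copies the shared table; later writes reuse the copy.
    void setProperty(String key, String value) {
        if (!owned) {
            Properties copy = new Properties();
            copy.putAll(table);
            table = copy;
            owned = true;
        }
        table.setProperty(key, value);
    }

    // Exposed only so the sharing behaviour can be observed.
    boolean sharesTableWith(CopyOnFirstWriteSketch other) {
        return table == other.table;
    }
}
```

With this shape, read-only queries over the same partition keep pointing at one canonical table, and only a wrapper that is actually mutated pays for a private copy.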





[jira] [Updated] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects

2017-04-27 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-16079:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks [~mi...@cloudera.com] for your contribution. I committed this to master. 
Correct me if I'm wrong, but I won't commit it to branch-2 due to JDK8 
exclusive changes.

> HS2: high memory pressure due to duplicate Properties objects
> -
>
> Key: HIVE-16079
> URL: https://issues.apache.org/jira/browse/HIVE-16079
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 3.0.0
>
> Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, 
> HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882. In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicated, since for each 
> partition each concurrently running query creates its own copy of Partition, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.





[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects

2017-04-27 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987225#comment-15987225
 ] 

Sergio Peña commented on HIVE-16079:


Agree. Those tests are flaky.

> HS2: high memory pressure due to duplicate Properties objects
> -
>
> Key: HIVE-16079
> URL: https://issues.apache.org/jira/browse/HIVE-16079
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, 
> HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882. In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicated, since for each 
> partition each concurrently running query creates its own copy of Partition, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.





[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Patch Available  (was: Open)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Open  (was: Patch Available)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Commented] (HIVE-16366) Hive 2.3 release planning

2017-04-27 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987200#comment-15987200
 ] 

Pengcheng Xiong commented on HIVE-16366:


The failed tests cannot be reproduced locally. 

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>






[jira] [Commented] (HIVE-16547) LLAP: may not unlock buffers in some cases

2017-04-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987154#comment-15987154
 ] 

Prasanth Jayachandran commented on HIVE-16547:
--

+1

> LLAP: may not unlock buffers in some cases
> --
>
> Key: HIVE-16547
> URL: https://issues.apache.org/jira/browse/HIVE-16547
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16547.patch
>
>
> Actually this is a pretty major bug, no idea how it slipped before.
> If last RG is not selected, dictionary buffers will not be unlocked because 
> of bad assumptions about what isLastRg means (last in processing vs last in 
> the stripe).





[jira] [Commented] (HIVE-16538) TestExecDriver fails if run after TestOperators#testScriptOperator

2017-04-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987150#comment-15987150
 ] 

Ashutosh Chauhan commented on HIVE-16538:
-

+1

> TestExecDriver fails if run after TestOperators#testScriptOperator
> --
>
> Key: HIVE-16538
> URL: https://issues.apache.org/jira/browse/HIVE-16538
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 3.0.0
> Environment: # cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.5 LTS"
> # uname -a
> Linux 3b9700711ca1 3.19.0-37-generic #42-Ubuntu SMP Fri Nov 20 18:22:05 UTC 
> 2015 x86_64 x86_64 x86_64 GNU/Linux
>Reporter: Yussuf Shaikh
>Assignee: Yussuf Shaikh
>Priority: Minor
> Attachments: HIVE-16538.patch
>
>
> Failed tests:
>   TestExecDriver.testMapPlan1:498->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapPlan2:506->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan1:515->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan2:524->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan3:533->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan4:542->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan5:551->fileDiff:182 expected: but 
> was:
>   TestExecDriver.testMapRedPlan6:560->fileDiff:182 expected: but 
> was:





[jira] [Updated] (HIVE-16537) Add missing AL files

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16537:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add missing AL files
> 
>
> Key: HIVE-16537
> URL: https://issues.apache.org/jira/browse/HIVE-16537
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.3.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-16537.01.patch, HIVE-16537.02.patch, 
> HIVE-16537.03.patch
>
>






[jira] [Updated] (HIVE-16537) Add missing AL files

2017-04-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16537:
---
Affects Version/s: 2.3.0

> Add missing AL files
> 
>
> Key: HIVE-16537
> URL: https://issues.apache.org/jira/browse/HIVE-16537
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.3.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-16537.01.patch, HIVE-16537.02.patch, 
> HIVE-16537.03.patch
>
>






[jira] [Commented] (HIVE-16536) Various improvements in TestPerfCliDriver

2017-04-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987139#comment-15987139
 ] 

Ashutosh Chauhan commented on HIVE-16536:
-

Before the patch, TAB_COL_STATS.txt was 40K and TABLE_PARAMS.txt was 15K. 
After the patch they are 6.1K and 481B respectively, roughly a 10X reduction. 
This allows us to introduce partition-level stats for partitioned tables in our 
PerfCliDriver.
[~jcamachorodriguez] Can you take a look?
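
As a rough check of the figures quoted in the comment above, the sketch below computes the per-file and combined size ratios. It assumes K means 1024 bytes (the comment does not say), and the class and method names are made up for illustration. Under that assumption the ratios come out to roughly 6.6x for TAB_COL_STATS.txt, 32x for TABLE_PARAMS.txt, and 8.4x combined, which is in line with an approximate 10X reduction.

```java
// Hypothetical helper; only the four sizes come from the comment above.
final class StatsSizeReduction {

    // Assumption: "K" in the comment means 1024 bytes.
    static final double K = 1024.0;

    // How many times smaller the "after" size is than the "before" size.
    static double ratio(double beforeBytes, double afterBytes) {
        return beforeBytes / afterBytes;
    }

    public static void main(String[] args) {
        double colStatsBefore = 40 * K;
        double colStatsAfter = 6.1 * K;
        double paramsBefore = 15 * K;
        double paramsAfter = 481; // already in bytes

        System.out.printf("TAB_COL_STATS.txt: %.1fx%n",
            ratio(colStatsBefore, colStatsAfter));
        System.out.printf("TABLE_PARAMS.txt:  %.1fx%n",
            ratio(paramsBefore, paramsAfter));
        System.out.printf("combined:          %.1fx%n",
            ratio(colStatsBefore + paramsBefore, colStatsAfter + paramsAfter));
    }
}
```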

> Various improvements in TestPerfCliDriver
> -
>
> Key: HIVE-16536
> URL: https://issues.apache.org/jira/browse/HIVE-16536
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-16536.2.patch, HIVE-16536.3.patch, 
> HIVE-16536.4.patch, HIVE-16536.patch
>
>
> The goal is to reduce the size of the stats files used to import stats into 
> the metastore. This will help to run these tests on partitioned tables.





[jira] [Commented] (HIVE-16548) LLAP: EncodedReaderImpl.addOneCompressionBuffer throws NPE

2017-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987136#comment-15987136
 ] 

Sergey Shelukhin commented on HIVE-16548:
-

Hmm... what was the order of these errors? Do you have a repro, or logs?

> LLAP: EncodedReaderImpl.addOneCompressionBuffer throws NPE
> --
>
> Key: HIVE-16548
> URL: https://issues.apache.org/jira/browse/HIVE-16548
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Rajesh Balamohan
>
> Env: Based on apr-25 apache master codebase.
> {noformat}
> Caused by: java.io.IOException: java.lang.IllegalArgumentException: Buffer 
> size too small. size = 65536 needed = 3762509
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:695)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:454)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:420)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:242)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:239)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:239)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> Caused by: java.lang.IllegalArgumentException: Buffer size too small. size = 
> 65536 needed = 3762509
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.addOneCompressionBuffer(EncodedReaderImpl.java:1223)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.prepareRangesForCompressedRead(EncodedReaderImpl.java:813)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:685)
> ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62)
> ... 17 more
> Caused by: java.io.IOException: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:695)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:454)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:420)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:242)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:239)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:239)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.addOneCompressionBuffer(EncodedReaderImpl.java:1282)
> at 
> org.apache.hadoop.hive.ql.i

[jira] [Updated] (HIVE-16542) make merge that targets acid 2.0 table fail-fast

2017-04-27 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16542:
--
Attachment: HIVE-16542.02.patch

> make merge that targets acid 2.0 table fail-fast 
> -
>
> Key: HIVE-16542
> URL: https://issues.apache.org/jira/browse/HIVE-16542
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16542.01.patch, HIVE-16542.02.patch
>
>
> Until HIVE-14947 is fixed, we need to add a check so that acid 2.0 tables are 
> not written to by a Merge stmt that has both Insert and Update clauses.
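
A minimal sketch of the kind of fail-fast check described above, not the contents of HIVE-16542.02.patch. The class name, method signature, and exception type are assumptions made for illustration; where Hive actually performs the check and what error it raises are not stated in this message.

```java
// Hypothetical illustration only; not taken from the Hive source or the patch.
final class MergeFailFastSketch {

    // Rejects the one combination the issue describes: a MERGE that targets
    // an acid 2.0 table and carries both an INSERT and an UPDATE clause.
    static void checkMerge(boolean targetIsAcid2, boolean hasInsert, boolean hasUpdate) {
        if (targetIsAcid2 && hasInsert && hasUpdate) {
            throw new UnsupportedOperationException(
                "MERGE with both INSERT and UPDATE clauses is not supported "
                    + "on acid 2.0 tables until HIVE-14947 is fixed");
        }
    }

    public static void main(String[] args) {
        checkMerge(true, true, false);  // allowed: INSERT clause only
        checkMerge(false, true, true);  // allowed: target is not acid 2.0
        try {
            checkMerge(true, true, true);
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at check time, before any data is written, is what makes the behaviour fail-fast rather than producing a partially applied merge.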





[jira] [Commented] (HIVE-16523) VectorHashKeyWrapper hash code for strings is not so good

2017-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987133#comment-15987133
 ] 

Sergey Shelukhin commented on HIVE-16523:
-

Hmm... is it a good idea to store the hash ctx in every wrapper? It seems like 
1000s of refs of overhead.

> VectorHashKeyWrapper hash code for strings is not so good
> -
>
> Key: HIVE-16523
> URL: https://issues.apache.org/jira/browse/HIVE-16523
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16523.01.patch, HIVE-16523.02.patch, 
> HIVE-16523.patch
>
>
> Perf issues in vectorized gby on some string keys



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16537) Add missing AL files

2017-04-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987125#comment-15987125
 ] 

Ashutosh Chauhan commented on HIVE-16537:
-

+1

> Add missing AL files
> 
>
> Key: HIVE-16537
> URL: https://issues.apache.org/jira/browse/HIVE-16537
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Fix For: 2.3.0
>
> Attachments: HIVE-16537.01.patch, HIVE-16537.02.patch, 
> HIVE-16537.03.patch
>
>





