[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE

2017-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16589:

Status: Patch Available  (was: In Progress)

> Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and 
> COMPLETE  for AVG, VARIANCE
> ---
>
> Key: HIVE-16589
> URL: https://issues.apache.org/jira/browse/HIVE-16589
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, 
> HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, 
> HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, 
> HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, 
> HIVE-16589.094.patch, HIVE-16589.095.patch, HIVE-16589.09.patch
>
>
> Allow Complex Types to be vectorized (since HIVE-16207: "Add support for 
> Complex Types in Fast SerDe" was committed).
> Add more classes for vectorizing AVG in preparation for fully supporting AVG 
> GroupBy.  In particular, support the PARTIAL2 and FINAL GroupBy modes that take 
> the AVG struct as input, and add the COMPLETE mode that takes the original data 
> and produces the full aggregation.
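A minimal sketch (not the patch itself) of what the four GenericUDAFEvaluator.Mode 
values mean for AVG, assuming the usual count/sum partial struct for AVG:

{code}
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.Mode;

// Rough summary of the aggregation modes discussed above; the exact partial
// struct layout for AVG is an assumption here, not quoted from the patch.
public class AvgModeSketch {
  static String describe(Mode mode) {
    switch (mode) {
      case PARTIAL1: return "original rows in -> partial AVG struct out";
      case PARTIAL2: return "partial AVG structs in -> merged partial struct out";
      case FINAL:    return "partial AVG structs in -> final average out";
      case COMPLETE: return "original rows in -> final average out";
      default:       return "unknown mode";
    }
  }
}
{code}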



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE

2017-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16589:

Status: In Progress  (was: Patch Available)

> Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and 
> COMPLETE  for AVG, VARIANCE
> ---
>
> Key: HIVE-16589
> URL: https://issues.apache.org/jira/browse/HIVE-16589
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, 
> HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, 
> HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, 
> HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, 
> HIVE-16589.094.patch, HIVE-16589.095.patch, HIVE-16589.09.patch
>
>
> Allow Complex Types to be vectorized (since HIVE-16207: "Add support for 
> Complex Types in Fast SerDe" was committed).
> Add more classes we vectorize AVG in preparation for fully supporting AVG 
> GroupBy.  In particular, the PARTIAL2 and FINAL groupby modes that take in 
> the AVG struct as input.  And, add the COMPLETE mode that takes in the 
> Original data and produces the Full Aggregation for completeness, so to speak.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16778) LLAP IO: better refcount management

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027213#comment-16027213
 ] 

Hive QA commented on HIVE-16778:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870176/HIVE-16778.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[named_column_join] 
(batchId=72)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5455/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5455/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5455/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870176 - PreCommit-HIVE-Build

> LLAP IO: better refcount management
> ---
>
> Key: HIVE-16778
> URL: https://issues.apache.org/jira/browse/HIVE-16778
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16778.patch, HIVE-16778.patch
>
>
> Looks like task cancellation can close the UGI, causing the background thread 
> to die with an exception, leaving a bunch of unreleased cache buffers.
> Overall, it's probably better to modify how refcounts are handled - if 
> there's some bug in the code we don't want to leak them. 
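A generic sketch of the defensive pattern being suggested (this is not Hive's 
actual cache API; Buffer and its acquire/release calls are stand-ins): track every 
acquired buffer and release it in a finally block, so an exception in the 
background thread cannot leave refcounts dangling.

{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: "Buffer" stands in for an LLAP cache buffer.
class RefcountSketch {
  interface Buffer { void incRef(); void decRef(); }

  static void readWithCleanup(List<Buffer> needed) {
    List<Buffer> acquired = new ArrayList<>();
    try {
      for (Buffer b : needed) {
        b.incRef();
        acquired.add(b);      // remember what we actually locked
        // ... read from b ...
      }
    } finally {
      // Runs even if the reading thread dies mid-read with an exception.
      for (Buffer b : acquired) {
        b.decRef();
      }
    }
  }
}
{code}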



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16600) Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in multi_insert cases

2017-05-26 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027207#comment-16027207
 ] 

liyunzhang_intel commented on HIVE-16600:
-

[~lirui]: actually the algorithm in HIVE-16600.9.patch is similar to yours. 
The safe way to judge an order-by-limit case is (see the sketch below): 
1. verify whether there is a limit between the current RS and the next RS/FS in the 
non multi-insert case.
2. verify whether there is a limit between the current RS and the jointOperator in 
the multi-insert case. Here jointOperator is the operator where the branches start.
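A rough sketch of that check (not the actual HIVE-16600.9.patch code), assuming 
Hive's operator-tree API from org.apache.hadoop.hive.ql.exec (Operator, 
LimitOperator, ReduceSinkOperator, FileSinkOperator):

{code}
// Walk down from the RS; decide whether enabling parallel order by is safe.
private static boolean limitFreeUpToNextSink(Operator<?> rs, boolean isMultiInsert) {
  Operator<?> op = rs.getChildOperators().isEmpty() ? null : rs.getChildOperators().get(0);
  while (op != null) {
    if (op instanceof LimitOperator) {
      return false;                        // a LIMIT below the RS: keep parallelism at 1
    }
    if (op instanceof ReduceSinkOperator || op instanceof FileSinkOperator) {
      return true;                         // reached the next RS/FS without a LIMIT
    }
    if (op.getChildOperators().size() > 1) {
      return isMultiInsert;                // the jointOperator where branches start
    }
    op = op.getChildOperators().get(0);    // single child: keep walking down
  }
  return true;
}
{code}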


> Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel 
> order by in multi_insert cases
> 
>
> Key: HIVE-16600
> URL: https://issues.apache.org/jira/browse/HIVE-16600
> Project: Hive
>  Issue Type: Sub-task
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-16600.1.patch, HIVE-16600.2.patch, 
> HIVE-16600.3.patch, HIVE-16600.4.patch, HIVE-16600.5.patch, 
> HIVE-16600.6.patch, HIVE-16600.7.patch, HIVE-16600.8.patch, 
> HIVE-16600.9.patch, mr.explain, mr.explain.log.HIVE-16600
>
>
> multi_insert_gby.case.q
> {code}
> set hive.exec.reducers.bytes.per.reducer=256;
> set hive.optimize.sampling.orderby=true;
> drop table if exists e1;
> drop table if exists e2;
> create table e1 (key string, value string);
> create table e2 (key string);
> FROM (select key, cast(key as double) as keyD, value from src order by key) a
> INSERT OVERWRITE TABLE e1
> SELECT key, value
> INSERT OVERWRITE TABLE e2
> SELECT key;
> select * from e1;
> select * from e2;
> {code} 
> the parallelism of Sort is 1 even when we enable parallel order 
> by ("hive.optimize.sampling.orderby" is set to "true").  This is not 
> reasonable because the parallelism should be calculated by 
> [Utilities.estimateReducers|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L170].
> This is because SetSparkReducerParallelism#needSetParallelism returns false 
> when the [children size of 
> RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L207]
>  is greater than 1.
> In this case, the children size of {{RS[2]}} is two.
> The logical plan of the case:
> {code}
>   TS[0]-SEL[1]-RS[2]-SEL[3]-SEL[4]-FS[5]
>                     -SEL[6]-FS[7]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16285) Servlet for dynamically configuring log levels

2017-05-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027206#comment-16027206
 ] 

Lefty Leverenz commented on HIVE-16285:
---

Does this need to be documented in the wiki?

Here's the logging section:

* [Getting Started -- Hive Logging | 
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-HiveLogging]

> Servlet for dynamically configuring log levels
> --
>
> Key: HIVE-16285
> URL: https://issues.apache.org/jira/browse/HIVE-16285
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, 
> HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, 
> HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch, HIVE-16285.7.patch
>
>
> Many long running services like HS2, LLAP etc. will benefit from having an 
> endpoint to dynamically change log levels for various loggers. This will help 
> greatly with debuggability without requiring a restart of the service. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16760) Update errata.txt for HIVE-16743

2017-05-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027198#comment-16027198
 ] 

Lefty Leverenz commented on HIVE-16760:
---

[~thejas], shouldn't all changes to the code be tracked in JIRA?  I agree that 
updating errata.txt is a minor change and doesn't need any review, but I'm 
uneasy about doing it without a JIRA ticket.

Since I tend to catch many of these errors, I'd like to give good advice on how 
to do the updates.

Should we have a discussion on the dev@hive mailing list?  Or else we could 
open a JIRA ticket to document errata.txt in the wiki and we could have the 
discussion in the comments.

> Update errata.txt for HIVE-16743
> 
>
> Key: HIVE-16760
> URL: https://issues.apache.org/jira/browse/HIVE-16760
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-16760.patch
>
>
> Refer to:
> https://issues.apache.org/jira/browse/HIVE-16743?focusedCommentId=16024139=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16024139



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16600) Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in multi_insert cases

2017-05-26 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027194#comment-16027194
 ] 

Rui Li commented on HIVE-16600:
---

[~kellyzly], one example of a non-multi-insert query having branches is dynamic 
partition pruning. I agree such cases may also be eligible for parallel order 
by, but that needs further investigation. So let's limit the scope to multi 
insert here. Does that make sense?

> Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel 
> order by in multi_insert cases
> 
>
> Key: HIVE-16600
> URL: https://issues.apache.org/jira/browse/HIVE-16600
> Project: Hive
>  Issue Type: Sub-task
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-16600.1.patch, HIVE-16600.2.patch, 
> HIVE-16600.3.patch, HIVE-16600.4.patch, HIVE-16600.5.patch, 
> HIVE-16600.6.patch, HIVE-16600.7.patch, HIVE-16600.8.patch, 
> HIVE-16600.9.patch, mr.explain, mr.explain.log.HIVE-16600
>
>
> multi_insert_gby.case.q
> {code}
> set hive.exec.reducers.bytes.per.reducer=256;
> set hive.optimize.sampling.orderby=true;
> drop table if exists e1;
> drop table if exists e2;
> create table e1 (key string, value string);
> create table e2 (key string);
> FROM (select key, cast(key as double) as keyD, value from src order by key) a
> INSERT OVERWRITE TABLE e1
> SELECT key, value
> INSERT OVERWRITE TABLE e2
> SELECT key;
> select * from e1;
> select * from e2;
> {code} 
> the parallelism of Sort is 1 even when we enable parallel order 
> by ("hive.optimize.sampling.orderby" is set to "true").  This is not 
> reasonable because the parallelism should be calculated by 
> [Utilities.estimateReducers|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L170].
> This is because SetSparkReducerParallelism#needSetParallelism returns false 
> when the [children size of 
> RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L207]
>  is greater than 1.
> In this case, the children size of {{RS[2]}} is two.
> The logical plan of the case:
> {code}
>   TS[0]-SEL[1]-RS[2]-SEL[3]-SEL[4]-FS[5]
>                     -SEL[6]-FS[7]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16743) BitSet set() is incorrectly used in TxnUtils.createValidCompactTxnList()

2017-05-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027192#comment-16027192
 ] 

Lefty Leverenz commented on HIVE-16743:
---

Thanks for updating errata.txt with HIVE-16760, [~wzheng].

> BitSet set() is incorrectly used in TxnUtils.createValidCompactTxnList()
> 
>
> Key: HIVE-16743
> URL: https://issues.apache.org/jira/browse/HIVE-16743
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 3.0.0
>
> Attachments: HIVE-16743.1.patch
>
>
> The second line is problematic
> {code}
> BitSet bitSet = new BitSet(exceptions.length);
> bitSet.set(0, bitSet.length()); // for ValidCompactorTxnList, everything 
> in exceptions are aborted
> {code}
> For example, suppose exceptions' length is 2. We declare a BitSet object with an 
> initial capacity of 2 via the first line above, but that is not the actual size of 
> the BitSet, so bitSet.length() will still return 0.
> The intention of the second line above is to set all the bits to true. This 
> was not achieved, because bitSet.set(0, bitSet.length()) is equivalent to 
> bitSet.set(0, 0).
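A minimal JDK-only illustration of the behavior described above; the last call 
shows what the fix presumably looks like (set(0, exceptions.length)), which is an 
assumption here rather than a quote from the patch:

{code}
import java.util.BitSet;

public class BitSetLengthDemo {
  public static void main(String[] args) {
    long[] exceptions = new long[2];
    BitSet bitSet = new BitSet(exceptions.length); // capacity hint only; no bits are set
    System.out.println(bitSet.length());           // 0, not 2: no bit is set yet
    bitSet.set(0, bitSet.length());                // equivalent to set(0, 0): a no-op
    System.out.println(bitSet.cardinality());      // still 0
    bitSet.set(0, exceptions.length);              // presumably the intended call
    System.out.println(bitSet.cardinality());      // 2
  }
}
{code}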



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027191#comment-16027191
 ] 

Hive QA commented on HIVE-16779:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870179/HIVE-16779.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10784 tests 
executed
*Failed tests:*
{noformat}
TestThriftCLIServiceWithBinary - did not produce a TEST-*.xml file (likely 
timed out) (batchId=222)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5454/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5454/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5454/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870179 - PreCommit-HIVE-Build

> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch
>
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise

2017-05-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027188#comment-16027188
 ] 

Lefty Leverenz commented on HIVE-11531:
---

Yes please, [~dmarkovitz] -- thanks for offering.

> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -
>
> Key: HIVE-11531
> URL: https://issues.apache.org/jira/browse/HIVE-11531
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Hui Zheng
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11531.02.patch, HIVE-11531.03.patch, 
> HIVE-11531.04.patch, HIVE-11531.05.patch, HIVE-11531.06.patch, 
> HIVE-11531.07.patch, HIVE-11531.patch, HIVE-11531.WIP.1.patch, 
> HIVE-11531.WIP.2.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the 
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be 
> paginated (which can be extremely large by itself). At present, ROW_NUMBER 
> can be used to achieve this effect, but optimizations for LIMIT such as TopN 
> in ReduceSink do not apply to ROW_NUMBER. We can add first class support for 
> "skip" to existing limit, or improve ROW_NUMBER for better performance



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027170#comment-16027170
 ] 

Hive QA commented on HIVE-16777:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870164/HIVE-16777.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5453/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5453/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5453/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870164 - PreCommit-HIVE-Build

> LLAP: Use separate tokens and UGI instances when an external client is used
> ---
>
> Key: HIVE-16777
> URL: https://issues.apache.org/jira/browse/HIVE-16777
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16777.01.patch
>
>
> Otherwise leads to errors since the token is shared, and there's different 
> nodes running Umbilical.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027155#comment-16027155
 ] 

Hive QA commented on HIVE-16589:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870161/HIVE-16589.094.patch

{color:green}SUCCESS:{color} +1 due to 27 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10787 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
(batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_12] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15] 
(batchId=61)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_12]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_distinct_gby]
 (batchId=158)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_12] 
(batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_15] 
(batchId=127)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress]
 (batchId=120)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5452/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5452/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5452/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870161 - PreCommit-HIVE-Build

> Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and 
> COMPLETE  for AVG, VARIANCE
> ---
>
> Key: HIVE-16589
> URL: https://issues.apache.org/jira/browse/HIVE-16589
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, 
> HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, 
> HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, 
> HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, 
> HIVE-16589.094.patch, HIVE-16589.09.patch
>
>
> Allow Complex Types to be vectorized (since HIVE-16207: "Add support for 
> Complex Types in Fast SerDe" was committed).
> Add more classes for vectorizing AVG in preparation for fully supporting AVG 
> GroupBy.  In particular, support the PARTIAL2 and FINAL GroupBy modes that take 
> the AVG struct as input, and add the COMPLETE mode that takes the original data 
> and produces the full aggregation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16600) Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in multi_insert cases

2017-05-26 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027139#comment-16027139
 ] 

liyunzhang_intel commented on HIVE-16600:
-

[~lirui]: the algorithm you provided seems ok except for one point: 
here "op has branches" means op.getChildOperators().size() > 1? If there is more 
than one child, parallel order by is not enabled if it is a non multi-insert case?  
I don't know whether there is a non multi-insert order by case which contains an 
operator with more than one child or not.

{code}
RS rs;
Operator op = rs;
while (op != null) {
  if (op instanceof LIM) {
    return false;          // hit a LIMIT below the RS
  }
  if ((op instanceof RS && op != rs) || op instanceof FS) {
    return true;           // reached the next RS/FS without a LIMIT
  }
  if (op has branches) {   // i.e. op.getChildOperators().size() > 1
    return isMultiInsert;
  }
  op = op.child;
}
return true;
{code}


> Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel 
> order by in multi_insert cases
> 
>
> Key: HIVE-16600
> URL: https://issues.apache.org/jira/browse/HIVE-16600
> Project: Hive
>  Issue Type: Sub-task
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-16600.1.patch, HIVE-16600.2.patch, 
> HIVE-16600.3.patch, HIVE-16600.4.patch, HIVE-16600.5.patch, 
> HIVE-16600.6.patch, HIVE-16600.7.patch, HIVE-16600.8.patch, 
> HIVE-16600.9.patch, mr.explain, mr.explain.log.HIVE-16600
>
>
> multi_insert_gby.case.q
> {code}
> set hive.exec.reducers.bytes.per.reducer=256;
> set hive.optimize.sampling.orderby=true;
> drop table if exists e1;
> drop table if exists e2;
> create table e1 (key string, value string);
> create table e2 (key string);
> FROM (select key, cast(key as double) as keyD, value from src order by key) a
> INSERT OVERWRITE TABLE e1
> SELECT key, value
> INSERT OVERWRITE TABLE e2
> SELECT key;
> select * from e1;
> select * from e2;
> {code} 
> the parallelism of Sort is 1 even when we enable parallel order 
> by ("hive.optimize.sampling.orderby" is set to "true").  This is not 
> reasonable because the parallelism should be calculated by 
> [Utilities.estimateReducers|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L170].
> This is because SetSparkReducerParallelism#needSetParallelism returns false 
> when the [children size of 
> RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L207]
>  is greater than 1.
> In this case, the children size of {{RS[2]}} is two.
> The logical plan of the case:
> {code}
>   TS[0]-SEL[1]-RS[2]-SEL[3]-SEL[4]-FS[5]
>                     -SEL[6]-FS[7]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc

2017-05-26 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16654:
---
Status: Open  (was: Patch Available)

> Optimize a combination of avg(), sum(), count(distinct) etc
> ---
>
> Key: HIVE-16654
> URL: https://issues.apache.org/jira/browse/HIVE-16654
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, 
> HIVE-16654.03.patch, HIVE-16654.04.patch
>
>
> an example rewrite for q28 of tpcds is 
> {code}
> (select LP as B1_LP, CNT as B1_CNT, CNTD as B1_CNTD
>   from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as CNTD
>         from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1
>               from store_sales
>               where ss_list_price is not null and ss_quantity between 0 and 5
>                 and (ss_list_price between 11 and 11+10
>                      or ss_coupon_amt between 460 and 460+1000
>                      or ss_wholesale_cost between 14 and 14+20)
>               group by ss_list_price) ss0) ss1) B1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc

2017-05-26 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16654:
---
Attachment: HIVE-16654.04.patch

> Optimize a combination of avg(), sum(), count(distinct) etc
> ---
>
> Key: HIVE-16654
> URL: https://issues.apache.org/jira/browse/HIVE-16654
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, 
> HIVE-16654.03.patch, HIVE-16654.04.patch
>
>
> an example rewrite for q28 of tpcds is 
> {code}
> (select LP as B1_LP, CNT as B1_CNT, CNTD as B1_CNTD
>   from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as CNTD
>         from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1
>               from store_sales
>               where ss_list_price is not null and ss_quantity between 0 and 5
>                 and (ss_list_price between 11 and 11+10
>                      or ss_coupon_amt between 460 and 460+1000
>                      or ss_wholesale_cost between 14 and 14+20)
>               group by ss_list_price) ss0) ss1) B1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc

2017-05-26 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16654:
---
Status: Patch Available  (was: Open)

> Optimize a combination of avg(), sum(), count(distinct) etc
> ---
>
> Key: HIVE-16654
> URL: https://issues.apache.org/jira/browse/HIVE-16654
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, 
> HIVE-16654.03.patch, HIVE-16654.04.patch
>
>
> an example rewrite for q28 of tpcds is 
> {code}
> (select LP as B1_LP, CNT as B1_CNT, CNTD as B1_CNTD
>   from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as CNTD
>         from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1
>               from store_sales
>               where ss_list_price is not null and ss_quantity between 0 and 5
>                 and (ss_list_price between 11 and 11+10
>                      or ss_coupon_amt between 460 and 460+1000
>                      or ss_wholesale_cost between 14 and 14+20)
>               group by ss_list_price) ss0) ss1) B1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-16779:
--
Attachment: (was: HIVE-16779.2.patch)

> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch
>
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027115#comment-16027115
 ] 

Thejas M Nair commented on HIVE-16779:
--

+1

minor nit : LOG.error("Error shutting down RawStore",e ); can you add a space 
before "e" and remove space after e :)



> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch
>
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-16779:
--
Attachment: HIVE-16779.2.patch

> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch
>
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027114#comment-16027114
 ] 

Hive QA commented on HIVE-16771:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870160/HIVE-16771.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udtf_replicate_rows] 
(batchId=79)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5451/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5451/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5451/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870160 - PreCommit-HIVE-Build

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch, HIVE-16771.02.patch
>
>
> HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. In 
> order to make HiveSchemaTool completely agnostic it should depend on 
> IMetastoreSchemaInfo implementation which is configured to get the metastore 
> schema version information from the database. It should also not assume the 
> scripts directory and hardcode it itself. It would rather ask 
> MetastoreSchemaInfo class to get the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-16779:
--
Attachment: HIVE-16779.2.patch

Sounds good. Attach HIVE-16779.2.patch.

> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch
>
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027109#comment-16027109
 ] 

Thejas M Nair commented on HIVE-16779:
--

Add it to finally block ?
{code}
finally {
  try {
    rawStore.shutdown();
  } catch (Exception e) { 
    LOG.error("Error shutting down RawStore",e );
  }
}
{code}

> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16779.1.patch
>
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16761) LLAP IO: SMB joins fail elevator

2017-05-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027103#comment-16027103
 ] 

Sergey Shelukhin commented on HIVE-16761:
-

The call is in next
{noformat}
nextValue(batch.cols[i], rowInBatch, schema.get(i), getStructCol(value, i)))
{noformat}
Schema is created from the vrbCtx
{noformat}
schema = Lists.newArrayList(vrbCtx.getRowColumnTypeInfos());
{noformat}
The ctx is the same one passed from the LlapReader... created via 
"LlapInputFormat.createFakeVrbCtx(mapWork);" for the non-vectorized map work 
case, as I assume is the case here.
I suspect the problem is that the latter is incorrect for this case.
{noformat}
static VectorizedRowBatchCtx createFakeVrbCtx(MapWork mapWork) throws HiveException {
  // This is based on Vectorizer code, minus the validation.

  // Add all non-virtual columns from the TableScan operator.
  RowSchema rowSchema = findTsOp(mapWork).getSchema();
  final List<String> colNames = new ArrayList<String>(rowSchema.getSignature().size());
  final List<TypeInfo> colTypes = new ArrayList<TypeInfo>(rowSchema.getSignature().size());
  for (ColumnInfo c : rowSchema.getSignature()) {
    String columnName = c.getInternalName();
    if (VirtualColumn.VIRTUAL_COLUMN_NAMES.contains(columnName)) continue;
    colNames.add(columnName);
    colTypes.add(TypeInfoUtils.getTypeInfoFromTypeString(c.getTypeName()));
  }

  // Determine the partition columns using the first partition descriptor.
  // Note - like vectorizer, this assumes partition columns go after data columns.
  int partitionColumnCount = 0;
  Iterator<Path> paths = mapWork.getPathToAliases().keySet().iterator();
  if (paths.hasNext()) {
    PartitionDesc partDesc = mapWork.getPathToPartitionInfo().get(paths.next());
    if (partDesc != null) {
      LinkedHashMap<String, String> partSpec = partDesc.getPartSpec();
      if (partSpec != null && partSpec.isEmpty()) {
        partitionColumnCount = partSpec.size();
      }
    }
  }
  return new VectorizedRowBatchCtx(colNames.toArray(new String[colNames.size()]),
      colTypes.toArray(new TypeInfo[colTypes.size()]), null, partitionColumnCount,
      new String[0]);
}
{noformat}
[~jdere] [~gopalv] does SMB join do something special wrt columns?
Also, I see a bug right there with partition column count. I wonder if that 
could be related...
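Reading the snippet above, the partition-column bug appears to be the 
partSpec.isEmpty() check: as written, partitionColumnCount is only assigned when 
the partition spec is empty, so it always stays 0. Presumably the intended 
condition is the negation (my reading, not confirmed in this thread):

{code}
if (partSpec != null && !partSpec.isEmpty()) {   // non-empty spec => table has partition columns
  partitionColumnCount = partSpec.size();
}
{code}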


> LLAP IO: SMB joins fail elevator 
> -
>
> Key: HIVE-16761
> URL: https://issues.apache.org/jira/browse/HIVE-16761
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>
> {code}
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:153)
>   at 
> org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:78)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>   ... 26 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>   at 
> org.apache.hadoop.hive.ql.io.BatchToRowReader.nextString(BatchToRowReader.java:334)
>   at 
> org.apache.hadoop.hive.ql.io.BatchToRowReader.nextValue(BatchToRowReader.java:602)
>   at 
> org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:149)
>   ... 28 more
> {code}
> {code}
> set hive.enforce.sortmergebucketmapjoin=false;
> set hive.optimize.bucketmapjoin=true;
> set hive.optimize.bucketmapjoin.sortedmerge=true;
> set hive.auto.convert.sortmerge.join=true;
> set hive.auto.convert.join=true;
> set hive.auto.convert.join.noconditionaltask.size=500;
> select year,quarter,count(*) from transactions_raw_orc_200 a join 
> customer_accounts_orc_200 b on a.account_id=b.account_id group by 
> year,quarter;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management

2017-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16778:

Attachment: HIVE-16778.patch

> LLAP IO: better refcount management
> ---
>
> Key: HIVE-16778
> URL: https://issues.apache.org/jira/browse/HIVE-16778
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16778.patch, HIVE-16778.patch
>
>
> Looks like task cancellation can close the UGI, causing the background thread 
> to die with an exception, leaving a bunch of unreleased cache buffers.
> Overall, it's probably better to modify how refcounts are handled - if 
> there's some bug in the code we don't want to leak them. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16778) LLAP IO: better refcount management

2017-05-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027079#comment-16027079
 ] 

Sergey Shelukhin commented on HIVE-16778:
-

[~prasanth_j] can you take a look? https://reviews.apache.org/r/59615/
This improves error handling for refcounts, and also removes the per-column RG 
arrays (that are the legacy of high-level cache).

> LLAP IO: better refcount management
> ---
>
> Key: HIVE-16778
> URL: https://issues.apache.org/jira/browse/HIVE-16778
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16778.patch
>
>
> Looks like task cancellation can close the UGI, causing the background thread 
> to die with an exception, leaving a bunch of unreleased cache buffers.
> Overall, it's probably better to modify how refcounts are handled - if 
> there's some bug in the code we don't want to leak them. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management

2017-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16778:

Status: Patch Available  (was: Open)

> LLAP IO: better refcount management
> ---
>
> Key: HIVE-16778
> URL: https://issues.apache.org/jira/browse/HIVE-16778
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16778.patch
>
>
> Looks like task cancellation can close the UGI, causing the background thread 
> to die with an exception, leaving a bunch of unreleased cache buffers.
> Overall, it's probably better to modify how refcounts are handled - if 
> there's some bug in the code we don't want to leak them. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management

2017-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16778:

Attachment: HIVE-16778.patch

> LLAP IO: better refcount management
> ---
>
> Key: HIVE-16778
> URL: https://issues.apache.org/jira/browse/HIVE-16778
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16778.patch
>
>
> Looks like task cancellation can close the UGI, causing the background thread 
> to die with an exception, leaving a bunch of unreleased cache buffers.
> Overall, it's probably better to modify how refcounts are handled - if 
> there's some bug in the code we don't want to leak them. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management

2017-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16778:

Summary: LLAP IO: better refcount management  (was: LLAP IO: better 
refcount management I)

> LLAP IO: better refcount management
> ---
>
> Key: HIVE-16778
> URL: https://issues.apache.org/jira/browse/HIVE-16778
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Looks like task cancellation can close the UGI, causing the background thread 
> to die with an exception, leaving a bunch of unreleased cache buffers.
> Overall, it's probably better to modify how refcounts are handled - if 
> there's some bug in the code we don't want to leak them. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027072#comment-16027072
 ] 

Hive QA commented on HIVE-15665:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870156/HIVE-15665.02.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10793 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_17] 
(batchId=82)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5450/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5450/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5450/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870156 - PreCommit-HIVE-Build

> LLAP: OrcFileMetadata objects in cache can impact heap usage
> 
>
> Key: HIVE-15665
> URL: https://issues.apache.org/jira/browse/HIVE-15665
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, 
> HIVE-15665.patch
>
>
> OrcFileMetadata internally has filestats, stripestats etc which are allocated 
> in heap. On large data sets, this could have an impact on the heap usage and 
> the memory usage by different executors in LLAP.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak

2017-05-26 Thread Colin Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Ma updated HIVE-16765:

Attachment: HIVE-16765-branch-2.3.patch

Upload patch for branch-2.3.

> ParquetFileReader should be closed to avoid resource leak
> -
>
> Key: HIVE-16765
> URL: https://issues.apache.org/jira/browse/HIVE-16765
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16765.001.patch, HIVE-16765-branch-2.3.patch
>
>
> ParquetFileReader should be closed to avoid resource leak



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-16779:
--
Status: Patch Available  (was: Open)

> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16779.1.patch
>
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used

2017-05-26 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027062#comment-16027062
 ] 

Siddharth Seth commented on HIVE-16777:
---

True. Multiple across hosts would need to be handled. The client needs to stop 
creating an umbilical per fragment.

> LLAP: Use separate tokens and UGI instances when an external client is used
> ---
>
> Key: HIVE-16777
> URL: https://issues.apache.org/jira/browse/HIVE-16777
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16777.01.patch
>
>
> Otherwise leads to errors since the token is shared, and there's different 
> nodes running Umbilical.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204

2017-05-26 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027051#comment-16027051
 ] 

Daniel Dai commented on HIVE-16323:
---

Unit tests pass.

> HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
> ---
>
> Key: HIVE-16323
> URL: https://issues.apache.org/jira/browse/HIVE-16323
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16323.1.patch, HIVE-16323.2.patch, PM_leak.png
>
>
> Hive.loadDynamicPartitions creates threads with a new embedded rawstore, but 
> never closes them; thus we leak one PersistenceManager per such thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used

2017-05-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027050#comment-16027050
 ] 

Sergey Shelukhin edited comment on HIVE-16777 at 5/27/17 12:08 AM:
---

+1 as a quick fix... however
1) it would be better to add field to the protocol promising single AM (or not, 
as in case with external interface); relying on external flag itself is hacky 
since, potentially, other clients could use external record reader from a 
single "AM", allowing them to utilize the UGI pool.
2) also, the comment might not be correct w.r.t. it being temporary; even if 
Spark thing uses a single port per instance, it can still have multiple spark 
tasks for the same query as far as I understand. That would make this solution 
permanent :) 


was (Author: sershe):
+1 as a quick fix... however
1) it would be better to add field to the protocol promising single AM (or not, 
as in case with external interface); relying on external flag itself is hacky.
2) also, the comment might not be correct w.r.t. it being temporary; even if 
Spark thing uses a single port per instance, it can still have multiple spark 
tasks for the same query as far as I understand. That would make this solution 
permanent :) 

> LLAP: Use separate tokens and UGI instances when an external client is used
> ---
>
> Key: HIVE-16777
> URL: https://issues.apache.org/jira/browse/HIVE-16777
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16777.01.patch
>
>
> Otherwise leads to errors since the token is shared, and there's different 
> nodes running Umbilical.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used

2017-05-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027050#comment-16027050
 ] 

Sergey Shelukhin commented on HIVE-16777:
-

+1 as a quick fix... however
1) it would be better to add field to the protocol promising single AM (or not, 
as in case with external interface); relying on external flag itself is hacky.
2) also, the comment might not be correct w.r.t. it being temporary; even if 
Spark thing uses a single port per instance, it can still have multiple spark 
tasks for the same query as far as I understand. That would make this solution 
permanent :) 

> LLAP: Use separate tokens and UGI instances when an external client is used
> ---
>
> Key: HIVE-16777
> URL: https://issues.apache.org/jira/browse/HIVE-16777
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16777.01.patch
>
>
> Otherwise leads to errors since the token is shared, and there's different 
> nodes running Umbilical.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-16779:
--
Attachment: HIVE-16779.1.patch

> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16779.1.patch
>
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned HIVE-16779:
-


> CachedStore refresher leak PersistenceManager resources
> ---
>
> Key: HIVE-16779
> URL: https://issues.apache.org/jira/browse/HIVE-16779
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>
> We see OOM when running CachedStore. We didn't shut down the rawstore in the 
> refresh thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16778) LLAP IO: better refcount management I

2017-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-16778:
---


> LLAP IO: better refcount management I
> -
>
> Key: HIVE-16778
> URL: https://issues.apache.org/jira/browse/HIVE-16778
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Looks like task cancellation can close the UGI, causing the background thread 
> to die with an exception, leaving a bunch of unreleased cache buffers.
> Overall, it's probably better to modify how refcounts are handled - if 
> there's some bug in the code we don't want to leak them. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak

2017-05-26 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027045#comment-16027045
 ] 

Pengcheng Xiong commented on HIVE-16765:


Sure. Please run ptest before you commit. Thanks!

> ParquetFileReader should be closed to avoid resource leak
> -
>
> Key: HIVE-16765
> URL: https://issues.apache.org/jira/browse/HIVE-16765
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16765.001.patch
>
>
> ParquetFileReader should be closed to avoid resource leak
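
The fix pattern, sketched with try-with-resources (the exact open() overload 
varies across Parquet versions, so treat this as illustrative):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetFileReader;

public class CloseReaderSketch {
  static void inspect(Configuration conf, Path file) throws Exception {
    // try-with-resources guarantees the reader (and its underlying input
    // streams) is closed even if an exception is thrown while reading.
    try (ParquetFileReader reader = ParquetFileReader.open(conf, file)) {
      // ... read the footer / row groups here ...
    }
  }
}
{code}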



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak

2017-05-26 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027040#comment-16027040
 ] 

Ferdinand Xu commented on HIVE-16765:
-

Hi [~pxiong], can we get this committed to branch 2.3? It's important for the 
Parquet vectorization feature. Thank you!

> ParquetFileReader should be closed to avoid resource leak
> -
>
> Key: HIVE-16765
> URL: https://issues.apache.org/jira/browse/HIVE-16765
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16765.001.patch
>
>
> ParquetFileReader should be closed to avoid resource leak



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used

2017-05-26 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-16777:
--
Target Version/s: 3.0.0
  Status: Patch Available  (was: Open)

> LLAP: Use separate tokens and UGI instances when an external client is used
> ---
>
> Key: HIVE-16777
> URL: https://issues.apache.org/jira/browse/HIVE-16777
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16777.01.patch
>
>
> Otherwise leads to errors since the token is shared, and there's different 
> nodes running Umbilical.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used

2017-05-26 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-16777:
--
Attachment: HIVE-16777.01.patch

cc [~sershe] for review.

> LLAP: Use separate tokens and UGI instances when an external client is used
> ---
>
> Key: HIVE-16777
> URL: https://issues.apache.org/jira/browse/HIVE-16777
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16777.01.patch
>
>
> Otherwise leads to errors since the token is shared, and there's different 
> nodes running Umbilical.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak

2017-05-26 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-16765:

Affects Version/s: 3.0.0

> ParquetFileReader should be closed to avoid resource leak
> -
>
> Key: HIVE-16765
> URL: https://issues.apache.org/jira/browse/HIVE-16765
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16765.001.patch
>
>
> ParquetFileReader should be closed to avoid resource leak



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027034#comment-16027034
 ] 

Hive QA commented on HIVE-16323:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870134/HIVE-16323.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5448/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5448/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5448/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870134 - PreCommit-HIVE-Build

> HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
> ---
>
> Key: HIVE-16323
> URL: https://issues.apache.org/jira/browse/HIVE-16323
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16323.1.patch, HIVE-16323.2.patch, PM_leak.png
>
>
> Hive.loadDynamicPartitions creates threads with new embedded rawstore, but 
> never close them, thus we leak PersistenceManager one per such thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used

2017-05-26 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned HIVE-16777:
-


> LLAP: Use separate tokens and UGI instances when an external client is used
> ---
>
> Key: HIVE-16777
> URL: https://issues.apache.org/jira/browse/HIVE-16777
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>
> Otherwise leads to errors since the token is shared, and there's different 
> nodes running Umbilical.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak

2017-05-26 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-16765:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed upstream. Thanks for the contribution.

> ParquetFileReader should be closed to avoid resource leak
> -
>
> Key: HIVE-16765
> URL: https://issues.apache.org/jira/browse/HIVE-16765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16765.001.patch
>
>
> ParquetFileReader should be closed to avoid resource leak



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE

2017-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16589:

Status: Patch Available  (was: In Progress)

> Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and 
> COMPLETE  for AVG, VARIANCE
> ---
>
> Key: HIVE-16589
> URL: https://issues.apache.org/jira/browse/HIVE-16589
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, 
> HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, 
> HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, 
> HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, 
> HIVE-16589.094.patch, HIVE-16589.09.patch
>
>
> Allow Complex Types to be vectorized (since HIVE-16207: "Add support for 
> Complex Types in Fast SerDe" was committed).
> Add more classes we vectorize AVG in preparation for fully supporting AVG 
> GroupBy.  In particular, the PARTIAL2 and FINAL groupby modes that take in 
> the AVG struct as input.  And, add the COMPLETE mode that takes in the 
> Original data and produces the Full Aggregation for completeness, so to speak.
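
For readers unfamiliar with the mode terminology, a short illustrative mapping 
of what each GenericUDAFEvaluator.Mode consumes and produces for AVG (a sketch, 
not code from the patch):

{code}
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.Mode;

public class AvgModeSketch {
  // What each GroupBy mode consumes and produces for AVG.
  static String describe(Mode mode) {
    switch (mode) {
      case PARTIAL1: return "original rows   -> partial struct (count, sum)";
      case PARTIAL2: return "partial structs -> merged partial struct";
      case FINAL:    return "partial structs -> final result (sum / count)";
      case COMPLETE: return "original rows   -> final result in one pass";
      default:       return "unexpected mode: " + mode;
    }
  }
}
{code}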



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16771:
---
Attachment: HIVE-16771.02.patch

Attached a second version of the patch based on my discussion with Naveen. This 
version passes the needed connection information to the method instead of a 
Connection object.

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch, HIVE-16771.02.patch
>
>
> HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. In 
> order to make HiveSchemaTool completely agnostic it should depend on 
> IMetastoreSchemaInfo implementation which is configured to get the metastore 
> schema version information from the database. It should also not assume the 
> scripts directory and hardcode it itself. It would rather ask 
> MetastoreSchemaInfo class to get the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269

2017-05-26 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HIVE-16549.
--
Resolution: Fixed

> Fix an incompatible change in PredicateLeafImpl from HIVE-15269
> ---
>
> Key: HIVE-16549
> URL: https://issues.apache.org/jira/browse/HIVE-16549
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-16549.patch
>
>
> HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a 
> configuration object. The configuration object is only used for the new 
> LiteralDelegates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE

2017-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16589:

Attachment: HIVE-16589.094.patch

> Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and 
> COMPLETE  for AVG, VARIANCE
> ---
>
> Key: HIVE-16589
> URL: https://issues.apache.org/jira/browse/HIVE-16589
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, 
> HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, 
> HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, 
> HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, 
> HIVE-16589.094.patch, HIVE-16589.09.patch
>
>
> Allow Complex Types to be vectorized (since HIVE-16207: "Add support for 
> Complex Types in Fast SerDe" was committed).
> Add more classes we vectorize AVG in preparation for fully supporting AVG 
> GroupBy.  In particular, the PARTIAL2 and FINAL groupby modes that take in 
> the AVG struct as input.  And, add the COMPLETE mode that takes in the 
> Original data and produces the Full Aggregation for completeness, so to speak.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE

2017-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16589:

Attachment: (was: HIVE-16589.094.patch)

> Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and 
> COMPLETE  for AVG, VARIANCE
> ---
>
> Key: HIVE-16589
> URL: https://issues.apache.org/jira/browse/HIVE-16589
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, 
> HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, 
> HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, 
> HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, 
> HIVE-16589.09.patch
>
>
> Allow Complex Types to be vectorized (since HIVE-16207: "Add support for 
> Complex Types in Fast SerDe" was committed).
> Add more classes we vectorize AVG in preparation for fully supporting AVG 
> GroupBy.  In particular, the PARTIAL2 and FINAL groupby modes that take in 
> the AVG struct as input.  And, add the COMPLETE mode that takes in the 
> Original data and produces the Full Aggregation for completeness, so to speak.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026993#comment-16026993
 ] 

Sergio Peña commented on HIVE-16771:


I agree with [~ngangam] that passing a Connection object to the interface 
doesn't guarantee that the implementation will close the connection. But if 
this is a helper class, shouldn't we allow that contract?

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch
>
>
> HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. In 
> order to make HiveSchemaTool completely agnostic it should depend on 
> IMetastoreSchemaInfo implementation which is configured to get the metastore 
> schema version information from the database. It should also not assume the 
> scripts directory and hardcode it itself. It would rather ask 
> MetastoreSchemaInfo class to get the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16285) Servlet for dynamically configuring log levels

2017-05-26 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16285:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Test failures are unrelated. Committed to master. Thanks Gopal for the review!

> Servlet for dynamically configuring log levels
> --
>
> Key: HIVE-16285
> URL: https://issues.apache.org/jira/browse/HIVE-16285
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, 
> HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, 
> HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch, HIVE-16285.7.patch
>
>
> Many long running services like HS2, LLAP etc. will benefit from having an 
> endpoint to dynamically change log levels for various loggers. This will help 
> greatly with debuggability without requiring a restart of the service. 
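
The servlet's own endpoint and parameters are not reproduced here; as a rough 
illustration of the underlying operation such an endpoint performs, Log4j2 
already allows a logger's level to be changed programmatically at runtime 
(assuming log4j-core is on the classpath; the logger name below is just an 
example):

{code}
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.config.Configurator;

public class LogLevelSketch {
  // Flip a single logger to DEBUG at runtime, without restarting the service.
  static void enableDebug(String loggerName) {
    Configurator.setLevel(loggerName, Level.DEBUG);
  }

  public static void main(String[] args) {
    enableDebug("org.apache.hadoop.hive.llap.io"); // example logger name
  }
}
{code}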



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16285) Servlet for dynamically configuring log levels

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026954#comment-16026954
 ] 

Hive QA commented on HIVE-16285:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870144/HIVE-16285.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5446/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5446/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5446/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870144 - PreCommit-HIVE-Build

> Servlet for dynamically configuring log levels
> --
>
> Key: HIVE-16285
> URL: https://issues.apache.org/jira/browse/HIVE-16285
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, 
> HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, 
> HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch, HIVE-16285.7.patch
>
>
> Many long running services like HS2, LLAP etc. will benefit from having an 
> endpoint to dynamically change log levels for various loggers. This will help 
> greatly with debuggability without requiring a restart of the service. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage

2017-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15665:

Attachment: HIVE-15665.02.patch

Fixing some small refcount issues

> LLAP: OrcFileMetadata objects in cache can impact heap usage
> 
>
> Key: HIVE-15665
> URL: https://issues.apache.org/jira/browse/HIVE-15665
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, 
> HIVE-15665.patch
>
>
> OrcFileMetadata internally has filestats, stripestats etc which are allocated 
> in heap. On large data sets, this could have an impact on the heap usage and 
> the memory usage by different executors in LLAP.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16285) Servlet for dynamically configuring log levels

2017-05-26 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16285:
-
Attachment: HIVE-16285.7.patch

Brought back the unrelated changes (restored the fields the earlier patch had 
removed).

> Servlet for dynamically configuring log levels
> --
>
> Key: HIVE-16285
> URL: https://issues.apache.org/jira/browse/HIVE-16285
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, 
> HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, 
> HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch, HIVE-16285.7.patch
>
>
> Many long running services like HS2, LLAP etc. will benefit from having an 
> endpoint to dynamically change log levels for various loggers. This will help 
> greatly with debuggability without requiring a restart of the service. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE

2017-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16589:

Attachment: HIVE-16589.094.patch

> Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and 
> COMPLETE  for AVG, VARIANCE
> ---
>
> Key: HIVE-16589
> URL: https://issues.apache.org/jira/browse/HIVE-16589
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, 
> HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, 
> HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, 
> HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, 
> HIVE-16589.094.patch, HIVE-16589.09.patch
>
>
> Allow Complex Types to be vectorized (since HIVE-16207: "Add support for 
> Complex Types in Fast SerDe" was committed).
> Add more classes we vectorize AVG in preparation for fully supporting AVG 
> GroupBy.  In particular, the PARTIAL2 and FINAL groupby modes that take in 
> the AVG struct as input.  And, add the COMPLETE mode that takes in the 
> Original data and produces the Full Aggregation for completeness, so to speak.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE

2017-05-26 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-16589:

Status: In Progress  (was: Patch Available)

> Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and 
> COMPLETE  for AVG, VARIANCE
> ---
>
> Key: HIVE-16589
> URL: https://issues.apache.org/jira/browse/HIVE-16589
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, 
> HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, 
> HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, 
> HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, 
> HIVE-16589.09.patch
>
>
> Allow Complex Types to be vectorized (since HIVE-16207: "Add support for 
> Complex Types in Fast SerDe" was committed).
> Add more classes we vectorize AVG in preparation for fully supporting AVG 
> GroupBy.  In particular, the PARTIAL2 and FINAL groupby modes that take in 
> the AVG struct as input.  And, add the COMPLETE mode that takes in the 
> Original data and produces the Full Aggregation for completeness, so to speak.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16285) Servlet for dynamically configuring log levels

2017-05-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026909#comment-16026909
 ] 

Gopal V commented on HIVE-16285:


The patch removes several unrelated fields from classes; that is probably a 
bad idea in a functional patch that has nothing to do with those fields 
(isOuterJoin, for instance).

Other than that, LGTM - +1.

> Servlet for dynamically configuring log levels
> --
>
> Key: HIVE-16285
> URL: https://issues.apache.org/jira/browse/HIVE-16285
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, 
> HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, 
> HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch
>
>
> Many long running services like HS2, LLAP etc. will benefit from having an 
> endpoint to dynamically change log levels for various loggers. This will help 
> greatly with debuggability without requiring a restart of the service. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2017-05-26 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-14990:
-
Attachment: HIVE-14990.19.patch

Uploading patch 19 for testing.

The reason ~4000 tests were not being run previously is a setup-phase failure 
in q_test_init.sql. The LOAD command failed with:
{code}
FAILED: SemanticException [Error 10265]: This command is not allowed on an ACID 
table default.src with a non-ACID transaction manager.
{code}

Adding the txn manager settings and trying again.
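
For reference, a minimal sketch of the two settings in question applied 
programmatically (my assumption is that these are the properties the init 
script is missing; the keys are standard Hive configuration names):

{code}
import org.apache.hadoop.hive.conf.HiveConf;

public class AcidSessionConfSketch {
  static HiveConf withDbTxnManager() {
    HiveConf conf = new HiveConf();
    // ACID tables need a transactional txn/lock manager plus concurrency
    // support; without these, LOAD against an ACID table fails as above.
    conf.set("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
    conf.set("hive.support.concurrency", "true");
    return conf;
  }
}
{code}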

> run all tests for MM tables and fix the issues that are found
> -
>
> Key: HIVE-14990
> URL: https://issues.apache.org/jira/browse/HIVE-14990
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
> Fix For: hive-14535
>
> Attachments: HIVE-14990.01.patch, HIVE-14990.02.patch, 
> HIVE-14990.03.patch, HIVE-14990.04.patch, HIVE-14990.04.patch, 
> HIVE-14990.05.patch, HIVE-14990.05.patch, HIVE-14990.06.patch, 
> HIVE-14990.06.patch, HIVE-14990.07.patch, HIVE-14990.08.patch, 
> HIVE-14990.09.patch, HIVE-14990.10.patch, HIVE-14990.10.patch, 
> HIVE-14990.10.patch, HIVE-14990.12.patch, HIVE-14990.13.patch, 
> HIVE-14990.14.patch, HIVE-14990.15.patch, HIVE-14990.16.patch, 
> HIVE-14990.17.patch, HIVE-14990.18.patch, HIVE-14990.19.patch, 
> HIVE-14990.patch
>
>
> I am running the tests with isMmTable returning true for most tables (except 
> ACID, temporary tables, views, etc.).
> Many tests will fail because of various expected issues with such an 
> approach; however we can find issues in MM tables from other failures.
> Expected failures 
> 1) All HCat tests (cannot write MM tables via the HCat writer)
> 2) Almost all merge tests (alter .. concat is not supported).
> 3) Tests that run dfs commands with specific paths (path changes).
> 4) Truncate column (not supported).
> 5) Describe formatted will have the new table fields in the output (before 
> merging MM with ACID).
> 6) Many tests w/explain extended - diff in partition "base file name" (path 
> changes).
> 7) TestTxnCommands - all the conversion tests, as they check for bucket count 
> using file lists (path changes).
> 8) HBase metastore tests cause methods are not implemented.
> 9) Some load and ExIm tests that export a table and then rely on specific 
> path for load (path changes).
> 10) Bucket map join/etc. - diffs; disabled the optimization for MM tables due 
> to how it accounts for buckets
> 11) rand - different results due to different sequence of processing.
> 12) many (not all i.e. not the ones with just one insert) tests that have 
> stats output, such as file count, for obvious reasons
> 13) materialized views, not handled by design - the test check erroneously 
> makes them "mm", no easy way to tell them apart, I don't want to plumb more 
> stuff thru just for this test
> I'm filing jiras for some test failures that are not obvious and need an 
> investigation later



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16285) Servlet for dynamically configuring log levels

2017-05-26 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16285:
-
Attachment: HIVE-16285.6.patch

Rebased.

[~gopalv] can you please take a look?

> Servlet for dynamically configuring log levels
> --
>
> Key: HIVE-16285
> URL: https://issues.apache.org/jira/browse/HIVE-16285
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, 
> HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, 
> HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch
>
>
> Many long running services like HS2, LLAP etc. will benefit from having an 
> endpoint to dynamically change log levels for various loggers. This will help 
> greatly with debuggability without requiring a restart of the service. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16343) LLAP: Publish YARN's ProcFs based memory usage to metrics for monitoring

2017-05-26 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16343:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Will create a follow-up in case of any perf issues. Committed to master. Thanks 
Sid for the reviews!

> LLAP: Publish YARN's ProcFs based memory usage to metrics for monitoring
> 
>
> Key: HIVE-16343
> URL: https://issues.apache.org/jira/browse/HIVE-16343
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-16343.1.patch, HIVE-16343.2.patch
>
>
> Publish MemInfo from ProcfsBasedProcessTree to llap metrics. This will useful 
> for monitoring and also setting up triggers via JMC. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026858#comment-16026858
 ] 

Hive QA commented on HIVE-16323:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870134/HIVE-16323.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_11] 
(batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5445/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5445/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5445/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870134 - PreCommit-HIVE-Build

> HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
> ---
>
> Key: HIVE-16323
> URL: https://issues.apache.org/jira/browse/HIVE-16323
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16323.1.patch, HIVE-16323.2.patch, PM_leak.png
>
>
> Hive.loadDynamicPartitions creates threads with new embedded rawstore, but 
> never close them, thus we leak PersistenceManager one per such thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026792#comment-16026792
 ] 

Vihang Karajgaonkar commented on HIVE-16771:


Thanks for the review [~ngangam]. My comments are inline below.
bq. 1) However, it seems a bit odd to have it take a boolean to determine if 
the query needs quotes or not. Can the implementation detect it without a whole 
lot of code duplication? The impl should be able to determine the DBTYPE just 
as easily.
In order to fetch the schema version from the DB without being handed a 
connection object, the implementation would need to create its own connection. 
The information needed, and the utility class {{HiveSchemaHelper}}, live in the 
BeeLine module and cannot be used from the Metastore module.
bq. 2) The connection and the statement are not closed. This will certainly 
cause a memory leak and potentially a connection leak to the DB.
This is just a refactor of the existing code, so the connection-leak 
possibility was pre-existing. Let me update the patch to fix it.
bq. 3) Same with the need to have an active SQL connection passed in. But then 
is there a better means to do this?
We can either pass the username, password, URL and driver to the class so it 
can make its own connection, or just pass the row information from the VERSION 
table. Let me know if you have any better ideas.
bq. 4) Ideally, the HMS schema version should only be fetched from DB just 
once. This implementation fetches it every time. Are there scenarios where the 
value would be changed after initialization that makes it necessary every time?
HiveSchemaTool calls it multiple times, e.g. before and after an upgrade, to 
validate that the schema is correct after the update. Not sure we can cache 
this information for that reason.
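
To make the trade-off concrete, a hypothetical sketch of the "pass connection 
info, not a Connection" approach, with the callee closing everything it opens 
(table and column names assume the standard metastore VERSION table; this is 
illustrative, not the patch):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SchemaVersionSketch {
  // Opens a short-lived connection from plain connection info and always closes it.
  static String fetchSchemaVersion(String jdbcUrl, String user, String password,
                                   boolean quoteIdentifiers) throws Exception {
    String table = quoteIdentifiers ? "\"VERSION\"" : "VERSION";
    String sql = "SELECT SCHEMA_VERSION FROM " + table;
    try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(sql)) {
      return rs.next() ? rs.getString(1) : null;
    }
  }
}
{code}

Because the connection is created and closed inside the helper, the "who closes 
it" question from the review goes away.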

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch
>
>
> HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. In 
> order to make HiveSchemaTool completely agnostic it should depend on 
> IMetastoreSchemaInfo implementation which is configured to get the metastore 
> schema version information from the database. It should also not assume the 
> scripts directory and hardcode it itself. It would rather ask 
> MetastoreSchemaInfo class to get the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026793#comment-16026793
 ] 

Vihang Karajgaonkar commented on HIVE-16771:


[~spena] [~stakiar] Can you please take a look as well?

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch
>
>
> HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. In 
> order to make HiveSchemaTool completely agnostic it should depend on 
> IMetastoreSchemaInfo implementation which is configured to get the metastore 
> schema version information from the database. It should also not assume the 
> scripts directory and hardcode it itself. It would rather ask 
> MetastoreSchemaInfo class to get the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204

2017-05-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-16323:
--
Attachment: HIVE-16323.2.patch

Not sure why ptest is testing PM_leak.png. Reattaching the patch with a later 
timestamp.

> HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
> ---
>
> Key: HIVE-16323
> URL: https://issues.apache.org/jira/browse/HIVE-16323
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16323.1.patch, HIVE-16323.2.patch, PM_leak.png
>
>
> Hive.loadDynamicPartitions creates threads with new embedded rawstore, but 
> never close them, thus we leak PersistenceManager one per such thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16644) Hook Change Manager to Insert Overwrite

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026761#comment-16026761
 ] 

Hive QA commented on HIVE-16644:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870124/HIVE-16644.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10790 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5444/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5444/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5444/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870124 - PreCommit-HIVE-Build

> Hook Change Manager to Insert Overwrite
> ---
>
> Key: HIVE-16644
> URL: https://issues.apache.org/jira/browse/HIVE-16644
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16644.01.patch
>
>
> For insert overwrite Hive.replaceFiles is called to replace contents of 
> existing partitions/table. This should trigger move of old files into $CMROOT.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16745) Syntax error in 041-HIVE-16556.mysql.sql script

2017-05-26 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-16745:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Fix has been committed to master for 3.0.0.

> Syntax error in 041-HIVE-16556.mysql.sql script
> ---
>
> Key: HIVE-16745
> URL: https://issues.apache.org/jira/browse/HIVE-16745
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16745.01.patch
>
>
> 041-HIVE-16556.mysql.sql has a syntax error which was introduced with 
> HIVE-16711



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16767) Update people website with recent changes

2017-05-26 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026696#comment-16026696
 ] 

Yongzhi Chen commented on HIVE-16767:
-

Mine is right. Thanks

> Update people website with recent changes
> -
>
> Key: HIVE-16767
> URL: https://issues.apache.org/jira/browse/HIVE-16767
> Project: Hive
>  Issue Type: Task
>  Components: Documentation
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-16767.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16644) Hook Change Manager to Insert Overwrite

2017-05-26 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16644:

Status: Patch Available  (was: In Progress)

> Hook Change Manager to Insert Overwrite
> ---
>
> Key: HIVE-16644
> URL: https://issues.apache.org/jira/browse/HIVE-16644
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16644.01.patch
>
>
> For insert overwrite Hive.replaceFiles is called to replace contents of 
> existing partitions/table. This should trigger move of old files into $CMROOT.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16644) Hook Change Manager to Insert Overwrite

2017-05-26 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16644:

Component/s: Hive

> Hook Change Manager to Insert Overwrite
> ---
>
> Key: HIVE-16644
> URL: https://issues.apache.org/jira/browse/HIVE-16644
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16644.01.patch
>
>
> For insert overwrite Hive.replaceFiles is called to replace contents of 
> existing partitions/table. This should trigger move of old files into $CMROOT.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16644) Hook Change Manager to Insert Overwrite

2017-05-26 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16644:

Description: For insert overwrite Hive.replaceFiles is called to replace 
contents of existing partitions/table. This should trigger move of old files 
into $CMROOT.  (was: For insert overwrite Hive.moveFile is called to replace 
contents of existing partitions. This should trigger move of old files into 
$CMROOT.)

> Hook Change Manager to Insert Overwrite
> ---
>
> Key: HIVE-16644
> URL: https://issues.apache.org/jira/browse/HIVE-16644
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16644.01.patch
>
>
> For insert overwrite Hive.replaceFiles is called to replace contents of 
> existing partitions/table. This should trigger move of old files into $CMROOT.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16644) Hook Change Manager to Insert Overwrite

2017-05-26 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026678#comment-16026678
 ] 

Sankar Hariappan edited comment on HIVE-16644 at 5/26/17 6:59 PM:
--

Added 01.patch with the changes below.
- Added a metastore API, cm_recycle, to move files to the CM path before 
trashing them.
- The destination directory is recycled in Hive.replaceFiles before trashing it.
- For the insert overwrite case, the oldPath and destf inputs to 
Hive.replaceFiles will be the same, and the trashing of the files in this 
directory happens in Hive.replaceFiles. So no handling is required in 
Hive.moveFile.

Request [~anishek] to review the patch!
cc [~thejas],[~sushanth]


was (Author: sankarh):
Added 01.patch with below changes.
- Added a metastore api cm_recycle to move files to CM path before trashing it.
- The destination directory is recycled in Hive.replaceFIles before trashing it.
- For insert overwrite case, oldPath and destf inputs to Hive.replaceFiles will 
be same and the trashing of the files in this directory happens in 
Hive.replaceFiles. So, no handling required in Hive.moveFile.

> Hook Change Manager to Insert Overwrite
> ---
>
> Key: HIVE-16644
> URL: https://issues.apache.org/jira/browse/HIVE-16644
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16644.01.patch
>
>
> For insert overwrite Hive.moveFile is called to replace contents of existing 
> partitions. This should trigger move of old files into $CMROOT.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16644) Hook Change Manager to Insert Overwrite

2017-05-26 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16644:

Attachment: HIVE-16644.01.patch

Added 01.patch with the changes below.
- Added a metastore API, cm_recycle, to move files to the CM path before 
trashing them (a rough sketch of the flow follows this list).
- The destination directory is recycled in Hive.replaceFiles before trashing it.
- For the insert overwrite case, the oldPath and destf inputs to 
Hive.replaceFiles will be the same, and the trashing of the files in this 
directory happens in Hive.replaceFiles. So no handling is required in 
Hive.moveFile.
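
As a rough illustration of that flow (hypothetical ChangeManager interface; 
this is not the actual cm_recycle API or the Hive.replaceFiles code):

{code}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplaceFilesSketch {
  // Hypothetical hook: copy/move the replaced files under $CMROOT for replication.
  interface ChangeManager {
    void recycle(Path oldPath) throws Exception;
  }

  static void replaceContents(FileSystem fs, ChangeManager cm, Path oldPath,
      Path newPath) throws Exception {
    if (fs.exists(oldPath)) {
      cm.recycle(oldPath);      // preserve the old files first...
      fs.delete(oldPath, true); // ...then trash the directory being overwritten
    }
    fs.rename(newPath, oldPath); // move the newly written data into place
  }
}
{code}

The point is simply that recycle() runs before the delete, so replication can 
still find the replaced files under $CMROOT.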

> Hook Change Manager to Insert Overwrite
> ---
>
> Key: HIVE-16644
> URL: https://issues.apache.org/jira/browse/HIVE-16644
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Attachments: HIVE-16644.01.patch
>
>
> For insert overwrite Hive.moveFile is called to replace contents of existing 
> partitions. This should trigger move of old files into $CMROOT.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16764) Support numeric as same as decimal

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026672#comment-16026672
 ] 

Hive QA commented on HIVE-16764:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870103/HIVE-16764.02.patch

{color:green}SUCCESS:{color} +1 due to 92 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10801 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5443/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5443/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5443/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870103 - PreCommit-HIVE-Build

> Support numeric as same as decimal
> --
>
> Key: HIVE-16764
> URL: https://issues.apache.org/jira/browse/HIVE-16764
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch
>
>
> for example numeric(12,2) -> decimal(12,2) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16644) Hook Change Manager to Insert Overwrite

2017-05-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026668#comment-16026668
 ] 

ASF GitHub Bot commented on HIVE-16644:
---

GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/190

HIVE-16644: Hook Change Manager to Insert Overwrite

Change management for insert overwrite to a table or a partition.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-16644

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/190.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #190


commit 0e7906c2662cceaf490e11698a77fa1cb8fd90cb
Author: Sankar Hariappan 
Date:   2017-05-24T13:31:17Z

HIVE-16644: Hook Change Manager to Insert Overwrite




> Hook Change Manager to Insert Overwrite
> ---
>
> Key: HIVE-16644
> URL: https://issues.apache.org/jira/browse/HIVE-16644
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
>
> For insert overwrite Hive.moveFile is called to replace contents of existing 
> partitions. This should trigger move of old files into $CMROOT.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16770) Concatinate is not working on Table/Partial Partition level

2017-05-26 Thread Kallam Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kallam Reddy updated HIVE-16770:

Description: 
Not able to CONCATENATE at Table/Partial partition levels. I have table test 
partitioned on year, month and date. If I try to concatenate by providing 
corresponding year, month and date of partition it is working fine, but when I 
want to concatenate ORC files for all the sub partition corresponding to year 
and month, it is giving exception.

hive> ALTER TABLE test PARTITION (year = '2017', month = '05') CONCATENATE;
FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
Partition not found {year=2017, month=05}

hive> ALTER TABLE test CONCATENATE;
FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
source table test is partitioned but no partition desc found.

I am expecting this to trigger concatenate in all available sub partitions.

  was:
Not able to CONCATENATE at Table/Partial partition levels. I have table test 
partitioned on year, month and date. If I try to concatenate by providing 
corresponding year, month and date of partition it is working fine, but when I 
want to concatenate ORC files for all the sub partition corresponding to year 
and month, it is giving exception.

hive> ALTER TABLE test PARTITION (year = '2017', month = '05') CONCATENATE;
FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
Partition not found {year=2017, month=05}

hive> ALTER TABLE otda_es_orc_p1 CONCATENATE;
FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
source table otdadb.otda_es_orc_p1 is partitioned but no partition desc found.

I am expecting this to trigger concatenate in all available sub partitions.


> Concatinate is not working on Table/Partial Partition level 
> 
>
> Key: HIVE-16770
> URL: https://issues.apache.org/jira/browse/HIVE-16770
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.1
> Environment: centOS7
>Reporter: Kallam Reddy
>
> Not able to CONCATENATE at Table/Partial partition levels. I have table test 
> partitioned on year, month and date. If I try to concatenate by providing 
> corresponding year, month and date of partition it is working fine, but when 
> I want to concatenate ORC files for all the sub partition corresponding to 
> year and month, it is giving exception.
> hive> ALTER TABLE test PARTITION (year = '2017', month = '05') CONCATENATE;
> FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
> Partition not found {year=2017, month=05}
> hive> ALTER TABLE test CONCATENATE;
> FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: 
> source table test is partitioned but no partition desc found.
> I am expecting this to trigger concatenate in all available sub partitions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16776) Strange cast behavior for table backed by druid

2017-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-16776:
--


> Strange cast behavior for table backed by druid
> ---
>
> Key: HIVE-16776
> URL: https://issues.apache.org/jira/browse/HIVE-16776
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: slim bouguerra
>Assignee: Jesus Camacho Rodriguez
>
> The following query 
> {code} 
> explain select SUBSTRING(`Calcs`.`str0`,CAST(`Calcs`.`int2` AS int), 3) from 
> `druid_tableau`.`calcs` `Calcs`;
> OK
> Plan not optimized by CBO. 
> {code}
> fails the cbo with the following exception 
> {code} org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Wrong 
> arguments '3': No matching method for class 
> org.apache.hadoop.hive.ql.udf.UDFSubstr with (string, bigint, int). Po
> ssible choices: _FUNC_(binary, int)  _FUNC_(binary, int, int)  _FUNC_(string, 
> int)  _FUNC_(string, int, int)
> at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1355)
>  ~[hive-exec-2.1.0.2.6.0.2-SNAPSHOT.jar:2.1.0.2.6.0.2-SNA
> PSHOT]{code}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16706) Bootstrap REPL DUMP shouldn't fail when a partition is dropped/renamed when dump in progress.

2017-05-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026635#comment-16026635
 ] 

ASF GitHub Bot commented on HIVE-16706:
---

Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/187


> Bootstrap REPL DUMP shouldn't fail when a partition is dropped/renamed when 
> dump in progress.
> -
>
> Key: HIVE-16706
> URL: https://issues.apache.org/jira/browse/HIVE-16706
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-16706.01.patch
>
>
> Currently, bootstrap REPL DUMP gets the partitions in a batch and then 
> iterates through them. If any partition is dropped/renamed during iteration, 
> it may lead to a failure/exception. In this case, the partition should be 
> skipped from the dump, REPL DUMP itself should not fail, and the subsequent 
> incremental dump should bring the table back to a consistent state.
> This bug is related to HIVE-16684.
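
For context, a hedged HiveQL sketch of the bootstrap-then-incremental flow this issue concerns (the database name and event id are hypothetical):

{code}
-- Bootstrap dump: walks every table/partition; a partition dropped or renamed
-- mid-walk should be skipped instead of failing the whole dump.
REPL DUMP src_db;

-- Later incremental dump (from the last replicated event id) is expected to
-- bring the table back to a consistent state on the target.
REPL DUMP src_db FROM 1000;
{code}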



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise

2017-05-26 Thread Dudu Markovitz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026629#comment-16026629
 ] 

Dudu Markovitz commented on HIVE-11531:
---

[~leftylev], it seems the offset feature is not documented. Would you like me 
to document it?

> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -
>
> Key: HIVE-11531
> URL: https://issues.apache.org/jira/browse/HIVE-11531
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Hui Zheng
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11531.02.patch, HIVE-11531.03.patch, 
> HIVE-11531.04.patch, HIVE-11531.05.patch, HIVE-11531.06.patch, 
> HIVE-11531.07.patch, HIVE-11531.patch, HIVE-11531.WIP.1.patch, 
> HIVE-11531.WIP.2.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the 
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be 
> paginated (which can be extremely large by itself). At present, ROW_NUMBER 
> can be used to achieve this effect, but optimizations for LIMIT such as TopN 
> in ReduceSink do not apply to ROW_NUMBER. We can add first-class support for 
> "skip" (an offset) to the existing LIMIT, or improve ROW_NUMBER for better 
> performance.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269

2017-05-26 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026581#comment-16026581
 ] 

Prasanth Jayachandran commented on HIVE-16549:
--

+1 on the patch

> Fix an incompatible change in PredicateLeafImpl from HIVE-15269
> ---
>
> Key: HIVE-16549
> URL: https://issues.apache.org/jira/browse/HIVE-16549
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-16549.patch
>
>
> HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a 
> configuration object. The configuration object is only used for the new 
> LiteralDelegates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269

2017-05-26 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reopened HIVE-16549:
--

> Fix an incompatible change in PredicateLeafImpl from HIVE-15269
> ---
>
> Key: HIVE-16549
> URL: https://issues.apache.org/jira/browse/HIVE-16549
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-16549.patch
>
>
> HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a 
> configuration object. The configuration object is only used for the new 
> LiteralDelegates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16764) Support numeric as same as decimal

2017-05-26 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026558#comment-16026558
 ] 

Pengcheng Xiong commented on HIVE-16764:


create_merge_compressed and subquery_scalar also appear in previous failed test 
runs. I cannot reproduce columnstats_part_coltype. query74 should be removed. 
query14 is flaky. [~ashutoshc], could you please review? Thanks.

> Support numeric as same as decimal
> --
>
> Key: HIVE-16764
> URL: https://issues.apache.org/jira/browse/HIVE-16764
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch
>
>
> for example numeric(12,2) -> decimal(12,2) 
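
A hedged HiveQL sketch of the intended behavior once this lands (table and column names are hypothetical): NUMERIC(p,s) is accepted as a synonym for DECIMAL(p,s).

{code}
-- Both declarations should produce the same decimal(12,2) column type.
CREATE TABLE payments_a (amount DECIMAL(12,2));
CREATE TABLE payments_b (amount NUMERIC(12,2));  -- parsed as decimal(12,2)
{code}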



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16764) Support numeric as same as decimal

2017-05-26 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16764:
---
Status: Open  (was: Patch Available)

> Support numeric as same as decimal
> --
>
> Key: HIVE-16764
> URL: https://issues.apache.org/jira/browse/HIVE-16764
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch
>
>
> for example numeric(12,2) -> decimal(12,2) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16764) Support numeric as same as decimal

2017-05-26 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16764:
---
Attachment: HIVE-16764.02.patch

> Support numeric as same as decimal
> --
>
> Key: HIVE-16764
> URL: https://issues.apache.org/jira/browse/HIVE-16764
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch
>
>
> for example numeric(12,2) -> decimal(12,2) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16764) Support numeric as same as decimal

2017-05-26 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16764:
---
Status: Patch Available  (was: Open)

> Support numeric as same as decimal
> --
>
> Key: HIVE-16764
> URL: https://issues.apache.org/jira/browse/HIVE-16764
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch
>
>
> for example numeric(12,2) -> decimal(12,2) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269

2017-05-26 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-16549:
-
Attachment: HIVE-16549.patch

Here's the patch.

> Fix an incompatible change in PredicateLeafImpl from HIVE-15269
> ---
>
> Key: HIVE-16549
> URL: https://issues.apache.org/jira/browse/HIVE-16549
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-16549.patch
>
>
> HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a 
> configuration object. The configuration object is only used for the new 
> LiteralDelegates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204

2017-05-26 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026537#comment-16026537
 ] 

Thejas M Nair edited comment on HIVE-16323 at 5/26/17 5:25 PM:
---

Looks like this needs to be rebased.



was (Author: thejas):
+1 pending tests


> HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
> ---
>
> Key: HIVE-16323
> URL: https://issues.apache.org/jira/browse/HIVE-16323
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16323.1.patch, PM_leak.png
>
>
> Hive.loadDynamicPartitions creates threads, each with a new embedded RawStore, 
> but never closes them, so we leak one PersistenceManager per such thread.
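
For context, a hedged HiveQL sketch of the kind of statement that exercises Hive.loadDynamicPartitions on the HS2 side (table names are hypothetical):

{code}
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Each dynamic-partition load goes through Hive.loadDynamicPartitions, whose
-- worker threads open an embedded RawStore that, per this report, is never closed.
INSERT OVERWRITE TABLE sales PARTITION (dt)
SELECT id, amount, dt FROM staging_sales;
{code}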



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204

2017-05-26 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026537#comment-16026537
 ] 

Thejas M Nair commented on HIVE-16323:
--

+1 pending tests


> HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
> ---
>
> Key: HIVE-16323
> URL: https://issues.apache.org/jira/browse/HIVE-16323
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-16323.1.patch, PM_leak.png
>
>
> Hive.loadDynamicPartitions creates threads, each with a new embedded RawStore, 
> but never closes them, so we leak one PersistenceManager per such thread.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026522#comment-16026522
 ] 

Hive QA commented on HIVE-16771:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870089/HIVE-16771.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[load_dyn_part5]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5442/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5442/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5442/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870089 - PreCommit-HIVE-Build

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch
>
>
> HIVE-16723 gives the ability to plug in a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. To make 
> HiveSchemaTool completely agnostic, it should depend on whichever 
> IMetastoreSchemaInfo implementation is configured to get the metastore schema 
> version information from the database. It should also not assume and hardcode 
> the scripts directory itself; it should instead ask the MetastoreSchemaInfo 
> class for the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269

2017-05-26 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026506#comment-16026506
 ] 

Gunther Hagleitner commented on HIVE-16549:
---

HIVE-14007 is a massive patch. This patch here doesn't seem to have any reviews 
or test runs or anything. The patch isn't even attached to the ticket. 
[~owen.omalley] why do the regular rules not apply to you?

> Fix an incompatible change in PredicateLeafImpl from HIVE-15269
> ---
>
> Key: HIVE-16549
> URL: https://issues.apache.org/jira/browse/HIVE-16549
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
>
> HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a 
> configuration object. The configuration object is only used for the new 
> LiteralDelegates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026487#comment-16026487
 ] 

Naveen Gangam commented on HIVE-16771:
--

I think it makes sense to have the getMetaStoreSchemaVersion() API on the 
{{IMetaStoreSchemaInfo}}. 
1) However, it seems a bit odd to have it take a boolean to determine if the 
query needs quotes or not. Can the implementation detect it without a whole lot 
of code duplication? The impl should be able to determine the DBTYPE just as 
easily.
2) The connection and the statement are not closed. This will certainly 
cause a memory leak and potentially a connection leak to the DB.
3) Same with the need to have an active SQL connection passed in. But then is 
there a better means to do this?
4) Ideally, the HMS schema version should only be fetched from the DB once. 
This implementation fetches it every time. Are there scenarios where the value 
would change after initialization that make fetching it every time necessary?
Thanks

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch
>
>
> HIVE-16723 gives the ability to plug in a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. To make 
> HiveSchemaTool completely agnostic, it should depend on whichever 
> IMetastoreSchemaInfo implementation is configured to get the metastore schema 
> version information from the database. It should also not assume and hardcode 
> the scripts directory itself; it should instead ask the MetastoreSchemaInfo 
> class for the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16771:
---
Status: Patch Available  (was: Open)

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch
>
>
> HIVE-16723 gives the ability to plug in a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. To make 
> HiveSchemaTool completely agnostic, it should depend on whichever 
> IMetastoreSchemaInfo implementation is configured to get the metastore schema 
> version information from the database. It should also not assume and hardcode 
> the scripts directory itself; it should instead ask the MetastoreSchemaInfo 
> class for the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16771:
---
Attachment: HIVE-16771.01.patch

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch
>
>
> HIVE-16723 gives the ability to plug in a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. To make 
> HiveSchemaTool completely agnostic, it should depend on whichever 
> IMetastoreSchemaInfo implementation is configured to get the metastore schema 
> version information from the database. It should also not assume and hardcode 
> the scripts directory itself; it should instead ask the MetastoreSchemaInfo 
> class for the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026451#comment-16026451
 ] 

Vihang Karajgaonkar commented on HIVE-16771:


Hi [~ngangam] Can you please review?

> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-16771.01.patch
>
>
> HIVE-16723 gives the ability to plug in a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. To make 
> HiveSchemaTool completely agnostic, it should depend on whichever 
> IMetastoreSchemaInfo implementation is configured to get the metastore schema 
> version information from the database. It should also not assume and hardcode 
> the scripts directory itself; it should instead ask the MetastoreSchemaInfo 
> class for the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16769) Possible hive service startup failure due to the existing file /tmp/stderr

2017-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026448#comment-16026448
 ] 

Hive QA commented on HIVE-16769:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870079/HIVE-16769.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5441/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5441/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5441/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870079 - PreCommit-HIVE-Build

> Possible hive service startup failure due to the existing file /tmp/stderr
> --
>
> Key: HIVE-16769
> URL: https://issues.apache.org/jira/browse/HIVE-16769
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16769.1.patch
>
>
> HIVE-12497 redirects the ignored errors from the hadoop version, hbase mapredcp, 
> and hadoop jars commands to /tmp/$USER/stderr. 
> In some cases $USER is not set, so the file becomes /tmp/stderr. If such a file 
> already exists with different permissions, it will cause the service startup to 
> fail.
> I just tried the script without redirecting to the stderr file, and I no longer 
> see the error {{"ERROR StatusLogger No log4j2 configuration file found. 
> Using default configuration: logging only errors to the console."}}.
> I think we can remove this redirect to avoid a possible startup failure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database

2017-05-26 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-16771:
--


> Schematool should use MetastoreSchemaInfo to get the metastore schema version 
> from database
> ---
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> HIVE-16723 gives the ability to plug in a custom MetastoreSchemaInfo 
> implementation to manage schema upgrades and initialization if needed. To make 
> HiveSchemaTool completely agnostic, it should depend on whichever 
> IMetastoreSchemaInfo implementation is configured to get the metastore schema 
> version information from the database. It should also not assume and hardcode 
> the scripts directory itself; it should instead ask the MetastoreSchemaInfo 
> class for the metastore scripts directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269

2017-05-26 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HIVE-16549.
--
Resolution: Fixed

> Fix an incompatible change in PredicateLeafImpl from HIVE-15269
> ---
>
> Key: HIVE-16549
> URL: https://issues.apache.org/jira/browse/HIVE-16549
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
>
> HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a 
> configuration object. The configuration object is only used for the new 
> LiteralDelegates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269

2017-05-26 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-16549:
-
Fix Version/s: 2.2.0
  Component/s: storage-api

This has already been fixed in HIVE-14007 for master and branch-2.3. I need a 
modified fix for branch-2.2.

> Fix an incompatible change in PredicateLeafImpl from HIVE-15269
> ---
>
> Key: HIVE-16549
> URL: https://issues.apache.org/jira/browse/HIVE-16549
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
>
> HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a 
> configuration object. The configuration object is only used for the new 
> LiteralDelegates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16767) Update people website with recent changes

2017-05-26 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026356#comment-16026356
 ] 

Sergio Peña commented on HIVE-16767:


Mine looks good.
+1

> Update people website with recent changes
> -
>
> Key: HIVE-16767
> URL: https://issues.apache.org/jira/browse/HIVE-16767
> Project: Hive
>  Issue Type: Task
>  Components: Documentation
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-16767.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

