[jira] [Commented] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679384#comment-16679384
 ] 

Hive QA commented on HIVE-20682:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947256/HIVE-20682.06.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15526 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=85)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14804/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14804/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14804/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947256 - PreCommit-HIVE-Build

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in the *HiveSessionImpl* 
> class when we open a new session for a client connection, and by default all 
> queries from this connection share the same sessionHive object. 
> If the master thread executes a *synchronous* query, it closes the 
> sessionHive object (referred to via the thread-local hiveDb) if 
> {{Hive.isCompatible}} returns false and sets a new Hive object in the thread-local 
> hiveDb, but doesn't change the sessionHive object in the session. In contrast, 
> *asynchronous* query execution via async threads never closes the sessionHive 
> object; it just creates a new one if needed and sets it as the thread-local 
> hiveDb.
> So, the problem can happen when an *asynchronous* query being 
> executed by async threads refers to the sessionHive object and the master thread 
> receives a *synchronous* query that closes the same sessionHive object. 
> Also, each query execution overwrites the thread-local hiveDb object with the 
> sessionHive object, which potentially leaks a metastore connection if the 
> previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object can be shared by multiple threads, so it 
> shouldn't be closed by any query execution thread when that thread 
> re-creates the Hive object due to changes in the Hive configuration. But the Hive 
> objects created by query execution threads should be closed when the thread 
> exits.
> So, it is proposed to add an *isAllowClose* flag (default: *true*) to the Hive 
> object, which would be set to *false* for *sessionHive*; the object would then be 
> forcefully closed only when the session is closed or released.
> Also, when we replace the *sessionHive* object with a new one due to changes in 
> *sessionConf*, the old one should be closed once no async thread is referring 
> to it. This can be done using the "*finalize*" method of the Hive object, where we can 
> close the HMS connection when the Hive object is garbage collected.
> cc [~pvary]
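The isAllowClose idea above can be sketched roughly as follows. This is a minimal illustration only; the class and method names are hypothetical stand-ins, not the actual org.apache.hadoop.hive.ql.metadata.Hive API.

```java
// Sketch of the proposed isAllowClose guard. A shared sessionHive would set
// allowClose=false so that per-query threads calling close() cannot tear it
// down; only the session teardown path calls forceClose().
public class HiveDbSketch {
    private boolean allowClose = true;   // default: any owner may close
    private boolean closed = false;

    /** sessionHive would call this so query execution threads cannot close it. */
    public void setAllowClose(boolean allowClose) {
        this.allowClose = allowClose;
    }

    /** Regular close path used by query execution threads; a no-op for sessionHive. */
    public void close() {
        if (allowClose) {
            releaseResources();
        }
    }

    /** Called only when the session itself is closed or released. */
    public void forceClose() {
        releaseResources();
    }

    private void releaseResources() {
        closed = true;   // stands in for closing the HMS connection etc.
    }

    public boolean isClosed() {
        return closed;
    }
}
```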



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-07 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679101#comment-16679101
 ] 

Jesus Camacho Rodriguez commented on HIVE-20804:


+1 (pending tests)

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch, HIVE-20804.5.patch, 
> HIVE-20804.6.patch, HIVE-20804.7.patch
>
>
> Continuation of HIVE-17043





[jira] [Updated] (HIVE-20845) Fix TestJdbcWithDBTokenStoreNoDoAs flakiness

2018-11-07 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20845:
--
Attachment: HIVE-20845.patch

> Fix TestJdbcWithDBTokenStoreNoDoAs flakiness
> 
>
> Key: HIVE-20845
> URL: https://issues.apache.org/jira/browse/HIVE-20845
> Project: Hive
>  Issue Type: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20845.patch
>
>
> Previously we did a dirty fix for TestJdbcWithDBTokenStoreNoDoAs and 
> TestJdbcWithDBTokenStore.
> It turned out that the issue is that we do not wait long enough for HS2 to come up.
> This needs to be fixed in MiniHS2.waitForStartup()
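A "wait long enough" fix typically means polling against a deadline rather than sleeping a fixed short time. The sketch below only illustrates that pattern; the real MiniHS2.waitForStartup() probes the server differently, and the names here are illustrative.

```java
import java.util.function.BooleanSupplier;

// Illustrative deadline-based startup wait: retry a readiness probe until it
// succeeds or the timeout elapses, instead of a single short fixed wait.
public class StartupWaiter {
    /**
     * Polls {@code isUp} until it returns true or {@code timeoutMs} elapses.
     * Returns true if the service came up within the deadline.
     */
    public static boolean waitForStartup(BooleanSupplier isUp,
                                         long timeoutMs,
                                         long pollIntervalMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (isUp.getAsBoolean()) {
                return true;
            }
            Thread.sleep(pollIntervalMs);
        }
        return isUp.getAsBoolean();   // one last check at the deadline
    }
}
```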





[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679108#comment-16679108
 ] 

Hive QA commented on HIVE-20440:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
44s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
25s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
36s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} itests/hive-unit: The patch generated 5 new + 0 
unchanged - 0 fixed = 5 total (was 0) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} ql: The patch generated 0 new + 53 unchanged - 2 
fixed = 53 total (was 55) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} hive-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
53s{color} | {color:green} ql generated 0 new + 2314 unchanged - 1 fixed = 2314 
total (was 2315) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14799/dev-support/hive-personality.sh
 |
| git revision | master / 6d713b6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14799/yetus/patch-mvninstall-itests_hive-unit.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14799/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| modules | C: itests/hive-unit ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14799/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits

[jira] [Updated] (HIVE-20826) Enhance HiveSemiJoin rule to convert join + group by on left side to Left Semi Join

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20826:
---
Attachment: HIVE-20826.3.patch

> Enhance HiveSemiJoin rule to convert join + group by on left side to Left 
> Semi Join
> ---
>
> Key: HIVE-20826
> URL: https://issues.apache.org/jira/browse/HIVE-20826
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20826.1.patch, HIVE-20826.2.patch, 
> HIVE-20826.3.patch
>
>
> Currently the HiveSemiJoin rule looks for a pattern where the group by is on the right side.
> We can convert joins which have the group by on the left side (assuming the group by keys 
> are the same as the join keys and no columns are being projected from the left 
> side) to a LEFT SEMI JOIN by swapping the inputs, e.g. queries such as:
> {code:sql}
> explain select pp.p_partkey from (select distinct p_name from part) p join 
> part pp on pp.p_name = p.p_name;
> {code}
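After the input swap described above, the query would in effect be planned as something like the following. This rewritten form is a sketch of the intended equivalence, not output copied from Hive:

{code:sql}
-- Hypothetical rewritten form: the distinct/group-by side moves to the
-- right of the join and becomes the semi join's filter side. The DISTINCT
-- guarantees the original join could not duplicate rows from pp, which is
-- what makes the semi join equivalent.
select pp.p_partkey
from part pp
left semi join part p on pp.p_name = p.p_name;
{code}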





[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679129#comment-16679129
 ] 

Hive QA commented on HIVE-20440:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947231/HIVE-20440.10.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 15533 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=196)

[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parallel_orderby] 
(batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=85)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14799/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14799/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14799/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947231 - PreCommit-HIVE-Build

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch
>
>
> Enhance the SmallTableCache to use a Guava cache with soft references, so that 
> we evict when there is memory pressure.
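The soft-reference idea can be shown with the JDK alone. The patch itself would use Guava (e.g. CacheBuilder with soft values); the sketch below is only an approximation of that behavior, and the class name is made up for illustration.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Illustrative soft-value cache: the JVM may clear SoftReferences under
// memory pressure, so entries are evicted by the GC and recomputed on demand.
public class SoftValueCache<K, V> {
    private final ConcurrentMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    /** Returns the cached value, recomputing it if the GC has cleared the entry. */
    public V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();   // null if collected
        if (value == null) {
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public int size() {
        return map.size();
    }
}
```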





[jira] [Updated] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20682:

Status: Open  (was: Patch Available)

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch
>
>





[jira] [Updated] (HIVE-20845) Fix TestJdbcWithDBTokenStoreNoDoAs flakiness

2018-11-07 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20845:
--
Status: Patch Available  (was: Open)

> Fix TestJdbcWithDBTokenStoreNoDoAs flakiness
> 
>
> Key: HIVE-20845
> URL: https://issues.apache.org/jira/browse/HIVE-20845
> Project: Hive
>  Issue Type: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20845.patch
>
>





[jira] [Commented] (HIVE-20836) Fix TestJdbcDriver2.testYarnATSGuid flakiness

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678905#comment-16678905
 ] 

Hive QA commented on HIVE-20836:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14798/dev-support/hive-personality.sh
 |
| git revision | master / 6d713b6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: itests/hive-unit U: itests/hive-unit |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14798/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Fix TestJdbcDriver2.testYarnATSGuid flakiness
> -
>
> Key: HIVE-20836
> URL: https://issues.apache.org/jira/browse/HIVE-20836
> Project: Hive
>  Issue Type: Test
>  Components: JDBC
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20836.2.patch, HIVE-20836.3.patch, HIVE-20836.patch
>
>
> Seen flakiness in internal test.
> {code:java}
> Error Message
> Failed to set the YARN ATS Guid
> Stacktrace
> java.lang.AssertionError: Failed to set the YARN ATS Guid
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hive.jdbc.TestJdbcDriver2.testYarnATSGuid(TestJdbcDriver2.java:2434){code}
> The query finished too fast, and the GUID thread did not try to check the 
> value.
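A common de-flaking pattern for this kind of race is to poll for the value with a deadline instead of asserting immediately. The sketch below only illustrates that pattern; the names are hypothetical and the actual fix in TestJdbcDriver2 may differ.

```java
import java.util.function.Supplier;

// Illustrative polling assert: retry a probe until it yields a value or a
// deadline passes, so a fast-finishing query no longer races the checker.
public class PollingAssert {
    /** Retries {@code probe} until it returns non-null or the timeout expires. */
    public static <T> T awaitNonNull(Supplier<T> probe,
                                     long timeoutMs,
                                     long pollIntervalMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            T value = probe.get();
            if (value != null) {
                return value;
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new AssertionError("value was not set before the deadline");
            }
            Thread.sleep(pollIntervalMs);
        }
    }
}
```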





[jira] [Updated] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20682:

Status: Patch Available  (was: Open)

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>





[jira] [Updated] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20682:

Attachment: (was: HIVE-20682.06.patch)

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch
>
>





[jira] [Comment Edited] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678526#comment-16678526
 ] 

Sankar Hariappan edited comment on HIVE-20682 at 11/8/18 6:49 AM:
--

06.patch fixed test failure and findbugs issue.


was (Author: sankarh):
06.patch fixed test failure and windbags issue.

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>





[jira] [Updated] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20682:

Attachment: HIVE-20682.06.patch

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in the 
> *HiveSessionImpl* class when we open a new session for a client connection, 
> and by default all queries from this connection share the same sessionHive 
> object. 
> If the master thread executes a *synchronous* query, it closes the 
> sessionHive object (referred to via the thread local hiveDb) if 
> {{Hive.isCompatible}} returns false and sets a new Hive object in the thread 
> local hiveDb, but doesn't change the sessionHive object in the session. 
> Whereas *asynchronous* query execution via async threads never closes the 
> sessionHive object; it just creates a new one if needed and sets it as its 
> thread local hiveDb.
> So, the problem can occur when an *asynchronous* query being executed by an 
> async thread refers to the sessionHive object while the master thread 
> receives a *synchronous* query that closes that same sessionHive object. 
> Also, each query execution overwrites the thread local hiveDb object with 
> the sessionHive object, which potentially leaks a metastore connection if 
> the previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object can be shared by multiple threads, so it shouldn't 
> be closed by any query execution thread when that thread re-creates its Hive 
> object due to changes in Hive configurations. But the Hive objects created 
> by query execution threads should be closed when the thread exits.
> So, it is proposed to add an *isAllowClose* flag (default: *true*) to the 
> Hive object, which should be set to *false* for *sessionHive* so that it is 
> only forcefully closed when the session is closed or released.
> Also, when we replace the *sessionHive* object with a new one due to changes 
> in *sessionConf*, the old one should be closed once no async thread is 
> referring to it. This can be done using the "*finalize*" method of the Hive 
> object, where we can close the HMS connection when the Hive object is 
> garbage collected.
> cc [~pvary]
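A minimal sketch of the proposed pattern, as one way it could look (the class below is a simplified stand-in for the real Hive object, not Hive's actual implementation; only the flag name follows the proposal):

```java
// Simplified stand-in for the real Hive object, illustrating the proposed
// isAllowClose flag. The real object would manage an HMS connection; here a
// boolean tracks the closed state instead.
public class SessionHive {
    private final boolean isAllowClose; // false for the shared sessionHive
    private boolean closed = false;

    public SessionHive(boolean isAllowClose) {
        this.isAllowClose = isAllowClose;
    }

    // Query execution threads call this when they re-create their Hive
    // object; the shared sessionHive silently ignores the request.
    public boolean close() {
        if (!isAllowClose) {
            return false; // shared object: only the session may close it
        }
        closed = true;
        return true;
    }

    // Called when the session itself is closed or released.
    public void forceClose() {
        closed = true;
    }

    public boolean isClosed() {
        return closed;
    }
}
```

With this shape, a thread-local Hive created with `isAllowClose = true` closes normally on thread exit, while the shared sessionHive survives until the session forces it closed.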



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679358#comment-16679358
 ] 

Hive QA commented on HIVE-20682:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
41s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} ql: The patch generated 0 new + 218 unchanged - 2 
fixed = 218 total (was 220) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} The patch service passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14804/dev-support/hive-personality.sh
 |
| git revision | master / 840dd43 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: itests/hive-unit ql service U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14804/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, 

[jira] [Commented] (HIVE-20847) Review of NullScan Code

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679184#comment-16679184
 ] 

Hive QA commented on HIVE-20847:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947237/HIVE-20847.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 15524 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=196)

[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testCancelRenewTokenFlow 
(batchId=273)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection (batchId=273)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValid (batchId=273)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValidNeg (batchId=273)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeProxyAuth 
(batchId=273)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth 
(batchId=273)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth (batchId=273)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testRenewDelegationToken 
(batchId=273)
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth (batchId=273)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14800/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14800/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14800/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947237 - PreCommit-HIVE-Build

> Review of NullScan Code
> ---
>
> Key: HIVE-20847
> URL: https://issues.apache.org/jira/browse/HIVE-20847
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-20847.1.patch, HIVE-20847.1.patch
>
>
> What got me looking at this class was the verbosity of some of the logging. 
>  I would like to request that we lower the logging to DEBUG, since this 
> level of detail means nothing to a cluster admin.
> Also... this {{contains}} call would be better applied to a {{HashSet}} 
> instead of an {{ArrayList}}.
> {code:java|title=NullScanTaskDispatcher.java}
>   private void processAlias(MapWork work, Path path,
>       ArrayList<String> aliasesAffected, ArrayList<String> aliases) {
>     // the aliases that are allowed to map to a null scan.
>     ArrayList<String> allowed = new ArrayList<String>();
>     for (String alias : aliasesAffected) {
>       if (aliases.contains(alias)) {
>         allowed.add(alias);
>       }
>     }
> {code}
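A hedged sketch of the suggested change: copying the aliases into a {{HashSet}} makes each membership check O(1) instead of the {{ArrayList}}'s linear scan (the method shape only approximates the original; it is not the actual patch):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AliasFilter {
    // Filter the affected aliases down to those allowed to map to a null
    // scan. Probing a HashSet is O(1) per alias, vs O(n) for an ArrayList.
    public static List<String> allowedAliases(List<String> aliasesAffected,
                                              List<String> aliases) {
        Set<String> aliasSet = new HashSet<>(aliases);
        List<String> allowed = new ArrayList<>();
        for (String alias : aliasesAffected) {
            if (aliasSet.contains(alias)) {
                allowed.add(alias);
            }
        }
        return allowed;
    }
}
```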



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20890) ACID: Allow whole table ReadLocks to skip all partition locks

2018-11-07 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-20890:
---
Issue Type: Improvement  (was: Bug)

> ACID: Allow whole table ReadLocks to skip all partition locks
> -
>
> Key: HIVE-20890
> URL: https://issues.apache.org/jira/browse/HIVE-20890
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Priority: Major
>
> HIVE-19369 proposes adding an EXCL_WRITE lock, which does not wait for any 
> SHARED_READ locks for insert operations - in the presence of that lock, the 
> insert overwrite no longer takes an exclusive lock.
> The only exclusive operation will be a schema change or drop table, which 
> should take an exclusive lock on the entire table directly.
> {code}
> explain locks select * from tpcds_bin_partitioned_orc_1000.store_sales where 
> ss_sold_date_sk=2452626 
> ++
> |  Explain   |
> ++
> | LOCK INFORMATION:  |
> | tpcds_bin_partitioned_orc_1000.store_sales -> SHARED_READ |
> | tpcds_bin_partitioned_orc_1000.store_sales.ss_sold_date_sk=2452626 -> 
> SHARED_READ |
> ++
> {code}
> So the per-partition SHARED_READ locks are no longer necessary if the lock 
> builder already includes the table-wide SHARED_READ locks.
> The removal of entire partitions is the only operation that needs to be 
> handled within these semantics as row removal instead of directory removal 
> (i.e. "drop partition" becomes "truncate partition", and the truncation 
> triggers a whole-directory cleaner so that the partition disappears when 
> there are 0 rows left).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20826) Enhance HiveSemiJoin rule to convert join + group by on left side to Left Semi Join

2018-11-07 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678931#comment-16678931
 ] 

Ashutosh Chauhan commented on HIVE-20826:
-

[~vgarg] Can you create RB for this?

> Enhance HiveSemiJoin rule to convert join + group by on left side to Left 
> Semi Join
> ---
>
> Key: HIVE-20826
> URL: https://issues.apache.org/jira/browse/HIVE-20826
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20826.1.patch, HIVE-20826.2.patch
>
>
> Currently HiveSemiJoin rule looks for pattern where group by is on right side.
> We can convert joins which have group by on left side (assuming group by keys 
> are same as join keys and none of the columns are being projected from left 
> side) to LEFT SEMI JOIN by swapping the inputs. e.g. queries such as:
> {code:sql}
> explain select pp.p_partkey from (select distinct p_name from part) p join 
> part pp on pp.p_name = p.p_name;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-07 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678930#comment-16678930
 ] 

Sergey Shelukhin commented on HIVE-20853:
-

+1 pending tests

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch, 
> HIVE-20853.3.patch, HIVE-20853.4.patch, HIVE-20853.5.patch
>
>
> Currently DAGs are only registered when submitWork is called for that DAG. 
> At this point the credentials are added to the ShuffleHandler and it can 
> start serving.
> However Tez might (and will) schedule tasks to fetch from the ShuffleHandler 
> before any of this happens, and all these tasks will fail, which may result 
> in the query failing.
> This happens in the scenario in which an LlapDaemon just comes up and Tez 
> fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow registering the DAG with the daemon when the AM 
> notices that a new daemon is up.
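The register-before-serve ordering can be sketched with a toy registry (class and method names here are illustrative stand-ins, not the real LLAP daemon or ShuffleHandler API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-in for the ShuffleHandler-side state: a fetch for a DAG is
// rejected until that DAG's credentials have been registered.
public class DagRegistry {
    private final Map<String, String> dagCredentials = new ConcurrentHashMap<>();

    // Proposed path: the AM registers the DAG as soon as it notices a new
    // daemon, instead of waiting for the first submitWork call.
    public void registerDag(String dagId, String credentials) {
        dagCredentials.put(dagId, credentials);
    }

    // A fetcher connection is only served if the DAG is already known.
    public boolean canServe(String dagId) {
        return dagCredentials.containsKey(dagId);
    }
}
```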



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-20157) Do Not Print StackTraces to STDERR in ParseDriver

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR resolved HIVE-20157.

Resolution: Duplicate

> Do Not Print StackTraces to STDERR in ParseDriver
> -
>
> Key: HIVE-20157
> URL: https://issues.apache.org/jira/browse/HIVE-20157
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
>
> {{org/apache/hadoop/hive/ql/parse/ParseDriver.java}}
> {code}
> catch (RecognitionException e) {
>   e.printStackTrace();
>   throw new ParseException(parser.errors);
> }
> {code}
> Do not use {{e.printStackTrace()}} and print to STDERR.  Either remove or 
> replace with a debug-level log statement.  I would vote to simply remove.  
> There are several occurrences of this pattern in this class.
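One possible shape for the fix, sketched with java.util.logging as a stand-in (Hive itself uses SLF4J, and the real change would stay inside ParseDriver's existing catch blocks and keep throwing {{ParseException}}):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class ParseErrorHandling {
    private static final Logger LOG =
        Logger.getLogger(ParseErrorHandling.class.getName());

    // Instead of e.printStackTrace() writing to STDERR, record the failure
    // at a debug-like level and surface it through the thrown exception.
    public static IllegalStateException toParseFailure(Exception e) {
        LOG.log(Level.FINE, "Parse failed", e); // debug-level, not stderr
        return new IllegalStateException("parse error: " + e.getMessage(), e);
    }
}
```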



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20873) Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision

2018-11-07 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678844#comment-16678844
 ] 

slim bouguerra commented on HIVE-20873:
---

[~teddy.choi] Thanks, I am not trying by any means to waste your time, but it 
would be nice if you shared what improvement you see and how you are measuring 
it, and maybe also investigated whether this will be a regression for other 
queries as well.
This will help me and others learn from your experiments.

  

> Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
> 
>
> Key: HIVE-20873
> URL: https://issues.apache.org/jira/browse/HIVE-20873
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20873.1.patch, HIVE-20873.2.patch
>
>
> VectorHashKeyWrapperTwoLong is implemented with a few bit-shift and XOR 
> operators for short computation time, but more hash collisions. Group-by 
> operations become very slow on large data sets. It needs Murmur hash or a 
> better hash function for fewer hash collisions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20873) Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision

2018-11-07 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678852#comment-16678852
 ] 

Gopal V commented on HIVE-20873:


[~bslim]: Teddy & I have a UDF for the hash function, which we use to calculate 
skews.

I've merged Teddy's changes into it

https://github.com/t3rmin4t0r/long-hash-udf

{code}
select long2hash(i_item_sk, 1) & 255, count(1)  from item group by 
long2hash(i_item_sk, 1) & 255 order by count(1) desc ;

0   65536
2   65536
3   65536
1   65535
5   37857
{code}

So there's a bit-skew in the old hash function: instead of generating 256 
unique bit-patterns, it skews the low bits by the 2nd arg to the long2hash.

{code}
select long2murmur(i_item_sk, 1) & 255, count(1)  from item group by 
long2murmur(i_item_sk, 1) & 255 order by count(1) desc ;

170 1274
37  1264
220 1254
110 1253
152 1241
5   1235
56  1232
179 1231
231 1228
168 1228
149 1228
84  1222
...
156 1082
Time taken: 1.727 seconds, Fetched: 256 row(s)
{code} 
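The skew can be reproduced with a toy comparison: a naive shift/XOR combiner versus the standard Murmur3 fmix64 finalizer, bucketing strided keys into 256 low-byte buckets (the naive hash below only approximates the spirit of the old combiner, not its exact formula):

```java
public class HashSkewDemo {
    // Naive combiner in the spirit of the old shift/XOR style (approximation).
    public static long naiveHash(long key, long seed) {
        return key ^ (seed << 1);
    }

    // Standard Murmur3 fmix64 finalizer: strong avalanche on all 64 bits.
    public static long murmurMix(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Count how many of the 256 low-byte buckets are hit by n strided keys.
    public static int bucketsHit(int n, long stride,
                                 java.util.function.LongUnaryOperator h) {
        boolean[] hit = new boolean[256];
        int count = 0;
        for (long i = 0; i < n; i++) {
            int b = (int) (h.applyAsLong(i * stride) & 255);
            if (!hit[b]) {
                hit[b] = true;
                count++;
            }
        }
        return count;
    }
}
```

For keys that are multiples of 256, the naive combiner collapses everything into a single low-byte bucket, while the Murmur finalizer spreads them across most of the 256 buckets.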

> Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
> 
>
> Key: HIVE-20873
> URL: https://issues.apache.org/jira/browse/HIVE-20873
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20873.1.patch, HIVE-20873.2.patch
>
>
> VectorHashKeyWrapperTwoLong is implemented with a few bit-shift and XOR 
> operators for short computation time, but more hash collisions. Group-by 
> operations become very slow on large data sets. It needs Murmur hash or a 
> better hash function for fewer hash collisions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20797) Print Number of Locks Acquired

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20797:
---
Attachment: HIVE-20797.1.patch

> Print Number of Locks Acquired
> --
>
> Key: HIVE-20797
> URL: https://issues.apache.org/jira/browse/HIVE-20797
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Locking
>Affects Versions: 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20797.1.patch
>
>
> The number of locks acquired by a query can greatly influence the performance 
> and stability of the system, especially for ZK locks.  Please add INFO level 
> logging with the number of locks each query obtains.
> Log here:
> https://github.com/apache/hive/blob/3963c729fabf90009cb67d277d40fe5913936358/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L1670-L1672
> {quote}
> A list of acquired locks will be stored in the 
> org.apache.hadoop.hive.ql.Context object and can be retrieved via 
> org.apache.hadoop.hive.ql.Context#getHiveLocks.
> {quote}
> https://github.com/apache/hive/blob/758ff449099065a84c46d63f9418201c8a6731b1/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java#L115-L127



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20873) Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision

2018-11-07 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678877#comment-16678877
 ] 

Gopal V commented on HIVE-20873:


LGTM - +1 tests pending.

TestHashCodeUtil.java needs ASF license.

> Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
> 
>
> Key: HIVE-20873
> URL: https://issues.apache.org/jira/browse/HIVE-20873
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20873.1.patch, HIVE-20873.2.patch
>
>
> VectorHashKeyWrapperTwoLong is implemented with a few bit-shift and XOR 
> operators for short computation time, but more hash collisions. Group-by 
> operations become very slow on large data sets. It needs Murmur hash or a 
> better hash function for fewer hash collisions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20797) Print Number of Locks Acquired

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-20797:
--

Assignee: BELUGA BEHR

> Print Number of Locks Acquired
> --
>
> Key: HIVE-20797
> URL: https://issues.apache.org/jira/browse/HIVE-20797
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Locking
>Affects Versions: 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
>
> The number of locks acquired by a query can greatly influence the performance 
> and stability of the system, especially for ZK locks.  Please add INFO level 
> logging with the number of locks each query obtains.
> Log here:
> https://github.com/apache/hive/blob/3963c729fabf90009cb67d277d40fe5913936358/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L1670-L1672
> {quote}
> A list of acquired locks will be stored in the 
> org.apache.hadoop.hive.ql.Context object and can be retrieved via 
> org.apache.hadoop.hive.ql.Context#getHiveLocks.
> {quote}
> https://github.com/apache/hive/blob/758ff449099065a84c46d63f9418201c8a6731b1/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java#L115-L127



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20797) Print Number of Locks Acquired

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20797:
---
Status: Patch Available  (was: Open)

> Print Number of Locks Acquired
> --
>
> Key: HIVE-20797
> URL: https://issues.apache.org/jira/browse/HIVE-20797
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Locking
>Affects Versions: 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20797.1.patch
>
>
> The number of locks acquired by a query can greatly influence the performance 
> and stability of the system, especially for ZK locks.  Please add INFO level 
> logging with the number of locks each query obtains.
> Log here:
> https://github.com/apache/hive/blob/3963c729fabf90009cb67d277d40fe5913936358/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L1670-L1672
> {quote}
> A list of acquired locks will be stored in the 
> org.apache.hadoop.hive.ql.Context object and can be retrieved via 
> org.apache.hadoop.hive.ql.Context#getHiveLocks.
> {quote}
> https://github.com/apache/hive/blob/758ff449099065a84c46d63f9418201c8a6731b1/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java#L115-L127



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18415) Lower "Updating Partition Stats" Logging Level

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18415:
---
Attachment: HIVE-18415.1.patch

> Lower "Updating Partition Stats" Logging Level
> --
>
> Key: HIVE-18415
> URL: https://issues.apache.org/jira/browse/HIVE-18415
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.2, 2.2.0, 3.0.0, 2.3.2
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18415.1.patch
>
>
> {code:title=org.apache.hadoop.hive.metastore.utils.MetaStoreUtils}
> LOG.warn("Updating partition stats fast for: " + part.getTableName());
> ...
> LOG.warn("Updated size to " + params.get(StatsSetupConst.TOTAL_SIZE));
> {code}
> This logging produces many lines of WARN-level messages in my log file, and 
> it's not clear what the issue is.  Why is this a warning, and how should I 
> respond to address it?
> DEBUG is probably more appropriate for a utility class.  Please lower.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20833:
---
Attachment: HIVE-20833.7.patch

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch, 
> HIVE-20833.6.patch, HIVE-20833.7.patch
>
>
> Following test if run with TestMiniLlapLocalCliDriver will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>`auto_sourcing_flag` string,
>`cat_admin_auth_enabled_flag` string,
>`clm_document_number` string,
>`comm_rev_num` double,
>`consigned_consumption_flag` string,
>`consume_req_demand_flag` string,
>`conterms_articles_upd_date` timestamp,
>`conterms_deliv_upd_date` timestamp,
>`conterms_exist_flag` string,
>`cpa_reference` double,
>`created_language` string,
>`email_address` string,
>`enable_all_sites` string,
>`fax` string,
>`lock_owner_role` string,
>`lock_owner_user_id` double,
>`min_release_amount` double,
>`mrc_rate` string,
>`mrc_rate_date` string,
>`mrc_rate_type` string,
>`otm_recovery_flag` string,
>`otm_status_code` string,
>`pay_when_paid` string,
>`pcard_id` bigint,
>

[jira] [Commented] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-07 Thread Jaume M (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678826#comment-16678826
 ] 

Jaume M commented on HIVE-20853:


Can you take another look [~sershe], I've updated the Jira and reviewboard.

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch, 
> HIVE-20853.3.patch, HIVE-20853.4.patch, HIVE-20853.5.patch
>
>
> Currently DAGs are only registered when submitWork is called for that DAG. 
> At this point the credentials are added to the ShuffleHandler and it can 
> start serving.
> However Tez might (and will) schedule tasks to fetch from the ShuffleHandler 
> before any of this happens, and all these tasks will fail, which may result 
> in the query failing.
> This happens in the scenario in which an LlapDaemon just comes up and Tez 
> fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow registering the DAG with the daemon when the AM 
> notices that a new daemon is up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20831) Add Session ID to Operation Logging

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-20831:
--

Assignee: BELUGA BEHR  (was: Roohi Syeda)

> Add Session ID to Operation Logging
> ---
>
> Key: HIVE-20831
> URL: https://issues.apache.org/jira/browse/HIVE-20831
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: newbie, noob
>
> {code:java|title=OperationManager.java}
> LOG.info("Adding operation: " + operation.getHandle());
> {code}
> Please add additional logging to explicitly state which Hive session this 
> operation is being added to.
> https://github.com/apache/hive/blob/3963c729fabf90009cb67d277d40fe5913936358/service/src/java/org/apache/hive/service/cli/operation/OperationManager.java#L201
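A minimal sketch of what the enriched log line might look like. The `logLine` helper and the identifier strings are hypothetical; in HiveServer2 the real values would come from `operation.getHandle()` and the parent session's handle:

```java
public class OperationLogDemo {
    // Hypothetical helper: in OperationManager the two identifiers would come
    // from operation.getHandle() and the parent session; here they are plain
    // strings so the example is self-contained.
    static String logLine(String sessionHandle, String operationHandle) {
        return "Adding operation: " + operationHandle
                + " to session: " + sessionHandle;
    }

    public static void main(String[] args) {
        System.out.println(logLine("SessionHandle [8cd0efa2]",
                "OperationHandle [opType=EXECUTE_STATEMENT]"));
    }
}
```

With both handles in one line, an operator can grep a single session ID and see every operation it spawned.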



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20831) Add Session ID to Operation Logging

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20831:
---
Attachment: HIVE-20831.1.patch

> Add Session ID to Operation Logging
> ---
>
> Key: HIVE-20831
> URL: https://issues.apache.org/jira/browse/HIVE-20831
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: newbie, noob
> Attachments: HIVE-20831.1.patch
>
>
> {code:java|title=OperationManager.java}
> LOG.info("Adding operation: " + operation.getHandle());
> {code}
> Please add additional logging to explicitly state which Hive session this 
> operation is being added to.
> https://github.com/apache/hive/blob/3963c729fabf90009cb67d277d40fe5913936358/service/src/java/org/apache/hive/service/cli/operation/OperationManager.java#L201



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18415) Lower "Updating Partition Stats" Logging Level

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18415:
---
Status: Patch Available  (was: Open)

> Lower "Updating Partition Stats" Logging Level
> --
>
> Key: HIVE-18415
> URL: https://issues.apache.org/jira/browse/HIVE-18415
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.3.2, 3.0.0, 2.2.0, 1.2.2
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18415.1.patch
>
>
> {code:title=org.apache.hadoop.hive.metastore.utils.MetaStoreUtils}
> LOG.warn("Updating partition stats fast for: " + part.getTableName());
> ...
> LOG.warn("Updated size to " + params.get(StatsSetupConst.TOTAL_SIZE));
> {code}
> This logging produces many WARN-level lines in my log file, and it's not 
> clear what the issue is: why is this a warning, and how should I respond to 
> address it?
> DEBUG is probably more appropriate for a utility class.  Please lower the level.
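A sketch of the requested change, using `java.util.logging` as a stand-in for Hive's slf4j logger (class and method names are illustrative): the two messages move from WARN to FINE (slf4j DEBUG), so a default-configured logger no longer emits them.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class StatsLogDemo {
    private static final Logger LOG =
            Logger.getLogger(StatsLogDemo.class.getName());

    // After the change: routine stats bookkeeping is logged at FINE
    // (slf4j DEBUG) instead of WARNING, since nothing is actually wrong.
    static void updatePartitionStatsFast(String tableName, long totalSize) {
        LOG.fine("Updating partition stats fast for: " + tableName);
        LOG.fine("Updated size to " + totalSize);
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.WARNING);                 // a typical production default
        updatePartitionStatsFast("tb_demo", 1024L);  // silent at this level
        System.out.println("FINE loggable at WARNING default: "
                + LOG.isLoggable(Level.FINE));
    }
}
```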



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18415) Lower "Updating Partition Stats" Logging Level

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-18415:
--

Assignee: BELUGA BEHR  (was: Peter Vary)

> Lower "Updating Partition Stats" Logging Level
> --
>
> Key: HIVE-18415
> URL: https://issues.apache.org/jira/browse/HIVE-18415
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.2, 2.2.0, 3.0.0, 2.3.2
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18415.1.patch
>
>
> {code:title=org.apache.hadoop.hive.metastore.utils.MetaStoreUtils}
> LOG.warn("Updating partition stats fast for: " + part.getTableName());
> ...
> LOG.warn("Updated size to " + params.get(StatsSetupConst.TOTAL_SIZE));
> {code}
> This logging produces many WARN-level lines in my log file, and it's not 
> clear what the issue is: why is this a warning, and how should I respond to 
> address it?
> DEBUG is probably more appropriate for a utility class.  Please lower the level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20826) Enhance HiveSemiJoin rule to convert join + group by on left side to Left Semi Join

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20826:
---
Status: Open  (was: Patch Available)

> Enhance HiveSemiJoin rule to convert join + group by on left side to Left 
> Semi Join
> ---
>
> Key: HIVE-20826
> URL: https://issues.apache.org/jira/browse/HIVE-20826
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20826.1.patch, HIVE-20826.2.patch
>
>
> Currently the HiveSemiJoin rule only looks for the pattern where the group by 
> is on the right side. We can also convert joins that have a group by on the 
> left side (assuming the group-by keys are the same as the join keys and no 
> columns are projected from the left side) to a LEFT SEMI JOIN by swapping the 
> inputs, e.g. queries such as:
> {code:sql}
> explain select pp.p_partkey from (select distinct p_name from part) p join 
> part pp on pp.p_name = p.p_name;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20873) Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678817#comment-16678817
 ] 

Hive QA commented on HIVE-20873:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
30s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
37s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} common: The patch generated 4 new + 6 unchanged - 0 
fixed = 10 total (was 6) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14797/dev-support/hive-personality.sh
 |
| git revision | master / 6d713b6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14797/yetus/diff-checkstyle-common.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14797/yetus/patch-asflicense-problems.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14797/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
> 
>
> Key: HIVE-20873
> URL: https://issues.apache.org/jira/browse/HIVE-20873
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20873.1.patch, HIVE-20873.2.patch
>
>
> VectorHashKeyWrapperTwoLong is implemented with a few bit-shift and XOR 
> operators for short computation time, at the cost of more hash collisions. 
> Group by operations therefore become very slow on large data sets. It needs 
> Murmur hash or another hash function with fewer collisions.
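A sketch of the direction described, not the actual Hive patch: the constants below are the standard MurmurHash3 fmix64 finalizer, applied to each long before combining, in contrast to a plain shift/XOR mix.

```java
public class TwoLongHashDemo {
    // Collision-prone combiner of the kind described: shift and XOR only,
    // so structured inputs (e.g. sequential keys) cluster in few buckets.
    static int weakHash(long l0, long l1) {
        long k = l0 ^ (l1 << 1);
        return (int) (k ^ (k >>> 32));
    }

    // MurmurHash3 fmix64 finalizer: every input bit affects every output bit.
    static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Mix each key long before combining, so nearby keys land far apart.
    static int mixedHash(long l0, long l1) {
        long h = fmix64(l0) * 31L + fmix64(l1);
        return (int) (h ^ (h >>> 32));
    }

    public static void main(String[] args) {
        for (long i = 0; i < 4; i++) {
            System.out.println("key " + i + ": weak=" + weakHash(i, i)
                    + " mixed=" + mixedHash(i, i));
        }
    }
}
```

fmix64 is a bijection (each step is invertible), so the mixing costs a few multiplications but loses no key information.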



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20831) Add Session ID to Operation Logging

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20831:
---
Status: Patch Available  (was: Open)

> Add Session ID to Operation Logging
> ---
>
> Key: HIVE-20831
> URL: https://issues.apache.org/jira/browse/HIVE-20831
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
>  Labels: newbie, noob
> Attachments: HIVE-20831.1.patch
>
>
> {code:java|title=OperationManager.java}
> LOG.info("Adding operation: " + operation.getHandle());
> {code}
> Please add additional logging to explicitly state which Hive session this 
> operation is being added to.
> https://github.com/apache/hive/blob/3963c729fabf90009cb67d277d40fe5913936358/service/src/java/org/apache/hive/service/cli/operation/OperationManager.java#L201



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20833:
---
Status: Open  (was: Patch Available)

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch, 
> HIVE-20833.6.patch, HIVE-20833.7.patch
>
>
> Following test if run with TestMiniLlapLocalCliDriver will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>`auto_sourcing_flag` string,
>`cat_admin_auth_enabled_flag` string,
>`clm_document_number` string,
>`comm_rev_num` double,
>`consigned_consumption_flag` string,
>`consume_req_demand_flag` string,
>`conterms_articles_upd_date` timestamp,
>`conterms_deliv_upd_date` timestamp,
>`conterms_exist_flag` string,
>`cpa_reference` double,
>`created_language` string,
>`email_address` string,
>`enable_all_sites` string,
>`fax` string,
>`lock_owner_role` string,
>`lock_owner_user_id` double,
>`min_release_amount` double,
>`mrc_rate` string,
>`mrc_rate_date` string,
>`mrc_rate_type` string,
>`otm_recovery_flag` string,
>`otm_status_code` string,
>`pay_when_paid` string,
>`pcard_id` bigint,
>

[jira] [Updated] (HIVE-20826) Enhance HiveSemiJoin rule to convert join + group by on left side to Left Semi Join

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20826:
---
Status: Patch Available  (was: Open)

> Enhance HiveSemiJoin rule to convert join + group by on left side to Left 
> Semi Join
> ---
>
> Key: HIVE-20826
> URL: https://issues.apache.org/jira/browse/HIVE-20826
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20826.1.patch, HIVE-20826.2.patch
>
>
> Currently the HiveSemiJoin rule only looks for the pattern where the group by 
> is on the right side. We can also convert joins that have a group by on the 
> left side (assuming the group-by keys are the same as the join keys and no 
> columns are projected from the left side) to a LEFT SEMI JOIN by swapping the 
> inputs, e.g. queries such as:
> {code:sql}
> explain select pp.p_partkey from (select distinct p_name from part) p join 
> part pp on pp.p_name = p.p_name;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20826) Enhance HiveSemiJoin rule to convert join + group by on left side to Left Semi Join

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20826:
---
Attachment: HIVE-20826.2.patch

> Enhance HiveSemiJoin rule to convert join + group by on left side to Left 
> Semi Join
> ---
>
> Key: HIVE-20826
> URL: https://issues.apache.org/jira/browse/HIVE-20826
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20826.1.patch, HIVE-20826.2.patch
>
>
> Currently the HiveSemiJoin rule only looks for the pattern where the group by 
> is on the right side. We can also convert joins that have a group by on the 
> left side (assuming the group-by keys are the same as the join keys and no 
> columns are projected from the left side) to a LEFT SEMI JOIN by swapping the 
> inputs, e.g. queries such as:
> {code:sql}
> explain select pp.p_partkey from (select distinct p_name from part) p join 
> part pp on pp.p_name = p.p_name;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16839) Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently

2018-11-07 Thread Guang Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678786#comment-16678786
 ] 

Guang Yang commented on HIVE-16839:
---

Hi [~vihangk1], I updated the unit test per your suggestion. Looks like the new 
run passed; could you help commit the change?

 

Thanks for the help on this!

> Unbalanced calls to openTransaction/commitTransaction when alter the same 
> partition concurrently
> 
>
> Key: HIVE-16839
> URL: https://issues.apache.org/jira/browse/HIVE-16839
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 1.1.0
>Reporter: Nemon Lou
>Assignee: Guang Yang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16839.01.patch, HIVE-16839.02.patch, 
> HIVE-16839.03.patch
>
>
> SQL to reproduce:
> prepare:
> {noformat}
>  hdfs dfs -mkdir -p 
> /hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627
>  1,create external table tb_ltgsm_external (id int) PARTITIONED by (cp 
> string,ld string);
> {noformat}
> open one Beeline session and run these two SQL statements many times:
> {noformat} 2,ALTER TABLE tb_ltgsm_external ADD IF NOT EXISTS PARTITION 
> (cp=2017060513,ld=2017060610);
>  3,ALTER TABLE tb_ltgsm_external PARTITION (cp=2017060513,ld=2017060610) SET 
> LOCATION 
> 'hdfs://hacluster/hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627';
> {noformat}
> open another Beeline session and run this SQL statement many times at the same time:
> {noformat}
>  4,ALTER TABLE tb_ltgsm_external DROP PARTITION (cp=2017060513,ld=2017060610);
> {noformat}
> MetaStore logs:
> {noformat}
> 2017-06-06 21:58:34,213 | ERROR | pool-6-thread-197 | Retrying HMSHandler 
> after 2000 ms (attempt 1 of 10) with error: 
> javax.jdo.JDOObjectNotFoundException: No such database row
> FailedObject:49[OID]org.apache.hadoop.hive.metastore.model.MStorageDescriptor
>   at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:475)
>   at 
> org.datanucleus.api.jdo.JDOAdapter.getApiExceptionForNucleusException(JDOAdapter.java:1158)
>   at 
> org.datanucleus.state.JDOStateManager.isLoaded(JDOStateManager.java:3231)
>   at 
> org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoGetcd(MStorageDescriptor.java)
>   at 
> org.apache.hadoop.hive.metastore.model.MStorageDescriptor.getCD(MStorageDescriptor.java:184)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1282)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1299)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToPart(ObjectStore.java:1680)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartition(ObjectStore.java:1586)
>   at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.getPartition(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:538)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions(HiveMetaStore.java:3317)
>   at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy12.alter_partitions(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions.getResult(ThriftHiveMetastore.java:9963)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions.getResult(ThriftHiveMetastore.java:9947)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> 

[jira] [Updated] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20833:
---
Status: Patch Available  (was: Open)

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch, 
> HIVE-20833.6.patch, HIVE-20833.7.patch
>
>
> Following test if run with TestMiniLlapLocalCliDriver will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>`auto_sourcing_flag` string,
>`cat_admin_auth_enabled_flag` string,
>`clm_document_number` string,
>`comm_rev_num` double,
>`consigned_consumption_flag` string,
>`consume_req_demand_flag` string,
>`conterms_articles_upd_date` timestamp,
>`conterms_deliv_upd_date` timestamp,
>`conterms_exist_flag` string,
>`cpa_reference` double,
>`created_language` string,
>`email_address` string,
>`enable_all_sites` string,
>`fax` string,
>`lock_owner_role` string,
>`lock_owner_user_id` double,
>`min_release_amount` double,
>`mrc_rate` string,
>`mrc_rate_date` string,
>`mrc_rate_type` string,
>`otm_recovery_flag` string,
>`otm_status_code` string,
>`pay_when_paid` string,
>`pcard_id` bigint,
>

[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Open  (was: Patch Available)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch, 
> HIVE-20842.6.patch, HIVE-20842.7.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the 
> logic did not account for partial and full group by separately.
> For a partial group by, parallelism (i.e. the number of tasks) should be 
> taken into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Attachment: HIVE-20842.7.patch

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch, 
> HIVE-20842.6.patch, HIVE-20842.7.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the 
> logic did not account for partial and full group by separately.
> For a partial group by, parallelism (i.e. the number of tasks) should be 
> taken into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20842:
---
Status: Patch Available  (was: Open)

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch, 
> HIVE-20842.6.patch, HIVE-20842.7.patch
>
>
> HIVE-20660 introduced better estimation for the group by operator, but the 
> logic did not account for partial and full group by separately.
> For a partial group by, parallelism (i.e. the number of tasks) should be 
> taken into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20160) Do Not Print StackTraces to STDERR in OperatorFactory

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20160:
---
Status: Patch Available  (was: Open)

> Do Not Print StackTraces to STDERR in OperatorFactory
> -
>
> Key: HIVE-20160
> URL: https://issues.apache.org/jira/browse/HIVE-20160
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20160.1.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java#L158
> {code}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(...
> {code}
> Do not print the stack trace.  The error is being wrapped in a HiveException. 
>  Allow the code catching this exception to print the error to a logger 
> instead of dumping it here to STDERR.  There are several instances of this in 
> the class.
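A minimal sketch of the cleanup (class and method names are illustrative, not Hive's actual code): the cause travels inside the wrapper exception, and nothing is written to STDERR at the throw site.

```java
public class WrapDemo {
    static class HiveException extends Exception {
        HiveException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Before: e.printStackTrace() followed by throw new HiveException(...).
    // After: just wrap; the catcher decides whether and where to log.
    static Object createInstance(String className) throws HiveException {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new HiveException("Error creating instance of " + className, e);
        }
    }

    public static void main(String[] args) {
        try {
            createInstance("no.such.Operator");
        } catch (HiveException e) {
            // The full cause chain is still available to whoever logs it.
            System.out.println("caught: " + e.getMessage());
            System.out.println("cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Nothing is lost by removing the printStackTrace() call: any caller can log the wrapped exception, stack trace included, through its own logger.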



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20160) Do Not Print StackTraces to STDERR in OperatorFactory

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20160:
---
Attachment: HIVE-20160.1.patch

> Do Not Print StackTraces to STDERR in OperatorFactory
> -
>
> Key: HIVE-20160
> URL: https://issues.apache.org/jira/browse/HIVE-20160
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20160.1.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java#L158
> {code}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(...
> {code}
> Do not print the stack trace.  The error is being wrapped in a HiveException. 
>  Allow the code catching this exception to print the error to a logger 
> instead of dumping it here to STDERR.  There are several instances of this in 
> the class.





[jira] [Updated] (HIVE-20161) Do Not Print StackTraces to STDERR in ParseDriver

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20161:
---
Attachment: HIVE-20161.1.patch

> Do Not Print StackTraces to STDERR in ParseDriver
> -
>
> Key: HIVE-20161
> URL: https://issues.apache.org/jira/browse/HIVE-20161
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20161.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
> {code}
> // Do not print stack trace to STDERR - remove this, just throw the 
> HiveException
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(e);
> }
> ...
> // Do not log and throw.  log *or* throw.  In this case, just throw. Remove 
> logging.
> // Remove explicit 'return' call. No need for it.
>   try {
> skewJoinKeyContext.endGroup();
>   } catch (IOException e) {
> LOG.error(e.getMessage(), e);
> throw new HiveException(e);
>   }
>   return;
> {code}





[jira] [Assigned] (HIVE-20160) Do Not Print StackTraces to STDERR in OperatorFactory

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-20160:
--

Assignee: BELUGA BEHR

> Do Not Print StackTraces to STDERR in OperatorFactory
> -
>
> Key: HIVE-20160
> URL: https://issues.apache.org/jira/browse/HIVE-20160
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20160.1.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java#L158
> {code}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(...
> {code}
> Do not print the stack trace.  The error is being wrapped in a HiveException. 
>  Allow the code catching this exception to print the error to a logger 
> instead of dumping it here to STDERR.  There are several instances of this in 
> the class.





[jira] [Updated] (HIVE-20161) Do Not Print StackTraces to STDERR in ParseDriver

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20161:
---
Target Version/s: 4.0.0
  Status: Patch Available  (was: Open)

I also made all logging {{debug}} level to be consistent across the class and 
changed the logger name.

> Do Not Print StackTraces to STDERR in ParseDriver
> -
>
> Key: HIVE-20161
> URL: https://issues.apache.org/jira/browse/HIVE-20161
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20161.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
> {code}
> // Do not print stack trace to STDERR - remove this, just throw the 
> HiveException
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(e);
> }
> ...
> // Do not log and throw.  log *or* throw.  In this case, just throw. Remove 
> logging.
> // Remove explicit 'return' call. No need for it.
>   try {
> skewJoinKeyContext.endGroup();
>   } catch (IOException e) {
> LOG.error(e.getMessage(), e);
> throw new HiveException(e);
>   }
>   return;
> {code}
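A minimal sketch of the suggested cleanup, with hypothetical stand-ins for skewJoinKeyContext.endGroup() and the surrounding operator: the catch block only wraps and throws (no LOG.error before the throw), and the trailing return is gone.

```java
import java.io.IOException;

// Hypothetical stand-in for org.apache.hadoop.hive.ql.metadata.HiveException.
class HiveException extends Exception {
  HiveException(Throwable cause) { super(cause); }
}

public class EndGroupDemo {
  // Stand-in for skewJoinKeyContext.endGroup(), which may fail with IOException.
  static void endGroup(boolean fail) throws IOException {
    if (fail) throw new IOException("flush failed");
  }

  // After the cleanup: throw *or* log, never both, and no explicit 'return'.
  static void closeGroup(boolean fail) throws HiveException {
    try {
      endGroup(fail);
    } catch (IOException e) {
      throw new HiveException(e); // the caller logs once, with the full cause
    }
  }

  public static void main(String[] args) throws HiveException {
    closeGroup(false); // succeeds silently
    try {
      closeGroup(true);
      throw new AssertionError("expected HiveException");
    } catch (HiveException e) {
      if (!(e.getCause() instanceof IOException)) throw new AssertionError();
    }
  }
}
```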





[jira] [Commented] (HIVE-16839) Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678774#comment-16678774
 ] 

Hive QA commented on HIVE-16839:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947197/HIVE-16839.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15527 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14796/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14796/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14796/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947197 - PreCommit-HIVE-Build

> Unbalanced calls to openTransaction/commitTransaction when alter the same 
> partition concurrently
> 
>
> Key: HIVE-16839
> URL: https://issues.apache.org/jira/browse/HIVE-16839
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 1.1.0
>Reporter: Nemon Lou
>Assignee: Guang Yang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16839.01.patch, HIVE-16839.02.patch, 
> HIVE-16839.03.patch
>
>
> SQL to reproduce:
> prepare:
> {noformat}
>  hdfs dfs -mkdir -p 
> /hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627
>  1,create external table tb_ltgsm_external (id int) PARTITIONED by (cp 
> string,ld string);
> {noformat}
> open one beeline and run these two SQL statements many times:
> {noformat} 2,ALTER TABLE tb_ltgsm_external ADD IF NOT EXISTS PARTITION 
> (cp=2017060513,ld=2017060610);
>  3,ALTER TABLE tb_ltgsm_external PARTITION (cp=2017060513,ld=2017060610) SET 
> LOCATION 
> 'hdfs://hacluster/hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627';
> {noformat}
> open another beeline to run this sql many times at the same time.
> {noformat}
>  4,ALTER TABLE tb_ltgsm_external DROP PARTITION (cp=2017060513,ld=2017060610);
> {noformat}
> MetaStore logs:
> {noformat}
> 2017-06-06 21:58:34,213 | ERROR | pool-6-thread-197 | Retrying HMSHandler 
> after 2000 ms (attempt 1 of 10) with error: 
> javax.jdo.JDOObjectNotFoundException: No such database row
> FailedObject:49[OID]org.apache.hadoop.hive.metastore.model.MStorageDescriptor
>   at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:475)
>   at 
> org.datanucleus.api.jdo.JDOAdapter.getApiExceptionForNucleusException(JDOAdapter.java:1158)
>   at 
> org.datanucleus.state.JDOStateManager.isLoaded(JDOStateManager.java:3231)
>   at 
> org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoGetcd(MStorageDescriptor.java)
>   at 
> org.apache.hadoop.hive.metastore.model.MStorageDescriptor.getCD(MStorageDescriptor.java:184)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1282)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1299)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.convertToPart(ObjectStore.java:1680)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartition(ObjectStore.java:1586)
>   at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98)
>   at com.sun.proxy.$Proxy0.getPartition(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:538)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions(HiveMetaStore.java:3317)
>   at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at com.sun.proxy.$Proxy12.alter_partitions(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions.getResult(ThriftHiveMetastore.java:9963)
>   at 
> 

[jira] [Assigned] (HIVE-20161) Do Not Print StackTraces to STDERR in ParseDriver

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-20161:
--

Assignee: BELUGA BEHR

> Do Not Print StackTraces to STDERR in ParseDriver
> -
>
> Key: HIVE-20161
> URL: https://issues.apache.org/jira/browse/HIVE-20161
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
> {code}
> // Do not print stack trace to STDERR - remove this, just throw the 
> HiveException
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(e);
> }
> ...
> // Do not log and throw.  log *or* throw.  In this case, just throw. Remove 
> logging.
> // Remove explicit 'return' call. No need for it.
>   try {
> skewJoinKeyContext.endGroup();
>   } catch (IOException e) {
> LOG.error(e.getMessage(), e);
> throw new HiveException(e);
>   }
>   return;
> {code}





[jira] [Updated] (HIVE-20223) SmallTableCache.java SLF4J Parameterized Logging

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20223:
---
Status: Patch Available  (was: Open)

> SmallTableCache.java SLF4J Parameterized Logging
> 
>
> Key: HIVE-20223
> URL: https://issues.apache.org/jira/browse/HIVE-20223
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
>  Labels: newbie, noob
> Attachments: HIVE-20223.1.patch
>
>
> {code:java|title=org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java}
> if (LOG.isDebugEnabled()) {
> LOG.debug("Cleaned up small table cache for query " + queryId);
> }
> if (tableContainerMap.putIfAbsent(path, tableContainer) == null && 
> LOG.isDebugEnabled()) {
>   LOG.debug("Cached small table file " + path + " for query " + queryId);
> }
> if (tableContainer != null && LOG.isDebugEnabled()) {
>   LOG.debug("Loaded small table file " + path + " from cache for query " 
> + queryId);
> }
> {code}
>  
> Remove {{isDebugEnabled}} and replace with parameterized logging.
> https://www.slf4j.org/faq.html#logging_performance
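To illustrate why the {{isDebugEnabled}} guard becomes redundant, here is a tiny stand-in that mimics SLF4J's {} placeholder semantics (illustrative only, not the real SLF4J API): the message is only formatted once the level check inside the logger itself passes.

```java
// Minimal stand-in mimicking SLF4J parameterized logging, to show that the
// formatting cost is only paid when the level is enabled.
public class ParamLogDemo {
  static boolean debugEnabled = false;
  static int formatCalls = 0;

  // Substitute each {} in the template with the next argument.
  static String format(String template, Object... args) {
    formatCalls++;
    StringBuilder sb = new StringBuilder();
    int argIdx = 0, from = 0, at;
    while ((at = template.indexOf("{}", from)) >= 0) {
      sb.append(template, from, at).append(args[argIdx++]);
      from = at + 2;
    }
    return sb.append(template.substring(from)).toString();
  }

  static void debug(String template, Object... args) {
    if (debugEnabled) {               // the level check lives inside the logger
      System.out.println(format(template, args));
    }
  }

  public static void main(String[] args) {
    debug("Cached small table file {} for query {}", "/tmp/t", "q1");
    if (formatCalls != 0) throw new AssertionError("formatted while disabled");
    debugEnabled = true;
    debug("Cleaned up small table cache for query {}", "q1");
    if (formatCalls != 1) throw new AssertionError();
  }
}
```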





[jira] [Updated] (HIVE-20223) SmallTableCache.java SLF4J Parameterized Logging

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20223:
---
Attachment: HIVE-20223.1.patch

> SmallTableCache.java SLF4J Parameterized Logging
> 
>
> Key: HIVE-20223
> URL: https://issues.apache.org/jira/browse/HIVE-20223
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
>  Labels: newbie, noob
> Attachments: HIVE-20223.1.patch
>
>
> {code:java|title=org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java}
> if (LOG.isDebugEnabled()) {
> LOG.debug("Cleaned up small table cache for query " + queryId);
> }
> if (tableContainerMap.putIfAbsent(path, tableContainer) == null && 
> LOG.isDebugEnabled()) {
>   LOG.debug("Cached small table file " + path + " for query " + queryId);
> }
> if (tableContainer != null && LOG.isDebugEnabled()) {
>   LOG.debug("Loaded small table file " + path + " from cache for query " 
> + queryId);
> }
> {code}
>  
> Remove {{isDebugEnabled}} and replace with parameterized logging.
> https://www.slf4j.org/faq.html#logging_performance





[jira] [Assigned] (HIVE-20223) SmallTableCache.java SLF4J Parameterized Logging

2018-11-07 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-20223:
--

Assignee: BELUGA BEHR

> SmallTableCache.java SLF4J Parameterized Logging
> 
>
> Key: HIVE-20223
> URL: https://issues.apache.org/jira/browse/HIVE-20223
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
>  Labels: newbie, noob
> Attachments: HIVE-20223.1.patch
>
>
> {code:java|title=org/apache/hadoop/hive/ql/exec/spark/SmallTableCache.java}
> if (LOG.isDebugEnabled()) {
> LOG.debug("Cleaned up small table cache for query " + queryId);
> }
> if (tableContainerMap.putIfAbsent(path, tableContainer) == null && 
> LOG.isDebugEnabled()) {
>   LOG.debug("Cached small table file " + path + " for query " + queryId);
> }
> if (tableContainer != null && LOG.isDebugEnabled()) {
>   LOG.debug("Loaded small table file " + path + " from cache for query " 
> + queryId);
> }
> {code}
>  
> Remove {{isDebugEnabled}} and replace with parameterized logging.
> https://www.slf4j.org/faq.html#logging_performance





[jira] [Commented] (HIVE-16839) Unbalanced calls to openTransaction/commitTransaction when alter the same partition concurrently

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678727#comment-16678727
 ] 

Hive QA commented on HIVE-16839:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
3s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14796/dev-support/hive-personality.sh
 |
| git revision | master / 6d713b6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14796/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Unbalanced calls to openTransaction/commitTransaction when alter the same 
> partition concurrently
> 
>
> Key: HIVE-16839
> URL: https://issues.apache.org/jira/browse/HIVE-16839
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 1.1.0
>Reporter: Nemon Lou
>Assignee: Guang Yang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16839.01.patch, HIVE-16839.02.patch, 
> HIVE-16839.03.patch
>
>
> SQL to reproduce:
> prepare:
> {noformat}
>  hdfs dfs -mkdir -p 
> /hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627
>  1,create external table tb_ltgsm_external (id int) PARTITIONED by (cp 
> string,ld string);
> {noformat}
> open one beeline and run these two SQL statements many times:
> {noformat} 2,ALTER TABLE tb_ltgsm_external ADD IF NOT EXISTS PARTITION 
> (cp=2017060513,ld=2017060610);
>  3,ALTER TABLE tb_ltgsm_external PARTITION (cp=2017060513,ld=2017060610) SET 
> LOCATION 
> 'hdfs://hacluster/hzsrc/external/writing_dc/ltgsm/16e7a9b2-21a1-3f4f-8061-bc3395281627';
> {noformat}
> open another beeline to run this sql many times at the same time.
> {noformat}
>  4,ALTER TABLE tb_ltgsm_external DROP PARTITION (cp=2017060513,ld=2017060610);
> {noformat}
> MetaStore logs:
> {noformat}
> 2017-06-06 21:58:34,213 | ERROR | pool-6-thread-197 | Retrying HMSHandler 
> after 2000 ms (attempt 1 of 10) with error: 
> javax.jdo.JDOObjectNotFoundException: No such 

[jira] [Comment Edited] (HIVE-8131) Support timestamp in Avro

2018-11-07 Thread vinisha (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678651#comment-16678651
 ] 

vinisha edited comment on HIVE-8131 at 11/7/18 7:50 PM:


This change only supports timestamp-millis. Avro 1.8.2 also supports 
timestamp-micros. 
[https://avro.apache.org/docs/1.8.2/spec.html#Timestamp+%28microsecond+precision%29]

timestamp-micros should also be supported in the Hive AvroSerde because Hive 
timestamps support nanosecond-level precision.

[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps]

One possibility is to support avro timestamp-millis and avro timestamp-micros 
in serialization. Avro Deserializer can map hive timestamp to timestamp-micros. 

What do you think [~brocknoland] [~xu] [~leftylev]
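The precision gap can be made concrete with a small conversion sketch (illustrative only, not Hive or Avro code): Hive timestamps carry nanoseconds, so mapping to Avro's timestamp-micros truncates the last three digits, and timestamp-millis truncates six.

```java
// Illustrative epoch-offset conversions between the three precisions.
public class TimestampPrecisionDemo {
  static long nanosToMicros(long epochNanos) {
    return Math.floorDiv(epochNanos, 1_000L);       // drops 3 digits
  }

  static long nanosToMillis(long epochNanos) {
    return Math.floorDiv(epochNanos, 1_000_000L);   // drops 6 digits
  }

  public static void main(String[] args) {
    long nanos = 1_541_620_225_123_456_789L;
    if (nanosToMicros(nanos) != 1_541_620_225_123_456L) throw new AssertionError();
    if (nanosToMillis(nanos) != 1_541_620_225_123L) throw new AssertionError();
  }
}
```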

 


was (Author: vinisha):
This change only supports timestamp-millis. Avro 1.8.2 also supports 
timestamp-micros. 
[https://avro.apache.org/docs/1.8.2/spec.html#Timestamp+%28microsecond+precision%29]

timestamp-micros should also be supported in hive AvroSerde because hive 
timestamps support nano second level precision.

[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps]

One possibility is to support avro timestamp-millis and avro timestamp-micros 
in serialization. Avro Deserializer can map hive timestamp to timestamp-micros. 

What do you think [~brocknoland] [~xu]

 

> Support timestamp in Avro
> -
>
> Key: HIVE-8131
> URL: https://issues.apache.org/jira/browse/HIVE-8131
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>Priority: Major
>  Labels: TODOC15
> Fix For: 1.1.0
>
> Attachments: HIVE-8131.1.patch, HIVE-8131.patch, HIVE-8131.patch
>
>






[jira] [Commented] (HIVE-20823) Make Compactor run in a transaction

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678703#comment-16678703
 ] 

Hive QA commented on HIVE-20823:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947174/HIVE-20823.04.patch

{color:green}SUCCESS:{color} +1 due to 10 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 55 failed/errored test(s), 15520 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestTxnCommands3.testCleaner2 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommandsForMmTable.testSnapshotIsolationWithAbortedTxnOnMmTable
 (batchId=283)
org.apache.hadoop.hive.ql.TestTxnCommandsForOrcMmTable.testSnapshotIsolationWithAbortedTxnOnMmTable
 (batchId=305)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidBootstrapReplLoadRetryAfterFailure
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidTablesBootstrap
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidTablesBootstrapWithConcurrentWrites
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidTablesBootstrapWithOpenTxnsTimeout
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidTablesMoveOptimizationBootStrap
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTables.testAcidBootstrapReplLoadRetryAfterFailure
 (batchId=247)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTables.testAcidTablesBootstrap
 (batchId=247)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTables.testAcidTablesBootstrapWithConcurrentWrites
 (batchId=247)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTables.testAcidTablesBootstrapWithOpenTxnsTimeout
 (batchId=247)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTables.testAcidTablesMoveOptimizationBootStrap
 (batchId=247)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningDelete[0]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningDelete[1]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningInsert[0]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningInsert[1]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningUpdate[0]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningUpdate[1]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningInsert[0]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningInsert[1]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningUpdate[0]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningUpdate[1]
 (batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testTableProperties[0] 
(batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testTableProperties[1] 
(batchId=243)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.chooseMajorOverMinorWhenBothValid
 (batchId=293)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.compactPartitionHighDeltaPct
 (batchId=293)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.compactPartitionTooManyDeltas
 (batchId=293)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.compactTableHighDeltaPct 
(batchId=293)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.compactTableTooManyDeltas 
(batchId=293)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.enoughDeltasNoBase 
(batchId=293)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.noCompactTableDeltaPctNotHighEnough
 (batchId=293)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.noCompactTableNotEnoughDeltas
 (batchId=293)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.twoTxnsOnSamePartitionGenerateOneCompactionRequest
 (batchId=293)
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
 (batchId=217)
org.apache.hive.hcatalog.streaming.TestStreaming.testMultipleTransactionBatchCommits
 (batchId=217)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort 
(batchId=217)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbortAndCommit
 (batchId=217)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Delimited
 (batchId=217)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_DelimitedUGI
 (batchId=217)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
 (batchId=217)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Regex
 (batchId=217)

[jira] [Updated] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20804:
---
Attachment: HIVE-20804.7.patch

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch, HIVE-20804.5.patch, 
> HIVE-20804.6.patch, HIVE-20804.7.patch
>
>
> Continuation of HIVE-17043





[jira] [Updated] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20804:
---
Status: Open  (was: Patch Available)

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch, HIVE-20804.5.patch, 
> HIVE-20804.6.patch, HIVE-20804.7.patch
>
>
> Continuation of HIVE-17043





[jira] [Commented] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-07 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678684#comment-16678684
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20512:
-

{{awaitTermination}} causes a delay by blocking the main thread. Since we don't 
actually care whether loggerThread runs during close, we avoid 
{{awaitTermination}} by canceling the scheduledFuture after shutdown, as 
suggested by Sahil.
Attaching HIVE-20512.8.patch with this change to see if the tests pass.

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
> // A very simple counter to keep track of number of rows processed by the
> // reducer. It dumps
> // every 1 million times, and quickly before that
> if (currentThreshold >= 100) {
>   return currentThreshold + 100;
> }
> return 10 * currentThreshold;
>   }
> {code}
> The issue is that after a while, the increase by 10x factor means that you 
> have to process a huge # of records before this gets triggered.
> A better approach would be to log this info at a given interval. This would 
> help in debugging tasks that are seemingly hung.
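A sketch of the interval-based alternative (hypothetical names, not the actual SparkRecordHandler code): a scheduled heartbeat reports progress at a fixed period regardless of how many records have arrived, and on close the future is cancelled rather than awaited, per the discussion above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class IntervalLogDemo {
  public static void main(String[] args) throws Exception {
    AtomicLong rows = new AtomicLong();
    ScheduledExecutorService logger = Executors.newSingleThreadScheduledExecutor();

    // Heartbeat fires on a fixed period, so even a hung task keeps logging.
    ScheduledFuture<?> heartbeat = logger.scheduleAtFixedRate(
        () -> System.out.println("processed " + rows.get() + " rows"),
        0, 50, TimeUnit.MILLISECONDS);

    for (int i = 0; i < 1_000; i++) {   // simulated record-processing loop
      rows.incrementAndGet();
    }
    Thread.sleep(120);                  // let at least one heartbeat fire

    // Cancel the future instead of awaitTermination, so close() does not
    // block the main thread on the logging task.
    heartbeat.cancel(false);
    logger.shutdown();
    if (rows.get() != 1_000) throw new AssertionError();
  }
}
```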





[jira] [Updated] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20804:
---
Status: Patch Available  (was: Open)

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch, HIVE-20804.5.patch, 
> HIVE-20804.6.patch, HIVE-20804.7.patch
>
>
> Continuation of HIVE-17043





[jira] [Comment Edited] (HIVE-8131) Support timestamp in Avro

2018-11-07 Thread vinisha (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678651#comment-16678651
 ] 

vinisha edited comment on HIVE-8131 at 11/7/18 7:38 PM:


This change only supports timestamp-millis. Avro 1.8.2 also supports 
timestamp-micros. 
[https://avro.apache.org/docs/1.8.2/spec.html#Timestamp+%28microsecond+precision%29]

timestamp-micros should also be supported in Hive's AvroSerde, because Hive 
timestamps support nanosecond-level precision.

[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps]

One possibility is to support both Avro timestamp-millis and timestamp-micros 
in serialization; the Avro deserializer can map Hive timestamps to 
timestamp-micros.

What do you think, [~brocknoland] [~xu]?

 


was (Author: vinisha):
This change only supports timestamp-millis. Avro 1.8.2 also supports 
timestamp-micros. 
[https://avro.apache.org/docs/1.8.2/spec.html#Timestamp+%28microsecond+precision%29]

timestamp-micros should also be supported in Hive's AvroSerde, because Hive 
timestamps support nanosecond-level precision.

[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps]

One possibility is to support both Avro timestamp-millis and timestamp-micros 
in serialization; the Avro deserializer can map Hive timestamps to 
timestamp-micros.

 

> Support timestamp in Avro
> -
>
> Key: HIVE-8131
> URL: https://issues.apache.org/jira/browse/HIVE-8131
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>Priority: Major
>  Labels: TODOC15
> Fix For: 1.1.0
>
> Attachments: HIVE-8131.1.patch, HIVE-8131.patch, HIVE-8131.patch
>
>






[jira] [Comment Edited] (HIVE-8131) Support timestamp in Avro

2018-11-07 Thread vinisha (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678651#comment-16678651
 ] 

vinisha edited comment on HIVE-8131 at 11/7/18 7:37 PM:


This change only supports timestamp-millis. Avro 1.8.2 also supports 
timestamp-micros. 
[https://avro.apache.org/docs/1.8.2/spec.html#Timestamp+%28microsecond+precision%29]

timestamp-micros should also be supported in Hive's AvroSerde, because Hive 
timestamps support nanosecond-level precision.

[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps]

One possibility is to support both Avro timestamp-millis and timestamp-micros 
in serialization; the Avro deserializer can map Hive timestamps to 
timestamp-micros.

 


was (Author: vinisha):
This change only supports timestamp-millis. Avro 1.8.2 also supports 
timestamp-micros. 
[https://avro.apache.org/docs/1.8.2/spec.html#Timestamp+%28microsecond+precision%29]

timestamp-micros should also be supported in Hive's AvroSerde, because Hive 
timestamps support nanosecond-level precision.

[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps]

> Support timestamp in Avro
> -
>
> Key: HIVE-8131
> URL: https://issues.apache.org/jira/browse/HIVE-8131
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>Priority: Major
>  Labels: TODOC15
> Fix For: 1.1.0
>
> Attachments: HIVE-8131.1.patch, HIVE-8131.patch, HIVE-8131.patch
>
>






[jira] [Updated] (HIVE-20512) Improve record and memory usage logging in SparkRecordHandler

2018-11-07 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20512:

Attachment: HIVE-20512.8.patch

> Improve record and memory usage logging in SparkRecordHandler
> -
>
> Key: HIVE-20512
> URL: https://issues.apache.org/jira/browse/HIVE-20512
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20512.1.patch, HIVE-20512.2.patch, 
> HIVE-20512.3.patch, HIVE-20512.4.patch, HIVE-20512.5.patch, 
> HIVE-20512.6.patch, HIVE-20512.7.patch, HIVE-20512.8.patch
>
>
> We currently log memory usage and # of records processed in Spark tasks, but 
> we should improve the methodology for how frequently we log this info. 
> Currently we use the following code:
> {code:java}
> private long getNextLogThreshold(long currentThreshold) {
>   // A very simple counter to keep track of the number of rows processed by
>   // the reducer. It dumps every 1 million rows, and more quickly before that.
>   if (currentThreshold >= 1000000) {
>     return currentThreshold + 1000000;
>   }
>   return 10 * currentThreshold;
> }
> {code}
> The issue is that, after a while, the 10x growth factor means a huge number 
> of records must be processed before the next log line is triggered.
> A better approach would be to log this info at a fixed interval. This would 
> help in debugging tasks that appear to be hung.
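The fixed-interval approach suggested in the description could be sketched as follows. This is a minimal illustration only; `IntervalLogger` and its members are hypothetical names for this sketch, not Hive's actual SparkRecordHandler API:

```java
// Hypothetical sketch: log at a fixed wall-clock interval rather than at an
// exponentially growing record-count threshold.
public class IntervalLogger {
    private final long intervalMs;
    private long lastLogTime;

    public IntervalLogger(long intervalMs) {
        this.intervalMs = intervalMs;
        this.lastLogTime = System.currentTimeMillis();
    }

    /** Returns true when at least intervalMs has elapsed since the last log. */
    public boolean shouldLog() {
        long now = System.currentTimeMillis();
        if (now - lastLogTime >= intervalMs) {
            lastLogTime = now;
            return true;
        }
        return false;
    }
}
```

The record-processing loop would call `shouldLog()` once per record and emit the memory/record counters whenever it returns true, so a hung task still produces a log line every interval.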





[jira] [Commented] (HIVE-20823) Make Compactor run in a transaction

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678670#comment-16678670
 ] 

Hive QA commented on HIVE-20823:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
25s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
15s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
44s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
5s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
49s{color} | {color:red} ql in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} storage-api: The patch generated 0 new + 7 unchanged 
- 1 fixed = 7 total (was 8) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} The patch metastore-common passed checkstyle {color} 
|
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
50s{color} | {color:red} ql: The patch generated 42 new + 1633 unchanged - 13 
fixed = 1675 total (was 1646) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} The patch metastore-server passed checkstyle {color} 
|
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
50s{color} | {color:red} ql generated 1 new + 2314 unchanged - 1 fixed = 2315 
total (was 2315) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
12s{color} | {color:red} standalone-metastore/metastore-server generated 1 new 
+ 184 unchanged - 1 fixed = 185 total (was 185) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Exception is caught when Exception is not thrown in 
org.apache.hadoop.hive.ql.txn.compactor.Cleaner.clean(CompactionInfo, long)  At 
Cleaner.java:is not thrown in 
org.apache.hadoop.hive.ql.txn.compactor.Cleaner.clean(CompactionInfo, long)  At 
Cleaner.java:[line 210] |
| FindBugs | module:standalone-metastore/metastore-server |
|  |  
org.apache.hadoop.hive.metastore.txn.CompactionTxnHandler.findMinOpenTxnGLB(Statement)
 may fail to clean up java.sql.ResultSet  Obligation to clean up resource 
created at CompactionTxnHandler.java:up java.sql.ResultSet  Obligation to clean 
up resource created at CompactionTxnHandler.java:[line 356] is not 

[jira] [Updated] (HIVE-20838) Timestamps with timezone are set to null when using the streaming API

2018-11-07 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20838:
---
Attachment: HIVE-20838.8.patch
Status: Patch Available  (was: Open)

> Timestamps with timezone are set to null when using the streaming API
> -
>
> Key: HIVE-20838
> URL: https://issues.apache.org/jira/browse/HIVE-20838
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20838.1.patch, HIVE-20838.2.patch, 
> HIVE-20838.3.patch, HIVE-20838.3.patch, HIVE-20838.4.patch, 
> HIVE-20838.5.patch, HIVE-20838.6.patch, HIVE-20838.7.patch, HIVE-20838.8.patch
>
>
> For example:
> {code}
> beeline> create table default.timest (a TIMESTAMP) stored as orc 
> TBLPROPERTIES('transactional'='true')
> # And then:
> connection.write("2018-10-19 10:35:00 America/Los_Angeles".getBytes());
> {code}
> inserts NULL.





[jira] [Updated] (HIVE-20838) Timestamps with timezone are set to null when using the streaming API

2018-11-07 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20838:
---
Status: Open  (was: Patch Available)

> Timestamps with timezone are set to null when using the streaming API
> -
>
> Key: HIVE-20838
> URL: https://issues.apache.org/jira/browse/HIVE-20838
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20838.1.patch, HIVE-20838.2.patch, 
> HIVE-20838.3.patch, HIVE-20838.3.patch, HIVE-20838.4.patch, 
> HIVE-20838.5.patch, HIVE-20838.6.patch, HIVE-20838.7.patch, HIVE-20838.8.patch
>
>
> For example:
> {code}
> beeline> create table default.timest (a TIMESTAMP) stored as orc 
> TBLPROPERTIES('transactional'='true')
> # And then:
> connection.write("2018-10-19 10:35:00 America/Los_Angeles".getBytes());
> {code}
> inserts NULL.





[jira] [Commented] (HIVE-8131) Support timestamp in Avro

2018-11-07 Thread vinisha (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678651#comment-16678651
 ] 

vinisha commented on HIVE-8131:
---

This change only supports timestamp-millis. Avro 1.8.2 also supports 
timestamp-micros. 
[https://avro.apache.org/docs/1.8.2/spec.html#Timestamp+%28microsecond+precision%29]

timestamp-micros should also be supported in Hive's AvroSerde, because Hive 
timestamps support nanosecond-level precision.

[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps]
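Per the Avro 1.8.2 spec linked above, timestamp-micros is a logical type that annotates a long. A minimal schema using it would look like this (field and record names are illustrative):

```json
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "ts",
     "type": {"type": "long", "logicalType": "timestamp-micros"}}
  ]
}
```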

> Support timestamp in Avro
> -
>
> Key: HIVE-8131
> URL: https://issues.apache.org/jira/browse/HIVE-8131
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
>Priority: Major
>  Labels: TODOC15
> Fix For: 1.1.0
>
> Attachments: HIVE-8131.1.patch, HIVE-8131.patch, HIVE-8131.patch
>
>






[jira] [Updated] (HIVE-20868) SMB Join fails intermittently when TezDummyOperator has child op in getFinalOp in MapRecordProcessor

2018-11-07 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-20868:
---
Description: 
In MapRecordProcessor::getFinalOp(), due to an external cause (not yet known), 
the TezDummyStoreOperator may intermittently have a MergeJoin operator as a 
child. Because of this, fetchDone remains set to true for the DummyOp, left 
over from the previous task; ideally, fetchDone should be reset for each task. 
This eventually leads the join operator to skip rows from that dummy op, 
producing wrong results.

Good init order

{code}
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp 
child Ops = TS[3] (core)
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp 
child Ops = FIL[24]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp 
child Ops = SEL[5]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp 
child Ops = DUMMY_STORE[45]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: Iterating 
children of dummy op DUMMY_STORE[45]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: getFinalOp 
returns DUMMY_STORE[45]
2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: 
InitProcessor : setting fetchDone to false
{code}

Bad init order 

{code}
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp 
child Ops = TS[3] (core)
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp 
child Ops = FIL[24]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp 
child Ops = SEL[5]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp 
child Ops = DUMMY_STORE[45]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  Iterating 
children of dummy op DUMMY_STORE[45]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  Child of 
Dummy Op MERGEJOIN[44]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp 
child Ops = MERGEJOIN[44]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp 
child Ops = SEL[13]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp 
child Ops = RS[14]
2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  getFinalOp 
returns RS[14]
{code}

  was:In MapRecordProcessor::getFinalOp(), due to an external cause (not yet 
known), the TezDummyStoreOperator may intermittently have a MergeJoin operator 
as a child. Because of this, fetchDone remains set to true for the DummyOp, 
left over from the previous task; ideally, fetchDone should be reset for each 
task. This eventually leads the join operator to skip rows from that dummy op, 
producing wrong results.


> SMB Join fails intermittently when TezDummyOperator has child op in 
> getFinalOp in MapRecordProcessor
> 
>
> Key: HIVE-20868
> URL: https://issues.apache.org/jira/browse/HIVE-20868
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20868.1.patch
>
>
> In MapRecordProcessor::getFinalOp(), due to an external cause (not yet 
> known), the TezDummyStoreOperator may intermittently have a MergeJoin 
> operator as a child. Because of this, fetchDone remains set to true for the 
> DummyOp, left over from the previous task; ideally, fetchDone should be reset 
> for each task. This eventually leads the join operator to skip rows from that 
> dummy op, producing wrong results.
> Good init order
> {code}
> 2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: 
> getFinalOp child Ops = TS[3] (core)
> 2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: 
> getFinalOp child Ops = FIL[24]
> 2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: 
> getFinalOp child Ops = SEL[5]
> 2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: 
> getFinalOp child Ops = DUMMY_STORE[45]
> 2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: Iterating 
> children of dummy op DUMMY_STORE[45]
> 2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: 
> getFinalOp returns DUMMY_STORE[45]
> 2018-11-01 21:42:33,677 [INFO] [TezChild] |tez.MapRecordProcessor|: 
> InitProcessor : setting fetchDone to false
> {code}
> Bad init order 
> {code}
> 2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  
> getFinalOp child Ops = TS[3] (core)
> 2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  
> getFinalOp child Ops = FIL[24]
> 2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  
> getFinalOp child Ops = SEL[5]
> 2018-11-01 21:42:33,304 [INFO] [TezChild] |tez.MapRecordProcessor|:  
> getFinalOp 

[jira] [Commented] (HIVE-20804) Further improvements to group by optimization with constraints

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678618#comment-16678618
 ] 

Hive QA commented on HIVE-20804:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947161/HIVE-20804.6.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14794/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14794/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14794/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12947161/HIVE-20804.6.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947161 - PreCommit-HIVE-Build

> Further improvements to group by optimization with constraints
> --
>
> Key: HIVE-20804
> URL: https://issues.apache.org/jira/browse/HIVE-20804
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20804.1.patch, HIVE-20804.2.patch, 
> HIVE-20804.3.patch, HIVE-20804.4.patch, HIVE-20804.5.patch, HIVE-20804.6.patch
>
>
> Continuation of HIVE-17043





[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678616#comment-16678616
 ] 

Hive QA commented on HIVE-20740:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947160/HIVE-20740.10.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15525 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=196)

[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14793/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14793/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14793/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947160 - PreCommit-HIVE-Build

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
> @Override
>   @SuppressWarnings("nls")
>   public void setConf(Configuration conf) {
> // Although an instance of ObjectStore is accessed by one thread, there 
> may
> // be many threads with ObjectStore instances. So the static variables
> // pmf and prop need to be protected with locks.
> pmfPropLock.lock();
> try {
>   isInitialized = false;
>   this.conf = conf;
>   this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, 
> ConfVars.HIVE_TXN_STATS_ENABLED);
>   configureSSL(conf);
>   Properties propsFromConf = getDataSourceProps(conf);
>   boolean propsChanged = !propsFromConf.equals(prop);
>   if (propsChanged) {
> if (pmf != null){
>   clearOutPmfClassLoaderCache(pmf);
>   if (!forTwoMetastoreTesting) {
> // close the underlying connection pool to avoid leaks
> pmf.close();
>   }
> }
> pmf = null;
> prop = null;
>   }
>   assert(!isActiveTransaction());
>   shutdown();
>   // Always want to re-create pm as we don't know if it were created by 
> the
>   // most recent instance of the pmf
>   pm = null;
>   directSql = null;
>   expressionProxy = null;
>   openTrasactionCalls = 0;
>   currentTransaction = null;
>   transactionStatus = TXN_STATUS.NO_STATE;
>   initialize(propsFromConf);
>   String partitionValidationRegex =
>   MetastoreConf.getVar(this.conf, 
> ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>   if (partitionValidationRegex != null && 
> !partitionValidationRegex.isEmpty()) {
> partitionValidationPattern = 
> Pattern.compile(partitionValidationRegex);
>   } else {
> partitionValidationPattern = null;
>   }
>   // Note, if metrics have not been initialized this will return null, 
> which means we aren't
>   // using metrics.  Thus we should always check whether this is non-null 
> before using.
>   MetricRegistry registry = Metrics.getRegistry();
>   if (registry != null) {
> directSqlErrors = 
> Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>   }
>   this.batchSize = MetastoreConf.getIntVar(conf, 
> ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>   if (!isInitialized) {
> throw new RuntimeException(
> "Unable to create persistence manager. Check dss.log for details");
>   } else {
> LOG.debug("Initialized ObjectStore");
>   }
> } finally {
>   pmfPropLock.unlock();
> }
>   }
> {code}
> The {{pmfPropLock}} is a static object, and it blocks any new connection to 
> HMS that is trying to instantiate an ObjectStore. We should either remove the 
> lock or reduce its scope so that it is held for only a very short time.




[jira] [Commented] (HIVE-20838) Timestamps with timezone are set to null when using the streaming API

2018-11-07 Thread Jaume M (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678615#comment-16678615
 ] 

Jaume M commented on HIVE-20838:


Subtle bug: {{catch (IllegalArgumentException | DateTimeParseException eTZ)}} 
should be {{catch (IllegalArgumentException | DateTimeException eTZ)}}. Pretty 
nice that the tests caught it.
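The distinction matters because zone-lookup failures in java.time raise ZoneRulesException, a subclass of DateTimeException but not of DateTimeParseException, so the narrower catch clause lets them escape. A small stand-alone illustration (class and method names are mine, not from the patch):

```java
import java.time.DateTimeException;
import java.time.ZoneId;
import java.time.format.DateTimeParseException;

public class ZoneCatchDemo {
    // An unknown region id throws ZoneRulesException, which extends
    // DateTimeException but NOT DateTimeParseException, so a catch clause
    // listing only DateTimeParseException would let it propagate.
    public static String classify(String zone) {
        try {
            ZoneId.of(zone);
            return "ok";
        } catch (DateTimeParseException e) {
            return "parse";
        } catch (DateTimeException e) {
            return "datetime";
        }
    }
}
```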

> Timestamps with timezone are set to null when using the streaming API
> -
>
> Key: HIVE-20838
> URL: https://issues.apache.org/jira/browse/HIVE-20838
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20838.1.patch, HIVE-20838.2.patch, 
> HIVE-20838.3.patch, HIVE-20838.3.patch, HIVE-20838.4.patch, 
> HIVE-20838.5.patch, HIVE-20838.6.patch, HIVE-20838.7.patch
>
>
> For example:
> {code}
> beeline> create table default.timest (a TIMESTAMP) stored as orc 
> TBLPROPERTIES('transactional'='true')
> # And then:
> connection.write("2018-10-19 10:35:00 America/Los_Angeles".getBytes());
> {code}
> inserts NULL.





[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-07 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678612#comment-16678612
 ] 

Ashutosh Chauhan commented on HIVE-20842:
-

+1

> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch
>
>
> HIVE-20660 introduced better estimation for group by operator. But the logic 
> did not account for Partial and Full group by separately.
> For partial group by parallelism (i.e. number of tasks) should be taken into 
> account.





[jira] [Updated] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-07 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20853:
---
Attachment: HIVE-20853.5.patch
Status: Patch Available  (was: Open)

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch, 
> HIVE-20853.3.patch, HIVE-20853.4.patch, HIVE-20853.5.patch
>
>
> Currently, DAGs are only registered when submitWork is called for that DAG. 
> At that point the credentials are added to the ShuffleHandler and it can 
> start serving.
> However, Tez might (and will) schedule tasks to fetch from the ShuffleHandler 
> before any of this happens, and all of those tasks will fail, which may 
> result in the query failing.
> This happens in the scenario where an LlapDaemon has just come up and Tez 
> fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow the AM to register the DAG with the daemon when it 
> notices that a new daemon is up.





[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678580#comment-16678580
 ] 

Hive QA commented on HIVE-20740:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
39s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
4s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
35s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
31s{color} | {color:red} ql in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
19s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 608 
unchanged - 0 fixed = 609 total (was 608) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
9s{color} | {color:red} standalone-metastore/metastore-server generated 1 new + 
183 unchanged - 2 fixed = 184 total (was 185) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 32s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  
org.apache.hadoop.hive.metastore.PersistenceManagerProvider.updatePmfProperties(Configuration)
 does not release lock on all paths  At PersistenceManagerProvider.java:on all 
paths  At PersistenceManagerProvider.java:[line 152] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14793/dev-support/hive-personality.sh
 |
| git revision | master / 6d713b6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14793/yetus/patch-mvninstall-itests_hive-unit.txt
 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14793/yetus/patch-mvninstall-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14793/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14793/yetus/new-findbugs-standalone-metastore_metastore-server.html
 |
| modules | C: itests/hive-unit ql standalone-metastore/metastore-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14793/yetus.txt |
| Powered by | Apache Yetus

[jira] [Commented] (HIVE-20862) QueryId no longer shows up in the logs

2018-11-07 Thread Vaibhav Gumashta (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678581#comment-16678581
 ] 

Vaibhav Gumashta commented on HIVE-20862:
-

+1

> QueryId no longer shows up in the logs
> --
>
> Key: HIVE-20862
> URL: https://issues.apache.org/jira/browse/HIVE-20862
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20862.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20813) udf to_epoch_milli need to support timestamp without time zone as well

2018-11-07 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20813:
--
Attachment: HIVE-20813.patch

> udf to_epoch_milli need to support timestamp without time zone as well
> --
>
> Key: HIVE-20813
> URL: https://issues.apache.org/jira/browse/HIVE-20813
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20813.patch, HIVE-20813.patch, HIVE-20813.patch, 
> HIVE-20813.patch
>
>
> Currently the following query will fail with a cast exception (it tries to
> cast timestamp to timestamp with local time zone).
> {code}
>  select to_epoch_milli(current_timestamp)
> {code}
> As a simple fix, we need to add support for the timestamp object inspector.





[jira] [Updated] (HIVE-20884) Bootstrap of tables to target with hive.strict.managed.tables enabled.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20884:

Summary: Bootstrap of tables to target with hive.strict.managed.tables 
enabled.  (was: Support bootstrap of tables to target with 
hive.strict.managed.tables enabled.)

> Bootstrap of tables to target with hive.strict.managed.tables enabled.
> --
>
> Key: HIVE-20884
> URL: https://issues.apache.org/jira/browse/HIVE-20884
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR
>
> Hive2 supports replication of managed tables. But in Hive3, some of these
> managed tables are converted to ACID or MM tables. Also, some of them are
> converted to external tables based on the rules below.
>  # Avro format, storage handler, and list-bucketed tables are converted to
> external tables.
>  # Tables whose location is not owned by the "hive" user are converted to
> external tables.
>  # Hive-owned ORC format tables are converted to full ACID transactional
> tables.
>  # Hive-owned non-ORC format tables are converted to MM transactional
> tables.
> REPL LOAD should apply these rules during bootstrap and convert the tables
> accordingly.
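The four conversion rules above can be sketched as a small decision helper. This is a hypothetical simplification for illustration only — the TableMeta/BootstrapRules types and the ownership check are invented names, not the actual REPL LOAD implementation:

```java
// Hypothetical sketch of the four bootstrap conversion rules quoted above.
// All class and method names here are illustrative, not actual Hive code.
enum TableKind { EXTERNAL, FULL_ACID, MM }

class TableMeta {
    final boolean avroFormat;
    final boolean storageHandler;
    final boolean listBucketed;
    final boolean orcFormat;
    final String locationOwner;

    TableMeta(boolean avroFormat, boolean storageHandler, boolean listBucketed,
              boolean orcFormat, String locationOwner) {
        this.avroFormat = avroFormat;
        this.storageHandler = storageHandler;
        this.listBucketed = listBucketed;
        this.orcFormat = orcFormat;
        this.locationOwner = locationOwner;
    }
}

class BootstrapRules {
    // Applies the rules in the order listed; earlier rules win.
    static TableKind convert(TableMeta t) {
        // Rule 1: Avro, storage-handler, and list-bucketed tables -> external.
        if (t.avroFormat || t.storageHandler || t.listBucketed) {
            return TableKind.EXTERNAL;
        }
        // Rule 2: location not owned by the "hive" user -> external.
        if (!"hive".equals(t.locationOwner)) {
            return TableKind.EXTERNAL;
        }
        // Rules 3 and 4: Hive-owned ORC -> full ACID, non-ORC -> MM.
        return t.orcFormat ? TableKind.FULL_ACID : TableKind.MM;
    }
}
```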





[jira] [Updated] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20833:
---
Status: Patch Available  (was: Open)

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch, HIVE-20833.6.patch
>
>
> Following test if run with TestMiniLlapLocalCliDriver will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>`auto_sourcing_flag` string,
>`cat_admin_auth_enabled_flag` string,
>`clm_document_number` string,
>`comm_rev_num` double,
>`consigned_consumption_flag` string,
>`consume_req_demand_flag` string,
>`conterms_articles_upd_date` timestamp,
>`conterms_deliv_upd_date` timestamp,
>`conterms_exist_flag` string,
>`cpa_reference` double,
>`created_language` string,
>`email_address` string,
>`enable_all_sites` string,
>`fax` string,
>`lock_owner_role` string,
>`lock_owner_user_id` double,
>`min_release_amount` double,
>`mrc_rate` string,
>`mrc_rate_date` string,
>`mrc_rate_type` string,
>`otm_recovery_flag` string,
>`otm_status_code` string,
>`pay_when_paid` string,
>`pcard_id` bigint,
>`program_update_date` 

[jira] [Updated] (HIVE-20833) package.jdo needs to be updated to conform with HIVE-20221 changes

2018-11-07 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20833:
---
Status: Open  (was: Patch Available)

> package.jdo needs to be updated to conform with HIVE-20221 changes
> --
>
> Key: HIVE-20833
> URL: https://issues.apache.org/jira/browse/HIVE-20833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20833.1.patch, HIVE-20833.2.patch, 
> HIVE-20833.3.patch, HIVE-20833.4.patch, HIVE-20833.5.patch, HIVE-20833.6.patch
>
>
> Following test if run with TestMiniLlapLocalCliDriver will fail:
> {code:sql}
> CREATE TABLE `alterPartTbl`(
>`po_header_id` bigint,
>`vendor_num` string,
>`requester_name` string,
>`approver_name` string,
>`buyer_name` string,
>`preparer_name` string,
>`po_requisition_number` string,
>`po_requisition_id` bigint,
>`po_requisition_desc` string,
>`rate_type` string,
>`rate_date` date,
>`rate` double,
>`blanket_total_amount` double,
>`authorization_status` string,
>`revision_num` bigint,
>`revised_date` date,
>`approved_flag` string,
>`approved_date` timestamp,
>`amount_limit` double,
>`note_to_authorizer` string,
>`note_to_vendor` string,
>`note_to_receiver` string,
>`vendor_order_num` string,
>`comments` string,
>`acceptance_required_flag` string,
>`acceptance_due_date` date,
>`closed_date` timestamp,
>`user_hold_flag` string,
>`approval_required_flag` string,
>`cancel_flag` string,
>`firm_status_lookup_code` string,
>`firm_date` date,
>`frozen_flag` string,
>`closed_code` string,
>`org_id` bigint,
>`reference_num` string,
>`wf_item_type` string,
>`wf_item_key` string,
>`submit_date` date,
>`sap_company_code` string,
>`sap_fiscal_year` bigint,
>`po_number` string,
>`sap_line_item` bigint,
>`closed_status_flag` string,
>`balancing_segment` string,
>`cost_center_segment` string,
>`base_amount_limit` double,
>`base_blanket_total_amount` double,
>`base_open_amount` double,
>`base_ordered_amount` double,
>`cancel_date` timestamp,
>`cbc_accounting_date` date,
>`change_requested_by` string,
>`change_summary` string,
>`confirming_order_flag` string,
>`document_creation_method` string,
>`edi_processed_flag` string,
>`edi_processed_status` string,
>`enabled_flag` string,
>`encumbrance_required_flag` string,
>`end_date` date,
>`end_date_active` date,
>`from_header_id` bigint,
>`from_type_lookup_code` string,
>`global_agreement_flag` string,
>`government_context` string,
>`interface_source_code` string,
>`ledger_currency_code` string,
>`open_amount` double,
>`ordered_amount` double,
>`pay_on_code` string,
>`payment_term_name` string,
>`pending_signature_flag` string,
>`po_revision_num` double,
>`preparer_id` bigint,
>`price_update_tolerance` double,
>`print_count` double,
>`printed_date` date,
>`reply_date` date,
>`reply_method_lookup_code` string,
>`rfq_close_date` date,
>`segment2` string,
>`segment3` string,
>`segment4` string,
>`segment5` string,
>`shipping_control` string,
>`start_date` date,
>`start_date_active` date,
>`summary_flag` string,
>`supply_agreement_flag` string,
>`usd_amount_limit` double,
>`usd_blanket_total_amount` double,
>`usd_exchange_rate` double,
>`usd_open_amount` double,
>`usd_order_amount` double,
>`ussgl_transaction_code` string,
>`xml_flag` string,
>`purchasing_organization_id` bigint,
>`purchasing_group_code` string,
>`last_updated_by_name` string,
>`created_by_name` string,
>`incoterms_1` string,
>`incoterms_2` string,
>`ame_approval_id` double,
>`ame_transaction_type` string,
>`auto_sourcing_flag` string,
>`cat_admin_auth_enabled_flag` string,
>`clm_document_number` string,
>`comm_rev_num` double,
>`consigned_consumption_flag` string,
>`consume_req_demand_flag` string,
>`conterms_articles_upd_date` timestamp,
>`conterms_deliv_upd_date` timestamp,
>`conterms_exist_flag` string,
>`cpa_reference` double,
>`created_language` string,
>`email_address` string,
>`enable_all_sites` string,
>`fax` string,
>`lock_owner_role` string,
>`lock_owner_user_id` double,
>`min_release_amount` double,
>`mrc_rate` string,
>`mrc_rate_date` string,
>`mrc_rate_type` string,
>`otm_recovery_flag` string,
>`otm_status_code` string,
>`pay_when_paid` string,
>`pcard_id` bigint,
>`program_update_date` 

[jira] [Assigned] (HIVE-20884) Support bootstrap of tables to target with hive.strict.managed.tables enabled.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-20884:
---


> Support bootstrap of tables to target with hive.strict.managed.tables enabled.
> --
>
> Key: HIVE-20884
> URL: https://issues.apache.org/jira/browse/HIVE-20884
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR
>
> Hive2 supports replication of managed tables. But in Hive3, some of these
> managed tables are converted to ACID or MM tables. Also, some of them are
> converted to external tables based on the rules below.
>  # Avro format, storage handler, and list-bucketed tables are converted to
> external tables.
>  # Tables whose location is not owned by the "hive" user are converted to
> external tables.
>  # Hive-owned ORC format tables are converted to full ACID transactional
> tables.
>  # Hive-owned non-ORC format tables are converted to MM transactional
> tables.
> REPL LOAD should apply these rules during bootstrap and convert the tables
> accordingly.





[jira] [Assigned] (HIVE-20883) REPL DUMP to dump the default warehouse directory of source.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-20883:
---


> REPL DUMP to dump the default warehouse directory of source.
> 
>
> Key: HIVE-20883
> URL: https://issues.apache.org/jira/browse/HIVE-20883
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Minor
>  Labels: DR
>
> The default warehouse directory of the source is needed by the target to
> detect whether a DB or table location was set by the user or assigned by
> Hive.
> Using this information, REPL LOAD will decide whether to preserve the path
> or move the data to the default managed-table warehouse directory.





[jira] [Updated] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20682:

Attachment: HIVE-20682.06.patch

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in *HiveSessionImpl*
> when we open a new session for a client connection, and by default all
> queries from this connection share the same sessionHive object.
> If the master thread executes a *synchronous* query, it closes the
> sessionHive object (referenced via the thread-local hiveDb) if
> {{Hive.isCompatible}} returns false and sets a new Hive object in the
> thread-local hiveDb, but doesn't change the sessionHive object in the
> session. In contrast, *asynchronous* query execution via async threads
> never closes the sessionHive object; it just creates a new one if needed
> and sets it as their thread-local hiveDb.
> So, the problem can happen when an *asynchronous* query being executed by
> an async thread refers to the sessionHive object while the master thread
> receives a *synchronous* query that closes that same sessionHive object.
> Also, each query execution overwrites the thread-local hiveDb object with
> the sessionHive object, which potentially leaks a metastore connection if
> the previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object can be shared by multiple threads, so it
> shouldn't be closed by any query execution thread when that thread
> re-creates its Hive object due to changes in Hive configuration. But the
> Hive objects created by query execution threads should be closed when the
> thread exits.
> So, it is proposed to add an *isAllowClose* flag (default: *true*) to the
> Hive object, set to *false* for *sessionHive*, which would instead be
> forcefully closed when the session is closed or released.
> Also, when we replace the *sessionHive* object with a new one due to
> changes in *sessionConf*, the old one should be closed once no async
> thread refers to it. This can be done in the "*finalize*" method of the
> Hive object, where we can close the HMS connection when the Hive object is
> garbage collected.
> cc [~pvary]
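The proposed isAllowClose behavior can be sketched roughly as follows. This is a minimal illustration of the idea only, under the assumption that close requests from query threads should be no-ops for the shared handle — HiveHandle and its methods are invented names, not the patch's actual code:

```java
// Minimal sketch of the proposed isAllowClose idea: a shared session-level
// handle ignores close() from query threads; only an explicit forceClose()
// (session close/release) actually closes it. Names are illustrative.
class HiveHandle {
    private final boolean allowClose;
    private boolean closed = false;

    HiveHandle(boolean allowClose) { this.allowClose = allowClose; }

    // Query execution threads call close() when re-creating their Hive
    // object; a shared sessionHive-style handle (allowClose=false) stays
    // open so concurrent async queries keep a valid handle.
    void close() {
        if (allowClose) {
            closed = true;
        }
    }

    // Called only when the session itself is closed or released.
    void forceClose() {
        closed = true;
    }

    boolean isClosed() { return closed; }
}
```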





[jira] [Updated] (HIVE-20682) Async query execution can potentially fail if shared sessionHive is closed by master thread.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20682:

Status: Patch Available  (was: Open)

06.patch fixes the test failure and the findbugs issue.

> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> 
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch, HIVE-20682.05.patch, 
> HIVE-20682.06.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in *HiveSessionImpl*
> when we open a new session for a client connection, and by default all
> queries from this connection share the same sessionHive object.
> If the master thread executes a *synchronous* query, it closes the
> sessionHive object (referenced via the thread-local hiveDb) if
> {{Hive.isCompatible}} returns false and sets a new Hive object in the
> thread-local hiveDb, but doesn't change the sessionHive object in the
> session. In contrast, *asynchronous* query execution via async threads
> never closes the sessionHive object; it just creates a new one if needed
> and sets it as their thread-local hiveDb.
> So, the problem can happen when an *asynchronous* query being executed by
> an async thread refers to the sessionHive object while the master thread
> receives a *synchronous* query that closes that same sessionHive object.
> Also, each query execution overwrites the thread-local hiveDb object with
> the sessionHive object, which potentially leaks a metastore connection if
> the previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object can be shared by multiple threads, so it
> shouldn't be closed by any query execution thread when that thread
> re-creates its Hive object due to changes in Hive configuration. But the
> Hive objects created by query execution threads should be closed when the
> thread exits.
> So, it is proposed to add an *isAllowClose* flag (default: *true*) to the
> Hive object, set to *false* for *sessionHive*, which would instead be
> forcefully closed when the session is closed or released.
> Also, when we replace the *sessionHive* object with a new one due to
> changes in *sessionConf*, the old one should be closed once no async
> thread refers to it. This can be done in the "*finalize*" method of the
> Hive object, where we can close the HMS connection when the Hive object is
> garbage collected.
> cc [~pvary]





[jira] [Assigned] (HIVE-20882) Support Hive replication to a target cluster with hive.strict.managed.tables enabled.

2018-11-07 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-20882:
---


> Support Hive replication to a target cluster with hive.strict.managed.tables 
> enabled.
> -
>
> Key: HIVE-20882
> URL: https://issues.apache.org/jira/browse/HIVE-20882
> Project: Hive
>  Issue Type: New Feature
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR
>
> *Requirements:*
>  - Support Hive replication with Hive2 as master and Hive3 as slave where 
> hive.strict.managed.tables is enabled.
>  - The non-ACID managed tables from Hive2 should be converted to appropriate 
> ACID or MM tables or to an external table based on Hive3 table type rules.





[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678530#comment-16678530
 ] 

Hive QA commented on HIVE-20842:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947155/HIVE-20842.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 40 failed/errored test(s), 15526 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=85)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[except_distinct] 
(batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_all] 
(batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_distinct]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_merge] 
(batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parallel_colstats]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer2]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainanalyze_2]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union2] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_multiinsert]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[unionDistinct_3]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets2]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=190)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query38] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query49] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query58] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query5] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query75] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query77] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query80] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query83] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query87] 
(batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query14]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query38]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query45]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query49]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query58]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query64]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query70]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query75]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query80]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query83]
 (batchId=272)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query87]
 (batchId=272)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14792/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14792/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14792/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 40 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 

[jira] [Updated] (HIVE-20853) Expose ShuffleHandler.registerDag in the llap daemon API

2018-11-07 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20853:
---
Status: Open  (was: Patch Available)

> Expose ShuffleHandler.registerDag in the llap daemon API
> 
>
> Key: HIVE-20853
> URL: https://issues.apache.org/jira/browse/HIVE-20853
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
> Attachments: HIVE-20853.1.patch, HIVE-20853.2.patch, 
> HIVE-20853.3.patch, HIVE-20853.4.patch, HIVE-20853.5.patch
>
>
> Currently, DAGs are only registered when submitWork is called for that
> DAG. At this point the credentials are added to the ShuffleHandler and it
> can start serving.
> However, Tez might (and will) schedule tasks to fetch from the
> ShuffleHandler before any of this happens, and all these tasks will fail,
> which may result in the query failing.
> This happens in the scenario in which a LlapDaemon has just come up and
> Tez fetchers try to open a connection before a DAG has been registered.
> Adding this API will allow registering the DAG with the daemon when the AM
> notices that a new daemon is up.





[jira] [Comment Edited] (HIVE-20825) Hive ACID Merge generates invalid ORC files (bucket files 0 or 3 bytes in length) causing the "Not a valid ORC file" error

2018-11-07 Thread Tom Zeng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678507#comment-16678507
 ] 

Tom Zeng edited comment on HIVE-20825 at 11/7/18 5:01 PM:
--

[~ekoifman] can you try this on Hive 2.3.4? Looks like this is broken on 2.3.x. 
I haven't been able to get Hive 3+ working yet, will update here once I am able 
to run Hive 3.


was (Author: tomzeng):
[~ekoifman] can you try this on Hive 2.3.4? Looks like this is broken on 2.3.x. 
I haven't been able to get Hive 3+ working yet, will update here once I am able 
to run Hive 3_.

> Hive ACID Merge generates invalid ORC files (bucket files 0 or 3 bytes in 
> length) causing the "Not a valid ORC file" error
> --
>
> Key: HIVE-20825
> URL: https://issues.apache.org/jira/browse/HIVE-20825
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 2.3.1, 2.3.2
> Environment: Hive 2.3.x on Amazon EMR 5.8.0 to 5.18.0. Open source 
> build of Hive 2.3.4
>Reporter: Tom Zeng
>Priority: Major
> Attachments: hive-merge-invalid-orc-repro.hql, 
> hive-merge-invalid-orc-repro.log
>
>
> When using Hive ACID Merge (supported with the ORC format) to update or
> insert data, bucket files of 0 or 3 bytes (the 3-byte files contain only
> the three characters "ORC") are generated during MERGE INTO operations,
> which finish with no errors. Subsequent queries on the base table then get
> a "Not a valid ORC file" error.
>  
> The following script can be used to reproduce the issue (note that with a
> small amount of data like this, increasing the number of buckets may make
> the query work, but with a large data set it fails no matter what the
> bucket count is):
> set hive.auto.convert.join=false;
>  set hive.enforce.bucketing=true;
>  set hive.exec.dynamic.partition.mode = nonstrict;
>  set hive.support.concurrency=true;
>  set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> drop table if exists mergedelta_txt_1;
>  drop table if exists mergedelta_txt_2;
> CREATE TABLE mergedelta_txt_1 (
>  id_str varchar(12), time_key int, value bigint)
>  PARTITIONED BY (date_key int)
>  ROW FORMAT DELIMITED
>  STORED AS TEXTFILE;
> CREATE TABLE mergedelta_txt_2 (
>  id_str varchar(12), time_key int, value bigint)
>  PARTITIONED BY (date_key int)
>  ROW FORMAT DELIMITED
>  STORED AS TEXTFILE;
> INSERT INTO TABLE mergedelta_txt_1
>  partition(date_key=20170103)
>  VALUES
>  ("AB94LIENR0",46700,12345676836978),
>  ("AB94LIENR1",46825,12345676836978),
>  ("AB94LIENS0",46709,12345676836978),
>  ("AB94LIENS1",46834,12345676836978),
>  ("AB94LIENT0",46709,12345676836978),
>  ("AB94LIENT1",46834,12345676836978),
>  ("AB94LIENU0",46718,12345676836978),
>  ("AB94LIENU1",46844,12345676836978),
>  ("AB94LIENV0",46719,12345676836978),
>  ("AB94LIENV1",46844,12345676836978),
>  ("AB94LIENW0",46728,12345676836978),
>  ("AB94LIENW1",46854,12345676836978),
>  ("AB94LIENX0",46728,12345676836978),
>  ("AB94LIENX1",46854,12345676836978),
>  ("AB94LIENY0",46737,12345676836978),
>  ("AB94LIENY1",46863,12345676836978),
>  ("AB94LIENZ0",46738,12345676836978),
>  ("AB94LIENZ1",46863,12345676836978),
>  ("AB94LIERA0",47176,12345676836982),
>  ("AB94LIERA1",47302,12345676836982);
> INSERT INTO TABLE mergedelta_txt_2
>  partition(date_key=20170103)
>  VALUES 
>  ("AB94LIENT1",46834,12345676836978),
>  ("AB94LIENU0",46718,12345676836978),
>  ("AB94LIENU1",46844,12345676836978),
>  ("AB94LIENV0",46719,12345676836978),
>  ("AB94LIENV1",46844,12345676836978),
>  ("AB94LIENW0",46728,12345676836978),
>  ("AB94LIENW1",46854,12345676836978),
>  ("AB94LIENX0",46728,12345676836978),
>  ("AB94LIENX1",46854,12345676836978),
>  ("AB94LIENY0",46737,12345676836978),
>  ("AB94LIENY1",46863,12345676836978),
>  ("AB94LIENZ0",46738,12345676836978),
>  ("AB94LIENZ1",46863,12345676836978),
>  ("AB94LIERA0",47176,12345676836982),
>  ("AB94LIERA1",47302,12345676836982),
>  ("AB94LIERA2",47418,12345676836982),
>  ("AB94LIERB0",47176,12345676836982),
>  ("AB94LIERB1",47302,12345676836982),
>  ("AB94LIERB2",47418,12345676836982),
>  ("AB94LIERC0",47185,12345676836982);
> DROP TABLE IF EXISTS mergebase_1;
>  CREATE TABLE mergebase_1 (
>  id_str varchar(12) , time_key int , value bigint)
>  PARTITIONED BY (date_key int)
>  CLUSTERED BY (id_str,time_key) INTO 4 BUCKETS
>  STORED AS ORC
>  TBLPROPERTIES (
>  'orc.compress'='SNAPPY',
>  'pk_columns'='id_str,date_key,time_key',
>  'NO_AUTO_COMPACTION'='true',
>  'transactional'='true');
> MERGE INTO mergebase_1 AS base
>  USING (SELECT * 
>  FROM (
>  SELECT id_str ,time_key ,value, date_key, rank() OVER (PARTITION BY 
> id_str,date_key,time_key ORDER BY id_str,date_key,time_key) AS rk 
>  FROM mergedelta_txt_1
>  
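The degenerate bucket files described in this report are easy to spot on disk: they are either empty or contain only the three-byte ORC magic. A minimal sketch of such a check (the function name and file paths are illustrative, not part of Hive; a real validity check would also parse the ORC postscript and footer):

```python
import os

ORC_MAGIC = b"ORC"  # every valid ORC file starts with this magic

def is_suspect_bucket_file(path):
    """Flag the 0-byte / 3-byte bucket files described in HIVE-20825."""
    size = os.path.getsize(path)
    if size == 0:
        return True                       # empty bucket file
    if size == len(ORC_MAGIC):
        with open(path, "rb") as f:
            return f.read() == ORC_MAGIC  # file holding only "ORC"
    return False                          # large enough to be a real file
```

Running this over the table's bucket directories after a MERGE would surface the invalid files before a reader hits the "Not a valid ORC file" error.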

[jira] [Comment Edited] (HIVE-20825) Hive ACID Merge generates invalid ORC files (bucket files 0 or 3 bytes in length) causing the "Not a valid ORC file" error

2018-11-07 Thread Tom Zeng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678507#comment-16678507
 ] 

Tom Zeng edited comment on HIVE-20825 at 11/7/18 5:02 PM:
--

[~ekoifman] can you try this on Hive 2.3.4? Looks like this is broken on 2.3.x. 
I haven't been able to get Hive 3+ working yet, will update here once I am able 
to run Hive 3.  Hive Merge works well on Hive 2.2.1.


was (Author: tomzeng):
[~ekoifman] can you try this on Hive 2.3.4? Looks like this is broken on 2.3.x. 
I haven't been able to get Hive 3+ working yet, will update here once I am able 
to run Hive 3.


[jira] [Updated] (HIVE-20825) Hive ACID Merge generates invalid ORC files (bucket files 0 or 3 bytes in length) causing the "Not a valid ORC file" error

2018-11-07 Thread Tom Zeng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Zeng updated HIVE-20825:

Affects Version/s: 2.3.4


[jira] [Commented] (HIVE-20807) Refactor LlapStatusServiceDriver

2018-11-07 Thread Miklos Gergely (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678508#comment-16678508
 ] 

Miklos Gergely commented on HIVE-20807:
---

[~sershe] could you please review the changes?

> Refactor LlapStatusServiceDriver
> 
>
> Key: HIVE-20807
> URL: https://issues.apache.org/jira/browse/HIVE-20807
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20807.01.patch, HIVE-20807.02.patch, 
> HIVE-20807.03.patch, HIVE-20807.04.patch, HIVE-20807.05.patch, 
> HIVE-20807.06.patch
>
>
> LlapStatusServiceDriver is the class used to determine if LLAP has started. 
> The following problems should be solved by refactoring:
> 1. The main class is more than 800 lines long; it should be split into 
> multiple smaller classes.
> 2. The current design makes it extremely hard to write unit tests.
> 3. There are some overcomplicated, over-engineered parts of the code.
> 4. Most of the code is under org.apache.hadoop.hive.llap.cli, but some parts 
> are under org.apache.hadoop.hive.llap.cli.status. The whole program could be 
> moved to the latter.
> 5. LlapStatusHelpers serves as a class for holding classes, which doesn't 
> make much sense.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20825) Hive ACID Merge generates invalid ORC files (bucket files 0 or 3 bytes in length) causing the "Not a valid ORC file" error

2018-11-07 Thread Tom Zeng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678507#comment-16678507
 ] 

Tom Zeng commented on HIVE-20825:
-

[~ekoifman] can you try this on Hive 2.3.4? Looks like this is broken on 2.3.x. 
I haven't been able to get Hive 3+ working yet, will update here once I am able 
to run Hive 3.


[jira] [Updated] (HIVE-20825) Hive ACID Merge generates invalid ORC files (bucket files 0 or 3 bytes in length) causing the "Not a valid ORC file" error

2018-11-07 Thread Tom Zeng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Zeng updated HIVE-20825:

Affects Version/s: (was: 2.2.0)
  Environment: Hive 2.3.x on Amazon EMR 5.8.0 to 5.18.0. Open source 
build of Hive 2.3.4  (was: Hive 2.3.x on Amazon EMR 5.8.0 to 5.18.0)


[jira] [Commented] (HIVE-20842) Fix logic introduced in HIVE-20660 to estimate statistics for group by

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678478#comment-16678478
 ] 

Hive QA commented on HIVE-20842:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
38s{color} | {color:blue} ql in master has 2315 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
36s{color} | {color:red} ql: The patch generated 3 new + 29 unchanged - 0 fixed 
= 32 total (was 29) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14792/dev-support/hive-personality.sh
 |
| git revision | master / 6d713b6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14792/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14792/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix logic introduced in HIVE-20660 to estimate statistics for group by
> --
>
> Key: HIVE-20842
> URL: https://issues.apache.org/jira/browse/HIVE-20842
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20842.1.patch, HIVE-20842.2.patch, 
> HIVE-20842.3.patch, HIVE-20842.4.patch, HIVE-20842.5.patch, HIVE-20842.6.patch
>
>
> HIVE-20660 introduced better estimation for group by operator. But the logic 
> did not account for Partial and Full group by separately.
> For partial group by parallelism (i.e. number of tasks) should be taken into 
> account.
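The distinction above can be sketched as a cardinality bound: a full group-by emits at most one row per distinct grouping key, while a partial (per-task) group-by may emit each key once per task, so parallelism scales the bound. This is only an illustration of the idea, not Hive's actual estimator:

```python
def estimate_group_by_output(input_rows, ndv, num_tasks, partial):
    """Upper-bound the row count coming out of a group-by operator.

    ndv is the number of distinct grouping-key values (from column stats).
    """
    if not partial:
        # Full group-by: at most one output row per distinct key.
        return min(input_rows, ndv)
    # Partial group-by: each of the num_tasks tasks may emit every
    # distinct key it sees, so the number of tasks multiplies the bound.
    return min(input_rows, ndv * num_tasks)
```

Ignoring the `num_tasks` factor for a partial group-by (as the pre-patch logic effectively did) underestimates its output, which is what this issue corrects.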



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20360) QTest: ignore driver/qtest exclusions if -Dqfile param is set

2018-11-07 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-20360:

Attachment: HIVE-20360.04.patch

> QTest: ignore driver/qtest exclusions if -Dqfile param is set
> -
>
> Key: HIVE-20360
> URL: https://issues.apache.org/jira/browse/HIVE-20360
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-20360.01.patch, HIVE-20360.02.patch, 
> HIVE-20360.03.patch, HIVE-20360.04.patch
>
>
> Sometimes I need to run qtests with another driver for testing purposes. In 
> this case I have to edit testconfiguration.properties which seems a bit 
> hacky, even if it's temporary.
> In this case, no tests will run (though there is a log message):
> {code:java}
> mvn test -Pitests -pl itests/qtest -pl itests/util -Dtest=TestCliDriver 
> -Dqfile=bucketizedhiveinputformat.q
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18945) Support "analyze table T"

2018-11-07 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678438#comment-16678438
 ] 

Laszlo Bodor commented on HIVE-18945:
-

[~ashutoshc], [~kgyrtkirk]: I have been looking into this issue a bit, but it 
could be abandoned in this case.

I'm unassigning it now.

> Support "analyze table T"
> -
>
> Key: HIVE-18945
> URL: https://issues.apache.org/jira/browse/HIVE-18945
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Laszlo Bodor
>Priority: Major
>
> I think it would be good to have it behave the same as 
> {code}
> analyze table T compute statistics for columns
> {code}
> this could help people who do not yet know the different analyze commands to 
> run the most appropriate one



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18945) Support "analyze table T"

2018-11-07 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor reassigned HIVE-18945:
---

Assignee: (was: Laszlo Bodor)




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20744) Use SQL constraints to improve join reordering algorithm

2018-11-07 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678440#comment-16678440
 ] 

Hive QA commented on HIVE-20744:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12947157/HIVE-20744.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15526 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=85)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14791/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14791/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12947157 - PreCommit-HIVE-Build

> Use SQL constraints to improve join reordering algorithm
> 
>
> Key: HIVE-20744
> URL: https://issues.apache.org/jira/browse/HIVE-20744
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20744.01.patch, HIVE-20744.02.patch, 
> HIVE-20744.03.patch, HIVE-20744.04.patch, HIVE-20744.05.patch, 
> HIVE-20744.patch
>
>
> Until now, join reordering was based solely on statistics stored for the base
> tables and their columns. Now the optimizer can also rely on constraints.
> Hence, this patch makes the join reordering costing use constraints and, if it
> does not find any, fall back to the old code path.
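> For context, a minimal sketch of how constraints the optimizer may rely on are
> declared in Hive (standard informational-constraint DDL; that this patch
> consumes them in exactly this form is an assumption):
> {code}
> create table dim (id int, name string);
> create table fact (id int, dim_id int);
> -- DISABLE NOVALIDATE: the constraint is informational, not enforced.
> -- RELY: the optimizer may trust it when planning, e.g. for join reordering.
> alter table dim add constraint dim_pk
>   primary key (id) disable novalidate rely;
> alter table fact add constraint fact_fk
>   foreign key (dim_id) references dim (id) disable novalidate rely;
> {code}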



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20033) Backport HIVE-19432 to branch-2, branch-3

2018-11-07 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor reassigned HIVE-20033:
---

Assignee: (was: Laszlo Bodor)

> Backport HIVE-19432 to branch-2, branch-3
> -
>
> Key: HIVE-20033
> URL: https://issues.apache.org/jira/browse/HIVE-20033
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20033.02-branch-3.patch, 
> HIVE-20033.03-branch-3.patch, HIVE-20033.1.branch-2.patch, 
> HIVE-20033.1.branch-3.patch
>
>
> Backport HIVE-19432 to branch-2, branch-3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20881) Constant propagation oversimplifies projections

2018-11-07 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-20881:
---


> Constant propagation oversimplifies projections
> ---
>
> Key: HIVE-20881
> URL: https://issues.apache.org/jira/browse/HIVE-20881
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> {code:java}
> create table cx2(bool1 boolean);
> insert into cx2 values (true),(false),(null);
> set hive.cbo.enable=true;
> select bool1 IS TRUE OR (cast(NULL as boolean) AND bool1 IS NOT TRUE AND 
> bool1 IS NOT FALSE) from cx2;
> ++
> |  _c0   |
> ++
> | true   |
> | false  |
> | NULL   |
> ++
> set hive.cbo.enable=false;
> select bool1 IS TRUE OR (cast(NULL as boolean) AND bool1 IS NOT TRUE AND 
> bool1 IS NOT FALSE) from cx2;
> +---+
> |  _c0  |
> +---+
> | true  |
> | NULL  |
> | NULL  |
> +---+
> {code}
> From the EXPLAIN output it seems the expression was simplified to: {{(_col0 is
> true or null)}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20880) Update default value for hive.stats.filter.in.min.ratio

2018-11-07 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-20880:

Status: Patch Available  (was: Open)

> Update default value for hive.stats.filter.in.min.ratio
> ---
>
> Key: HIVE-20880
> URL: https://issues.apache.org/jira/browse/HIVE-20880
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-20880.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

