[jira] [Commented] (HIVE-20213) Upgrade Calcite to 1.17.0

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559300#comment-16559300
 ] 

Hive QA commented on HIVE-20213:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933277/HIVE-20213.03.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14813 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12889/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12889/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12889/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933277 - PreCommit-HIVE-Build

> Upgrade Calcite to 1.17.0
> -
>
> Key: HIVE-20213
> URL: https://issues.apache.org/jira/browse/HIVE-20213
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20213.01.patch, HIVE-20213.02.patch, 
> HIVE-20213.03.patch, HIVE-20213.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19829) Incremental replication load should create tasks in execution phase rather than semantic phase

2018-07-26 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559275#comment-16559275
 ] 

mahesh kumar behera edited comment on HIVE-19829 at 7/27/18 5:30 AM:
-

[~sankarh]

The failures are the same as in older builds.

Two new failures (mm_all) are caused by the change made in a previous commit -- 
HIVE-20182: Backport HIVE-20067 to branch-3 (Daniel Voros via Zoltan Haindrich) 
-- b429edbdfef3e6d609f0f3e226240b9be94797bf

{code}
diff --git a/ql/src/test/queries/clientpositive/mm_all.q 
b/ql/src/test/queries/clientpositive/mm_all.q
index 61dd3e7475..a524c29ef5 100644
--- a/ql/src/test/queries/clientpositive/mm_all.q
+++ b/ql/src/test/queries/clientpositive/mm_all.q
@@ -3,6 +3,7 @@

 -- MASK_LINEAGE

+set hive.metastore.dml.events=true;

{code}

Please check whether the patch HIVE-19829.12-branch-3.patch is fine to check in 
to branch-3.




was (Author: maheshk114):
[~sankarh]

The failures are the same as in older builds.

Two new failures (mm_all) are caused by the change made in a previous commit -- 
HIVE-20182: Backport HIVE-20067 to branch-3 (Daniel Voros via Zoltan Haindrich) 
-- b429edbdfef3e6d609f0f3e226240b9be94797bf

diff --git a/ql/src/test/queries/clientpositive/mm_all.q 
b/ql/src/test/queries/clientpositive/mm_all.q
index 61dd3e7475..a524c29ef5 100644
--- a/ql/src/test/queries/clientpositive/mm_all.q
+++ b/ql/src/test/queries/clientpositive/mm_all.q
@@ -3,6 +3,7 @@

 -- MASK_LINEAGE

+set hive.metastore.dml.events=true;



Please check whether the patch HIVE-19829.12-branch-3.patch is fine to check in 
to branch-3.



> Incremental replication load should create tasks in execution phase rather 
> than semantic phase
> --
>
> Key: HIVE-19829
> URL: https://issues.apache.org/jira/browse/HIVE-19829
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19829.01.patch, HIVE-19829.02.patch, 
> HIVE-19829.03.patch, HIVE-19829.04.patch, HIVE-19829.06.patch, 
> HIVE-19829.07.patch, HIVE-19829.07.patch, HIVE-19829.08-branch-3.patch, 
> HIVE-19829.08.patch, HIVE-19829.09.patch, HIVE-19829.10-branch-3.patch, 
> HIVE-19829.10.patch, HIVE-19829.11-branch-3.patch, 
> HIVE-19829.12-branch-3.patch
>
>
> Split the incremental load into multiple iterations. In each iteration create 
> number of tasks equal to the configured value.
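
As a rough illustration of the batching described above, here is a minimal Java 
sketch; the names (loadEventBatches, maxTasksPerIteration) are hypothetical and 
not Hive's actual API:

{code:java}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: build at most maxTasksPerIteration tasks per
// execution-phase iteration, instead of materializing every task up front
// during semantic analysis.
class IncrementalLoadBatchingSketch {
  static void loadEventBatches(Iterator<Runnable> replEvents, int maxTasksPerIteration) {
    while (replEvents.hasNext()) {
      List<Runnable> batch = new ArrayList<>();
      while (replEvents.hasNext() && batch.size() < maxTasksPerIteration) {
        batch.add(replEvents.next()); // plan one bounded iteration of tasks
      }
      batch.forEach(Runnable::run);   // execute it, then plan the next batch
    }
  }
}
{code}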



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19829) Incremental replication load should create tasks in execution phase rather than semantic phase

2018-07-26 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559275#comment-16559275
 ] 

mahesh kumar behera commented on HIVE-19829:


[~sankarh]

The failures are the same as in older builds.

Two new failures (mm_all) are caused by the change made in a previous commit -- 
HIVE-20182: Backport HIVE-20067 to branch-3 (Daniel Voros via Zoltan Haindrich) 
-- b429edbdfef3e6d609f0f3e226240b9be94797bf

Please check whether the patch HIVE-19829.12-branch-3.patch is fine to check in 
to branch-3.



> Incremental replication load should create tasks in execution phase rather 
> than semantic phase
> --
>
> Key: HIVE-19829
> URL: https://issues.apache.org/jira/browse/HIVE-19829
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19829.01.patch, HIVE-19829.02.patch, 
> HIVE-19829.03.patch, HIVE-19829.04.patch, HIVE-19829.06.patch, 
> HIVE-19829.07.patch, HIVE-19829.07.patch, HIVE-19829.08-branch-3.patch, 
> HIVE-19829.08.patch, HIVE-19829.09.patch, HIVE-19829.10-branch-3.patch, 
> HIVE-19829.10.patch, HIVE-19829.11-branch-3.patch, 
> HIVE-19829.12-branch-3.patch
>
>
> Split the incremental load into multiple iterations. In each iteration create 
> number of tasks equal to the configured value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20244) forward port HIVE-19704 to master

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559274#comment-16559274
 ] 

Hive QA commented on HIVE-20244:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 65m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  5m 
31s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
25s{color} | {color:red} branch/storage-api cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
33s{color} | {color:red} branch/llap-server cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  8m  
0s{color} | {color:red} branch/ql cannot run setBugDatabaseInfo from findbugs 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
28s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} storage-api: The patch generated 0 new + 1 unchanged 
- 1 fixed = 1 total (was 2) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
19s{color} | {color:red} llap-server: The patch generated 11 new + 234 
unchanged - 7 fixed = 245 total (was 241) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
48s{color} | {color:red} ql: The patch generated 3 new + 162 unchanged - 0 
fixed = 165 total (was 162) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
32s{color} | {color:red} patch/storage-api cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
50s{color} | {color:red} patch/llap-server cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  6m 
41s{color} | {color:red} patch/ql cannot run setBugDatabaseInfo from findbugs 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
18s{color} | {color:red} storage-api generated 2 new + 28 unchanged - 0 fixed = 
30 total (was 28) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}108m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12888/dev-support/hive-personality.sh
 |
| git revision | master / 1ad4882 |
| Default Java | 1.8.0_111 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12888/yetus/branch-findbugs-storage-api.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12888/yetus/branch-findbugs-llap-server.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12888/yetus/branch-findbugs-ql.txt
 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12888/yetus/patch-mvninstall-llap-server.txt
 |
| 

[jira] [Updated] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2018-07-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-16886:
--
Labels: TODOC3.0 pull-request-available  (was: TODOC3.0)

> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Affects Versions: 3.0.0, 2.3.2, 2.3.3
>Reporter: Sergio Peña
>Assignee: anishek
>Priority: Major
>  Labels: TODOC3.0, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-16886.1.patch, HIVE-16886.2.patch, 
> HIVE-16886.3.patch, HIVE-16886.4.patch, HIVE-16886.5.patch, 
> HIVE-16886.6.patch, HIVE-16886.7.patch, HIVE-16886.8.patch, 
> datastore-identity-holes.diff
>
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
>   public void testConcurrentAddNotifications() throws ExecutionException, 
> InterruptedException {
> final int NUM_THREADS = 2;
> CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
> CountDownLatch countOut = new CountDownLatch(1);
> HiveConf conf = new HiveConf();
> conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, 
> MockPartitionExpressionProxy.class.getName());
> ExecutorService executorService = 
> Executors.newFixedThreadPool(NUM_THREADS);
> FutureTask tasks[] = new FutureTask[NUM_THREADS];
> for (int i = 0; i < NUM_THREADS; ++i) {
>   final int n = i;
>   tasks[i] = new FutureTask<Void>(new Callable<Void>() {
> @Override
> public Void call() throws Exception {
>   ObjectStore store = new ObjectStore();
>   store.setConf(conf);
>   NotificationEvent dbEvent =
>   new NotificationEvent(0, 0, 
> EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>   System.out.println("ADDING NOTIFICATION");
>   countIn.countDown();
>   countOut.await();
>   store.addNotificationEvent(dbEvent);
>   System.out.println("FINISH NOTIFICATION");
>   return null;
> }
>   });
>   executorService.execute(tasks[i]);
> }
> countIn.await();
> countOut.countDown();
> for (int i = 0; i < NUM_THREADS; ++i) {
>   tasks[i].get();
> }
> NotificationEventResponse eventResponse = 
> objectStore.getNextNotification(new NotificationEventRequest());
> Assert.assertEquals(2, eventResponse.getEventsSize());
> Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
> // This fails because the next notification has an event ID = 1
> Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
>   }
> {noformat}
> The last assertion fails expecting an event ID 1 instead of 2. 
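
The race is reproducible outside Hive too; here is a minimal standalone Java 
sketch of the same read-increment-write pattern (illustrative names only, none 
of this is HMS code):

{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Standalone sketch (not Hive code): the unsynchronized read-increment-write
// pattern the description blames. Both workers read the same id, so both
// "notifications" end up with the same event ID.
public class EventIdRaceSketch {
  private static long nextEventId = 1; // stands in for NOTIFICATION_SEQUENCE

  public static void main(String[] args) throws Exception {
    final int THREADS = 2;
    CountDownLatch go = new CountDownLatch(1);
    Set<Long> issued = ConcurrentHashMap.newKeySet();
    ExecutorService pool = Executors.newFixedThreadPool(THREADS);
    for (int i = 0; i < THREADS; i++) {
      pool.submit(() -> {
        go.await();              // line both threads up, like the test's latches
        long id = nextEventId;   // 1. fetch the current id from the "datastore"
        Thread.sleep(10);        // widen the race window
        nextEventId = id + 1;    // 2. increment and write it back
        issued.add(id);          // 3. use it for the new notification
        return null;
      });
    }
    go.countDown();
    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.SECONDS);
    // Without a datastore-side lock this typically prints 1, not 2.
    System.out.println("distinct event ids issued: " + issued.size());
  }
}
{code}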



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19485) dump directory for non native tables should not be created

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559267#comment-16559267
 ] 

ASF GitHub Bot commented on HIVE-19485:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/350


> dump directory for non native tables should not be created
> --
>
> Key: HIVE-19485
> URL: https://issues.apache.org/jira/browse/HIVE-19485
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19485.0.patch, HIVE-19485.02-branch-3.patch, 
> HIVE-19485.03-branch-3.patch, HIVE-19485.04-branch-3.patch, 
> HIVE-19485.1.patch, HIVE-19485.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17410) repl load task during subsequent DAG generation does not start from the last partition processed

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559262#comment-16559262
 ] 

ASF GitHub Bot commented on HIVE-17410:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/240


> repl load task during subsequent DAG generation does not start from the last 
> partition processed
> 
>
> Key: HIVE-17410
> URL: https://issues.apache.org/jira/browse/HIVE-17410
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17410.1.patch, HIVE-17410.2.patch, 
> HIVE-17410.3.patch
>
>
> The DAG for the repl load task was to be generated dynamically, so that if 
> the load breaks during a partition load, subsequent runs start after the last 
> partition processed.
> We currently identify the point from which to process the event, but 
> reinitialize the iterator to start from the beginning of all partitions.
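
A minimal sketch of the intended resume behavior (hypothetical helper, not the 
actual patch):

{code:java}
import java.util.List;

// Resume from the checkpoint instead of reinitializing the iterator:
// skip everything up to and including the last partition processed.
class ResumeFromLastPartitionSketch {
  static void loadPartitions(List<String> partitions, String lastProcessed) {
    int start = (lastProcessed == null) ? 0 : partitions.indexOf(lastProcessed) + 1;
    for (int i = start; i < partitions.size(); i++) {
      System.out.println("loading partition " + partitions.get(i));
    }
  }
}
{code}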



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17410) repl load task during subsequent DAG generation does not start from the last partition processed

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559264#comment-16559264
 ] 

ASF GitHub Bot commented on HIVE-17410:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/240


> repl load task during subsequent DAG generation does not start from the last 
> partition processed
> 
>
> Key: HIVE-17410
> URL: https://issues.apache.org/jira/browse/HIVE-17410
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17410.1.patch, HIVE-17410.2.patch, 
> HIVE-17410.3.patch
>
>
> The DAG for the repl load task was to be generated dynamically, so that if 
> the load breaks during a partition load, subsequent runs start after the last 
> partition processed.
> We currently identify the point from which to process the event, but 
> reinitialize the iterator to start from the beginning of all partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17410) repl load task during subsequent DAG generation does not start from the last partition processed

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559263#comment-16559263
 ] 

ASF GitHub Bot commented on HIVE-17410:
---

GitHub user anishek reopened a pull request:

https://github.com/apache/hive/pull/240

HIVE-17410 : repl load task during subsequent DAG generation does not start 
from the last partition processed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anishek/hive HIVE-17410

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #240


commit a0f95051ff16e07272fd1b8cd3f12d386535c0ef
Author: Anishek Agarwal 
Date:   2017-08-30T00:03:39Z

HIVE-17410 : repl load task during subsequent DAG generation does not start 
from the last partition processed

commit d4dcadcb48b727e291a7cf43bc3380f40264e4d3
Author: Anishek Agarwal 
Date:   2017-09-08T05:54:09Z

HIVE-17410 : repl load task during subsequent DAG generation does not start 
from the last partition processed

setting up the replicationState Correctly.




> repl load task during subsequent DAG generation does not start from the last 
> partition processed
> 
>
> Key: HIVE-17410
> URL: https://issues.apache.org/jira/browse/HIVE-17410
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17410.1.patch, HIVE-17410.2.patch, 
> HIVE-17410.3.patch
>
>
> The DAG for the repl load task was to be generated dynamically, so that if 
> the load breaks during a partition load, subsequent runs start after the last 
> partition processed.
> We currently identify the point from which to process the event, but 
> reinitialize the iterator to start from the beginning of all partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559265#comment-16559265
 ] 

ASF GitHub Bot commented on HIVE-16886:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/237


> HMS log notifications may have duplicated event IDs if multiple HMS are 
> running concurrently
> 
>
> Key: HIVE-16886
> URL: https://issues.apache.org/jira/browse/HIVE-16886
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Affects Versions: 3.0.0, 2.3.2, 2.3.3
>Reporter: Sergio Peña
>Assignee: anishek
>Priority: Major
>  Labels: TODOC3.0, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-16886.1.patch, HIVE-16886.2.patch, 
> HIVE-16886.3.patch, HIVE-16886.4.patch, HIVE-16886.5.patch, 
> HIVE-16886.6.patch, HIVE-16886.7.patch, HIVE-16886.8.patch, 
> datastore-identity-holes.diff
>
>
> When running multiple Hive Metastore servers and DB notifications are 
> enabled, I could see that notifications can be persisted with a duplicated 
> event ID. 
> This does not happen when running multiple threads in a single HMS node due 
> to the locking acquired on the DbNotificationsLog class, but multiple HMS 
> could cause conflicts.
> The issue is in the ObjectStore#addNotificationEvent() method. The event ID 
> fetched from the datastore is used for the new notification, incremented in 
> the server itself, then persisted or updated back to the datastore. If 2 
> servers read the same ID, then these 2 servers write a new notification with 
> the same ID.
> The event ID is neither unique nor a primary key.
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
>   public void testConcurrentAddNotifications() throws ExecutionException, 
> InterruptedException {
> final int NUM_THREADS = 2;
> CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
> CountDownLatch countOut = new CountDownLatch(1);
> HiveConf conf = new HiveConf();
> conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS, 
> MockPartitionExpressionProxy.class.getName());
> ExecutorService executorService = 
> Executors.newFixedThreadPool(NUM_THREADS);
> FutureTask tasks[] = new FutureTask[NUM_THREADS];
> for (int i = 0; i < NUM_THREADS; ++i) {
>   final int n = i;
>   tasks[i] = new FutureTask<Void>(new Callable<Void>() {
> @Override
> public Void call() throws Exception {
>   ObjectStore store = new ObjectStore();
>   store.setConf(conf);
>   NotificationEvent dbEvent =
>   new NotificationEvent(0, 0, 
> EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>   System.out.println("ADDING NOTIFICATION");
>   countIn.countDown();
>   countOut.await();
>   store.addNotificationEvent(dbEvent);
>   System.out.println("FINISH NOTIFICATION");
>   return null;
> }
>   });
>   executorService.execute(tasks[i]);
> }
> countIn.await();
> countOut.countDown();
> for (int i = 0; i < NUM_THREADS; ++i) {
>   tasks[i].get();
> }
> NotificationEventResponse eventResponse = 
> objectStore.getNextNotification(new NotificationEventRequest());
> Assert.assertEquals(2, eventResponse.getEventsSize());
> Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
> // This fails because the next notification has an event ID = 1
> Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
>   }
> {noformat}
> The last assertion fails expecting an event ID 1 instead of 2. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17410) repl load task during subsequent DAG generation does not start from the last partition processed

2018-07-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17410:
--
Labels: pull-request-available  (was: )

> repl load task during subsequent DAG generation does not start from the last 
> partition processed
> 
>
> Key: HIVE-17410
> URL: https://issues.apache.org/jira/browse/HIVE-17410
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17410.1.patch, HIVE-17410.2.patch, 
> HIVE-17410.3.patch
>
>
> The DAG for the repl load task was to be generated dynamically, so that if 
> the load breaks during a partition load, subsequent runs start after the last 
> partition processed.
> We currently identify the point from which to process the event, but 
> reinitialize the iterator to start from the beginning of all partitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559266#comment-16559266
 ] 

ASF GitHub Bot commented on HIVE-17825:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/262


> Socket not closed when trying to read files to copy over in replication from 
> metadata
> -
>
> Key: HIVE-17825
> URL: https://issues.apache.org/jira/browse/HIVE-17825
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17825.0.patch
>
>
> For replication we create a _files file in HDFS which lists the source files 
> to be copied over for a table/partition. _files is read in ReplCopyTask to 
> determine which files to copy. The file operations w.r.t. _files are not 
> correct and we leave the streams open, which leads to a lot of CLOSE_WAIT 
> connections to the source data nodes from HS2 on the replica cluster.
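
The generic fix pattern would be try-with-resources around the _files read; 
this sketch is an assumption about the shape of the fix, not code from the 
patch:

{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class ReplCopySketch {
  static void readFileList(FileSystem fs, Path filesList) throws IOException {
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(fs.open(filesList), StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println("to copy: " + line); // each line names a source file
      }
    } // the stream (and the underlying DN socket) is closed even on exceptions
  }
}
{code}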



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559255#comment-16559255
 ] 

ASF GitHub Bot commented on HIVE-18341:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/289


> Add repl load support for adding "raw" namespace for TDE with same encryption 
> keys
> --
>
> Key: HIVE-18341
> URL: https://issues.apache.org/jira/browse/HIVE-18341
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18341.0.patch, HIVE-18341.1.patch, 
> HIVE-18341.2.patch, HIVE-18341.3.patch
>
>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser
> "a new virtual path prefix, /.reserved/raw/, that gives superusers direct 
> access to the underlying block data in the filesystem. This allows superusers 
> to distcp data without needing having access to encryption keys, and also 
> avoids the overhead of decrypting and re-encrypting data."
> We need to introduce a new option in the "Repl Load" command that will prefix 
> the paths of the files copied via distcp with this "/.reserved/raw/" 
> namespace.
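
A tiny illustration of the prefixing (the helper is hypothetical):

{code:java}
// Rewrite a source path into the raw namespace before handing it to distcp,
// e.g. "/warehouse/db/t" -> "/.reserved/raw/warehouse/db/t".
class RawNamespaceSketch {
  static String toRawPath(String hdfsPath) {
    return "/.reserved/raw" + (hdfsPath.startsWith("/") ? hdfsPath : "/" + hdfsPath);
  }
}
{code}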



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17829) ArrayIndexOutOfBoundsException - HBASE-backed tables with Avro schema in Hive2

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559257#comment-16559257
 ] 

ASF GitHub Bot commented on HIVE-17829:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/283


> ArrayIndexOutOfBoundsException - HBASE-backed tables with Avro schema in Hive2
> --
>
> Key: HIVE-17829
> URL: https://issues.apache.org/jira/browse/HIVE-17829
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.1.0
>Reporter: Chiran Ravani
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17829.0.patch, HIVE-17829.1.patch
>
>
> Stack
> {code}
> 2017-10-09T09:39:54,804 ERROR [HiveServer2-Background-Pool: Thread-95]: 
> metadata.Table (Table.java:getColsInternal(642)) - Unable to get field from 
> serde: org.apache.hadoop.hive.hbase.HBaseSerDe
> java.lang.ArrayIndexOutOfBoundsException: 1
> at java.util.Arrays$ArrayList.get(Arrays.java:3841) ~[?:1.8.0_77]
> at 
> org.apache.hadoop.hive.serde2.BaseStructObjectInspector.init(BaseStructObjectInspector.java:104)
>  ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.init(LazySimpleStructObjectInspector.java:97)
>  ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.(LazySimpleStructObjectInspector.java:77)
>  ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyObjectInspectorFactory.getLazySimpleStructObjectInspector(LazyObjectInspectorFactory.java:115)
>  ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.hbase.HBaseLazyObjectFactory.createLazyHBaseStructInspector(HBaseLazyObjectFactory.java:79)
>  ~[hive-hbase-handler-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:127) 
> ~[hive-hbase-handler-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:54) 
> ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:531) 
> ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:424)
>  ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:411)
>  ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:279)
>  ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:261) 
> ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:639) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:622) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:833) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4228) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:347) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123) 
> [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
> at 

[jira] [Commented] (HIVE-17615) Task.executeTask has to be thread safe for parallel execution

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559260#comment-16559260
 ] 

ASF GitHub Bot commented on HIVE-17615:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/259


> Task.executeTask has to be thread safe for parallel execution
> -
>
> Key: HIVE-17615
> URL: https://issues.apache.org/jira/browse/HIVE-17615
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17615.0.patch
>
>
> With parallel execution enabled we should make sure that 
> {{Task.executeTask}} is thread safe, which is not the case with the 
> hiveHistory object.
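
A minimal sketch of the concern (hypothetical types; the StringBuilder stands 
in for any shared, non-thread-safe collaborator like hiveHistory):

{code:java}
// If executeTask touches shared mutable state, concurrent tasks must
// serialize on it; everything else can still run in parallel.
class TaskSketch {
  private static final Object HISTORY_LOCK = new Object();
  private final StringBuilder sharedHistory; // non-thread-safe shared state

  TaskSketch(StringBuilder sharedHistory) { this.sharedHistory = sharedHistory; }

  void executeTask(String taskId) {
    synchronized (HISTORY_LOCK) { // guard only the unsafe section
      sharedHistory.append(taskId).append('\n');
    }
    // ... the rest of the task body can run concurrently ...
  }
}
{code}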



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17426) Execution framework in hive to run tasks in parallel

2018-07-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17426:
--
Labels: pull-request-available  (was: )

> Execution framework in hive to run tasks in parallel
> 
>
> Key: HIVE-17426
> URL: https://issues.apache.org/jira/browse/HIVE-17426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17426.0.patch, HIVE-17426.1.patch, 
> HIVE-17426.2.patch, HIVE-17426.3.patch, HIVE-17426.4.patch, 
> HIVE-17426.5.patch, HIVE-17426.6.patch
>
>
> The execution framework currently only runs MR / Spark tasks in parallel 
> when {{set hive.exec.parallel=true}}.
> Allow other types of tasks to run in parallel as well to support replication 
> scenarios in Hive. 
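
A minimal sketch of the general idea, running independent tasks on a pool 
(illustrative only, not the framework's API):

{code:java}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Submit every runnable task to a pool instead of executing them serially;
// tasks with unmet dependencies would be held back until their parents finish.
class ParallelTaskRunnerSketch {
  static void runAll(List<Runnable> runnableTasks, int poolSize) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    runnableTasks.forEach(pool::submit);
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
  }
}
{code}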



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17830) dbnotification fails to work with rdbms other than postgres

2018-07-26 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17830:
--
Labels: pull-request-available  (was: )

> dbnotification fails to work with rdbms other than postgres
> ---
>
> Key: HIVE-17830
> URL: https://issues.apache.org/jira/browse/HIVE-17830
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: anishek
>Assignee: Daniel Dai
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17830.0.patch, HIVE-17830.1.patch
>
>
> As part of HIVE-17721 we had changed the direct SQL to acquire the lock for 
> Postgres as
> {code}
> select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update;
> {code}
> However, this breaks other databases, so we have to use different SQL 
> statements for different databases: 
> for Postgres use
> {code}
> select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update;
> {code}
> for SQLServer 
> {code}
> select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" with (updlock);
> {code}
> for other databases 
> {code}
> select NEXT_EVENT_ID from NOTIFICATION_SEQUENCE for update;
> {code}
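
A sketch of how the dispatch could look, picking one of the three statements 
above by JDBC product name; detection via metadata is an assumption here, not 
necessarily what the patch does:

{code:java}
import java.sql.Connection;
import java.sql.SQLException;

// Pick the lock statement per backing RDBMS, per the three variants above.
class NotificationLockSqlSketch {
  static String lockStatement(Connection conn) throws SQLException {
    String product = conn.getMetaData().getDatabaseProductName().toLowerCase();
    if (product.contains("postgres")) {
      return "select \"NEXT_EVENT_ID\" from \"NOTIFICATION_SEQUENCE\" for update";
    } else if (product.contains("sql server")) {
      return "select \"NEXT_EVENT_ID\" from \"NOTIFICATION_SEQUENCE\" with (updlock)";
    } else {
      return "select NEXT_EVENT_ID from NOTIFICATION_SEQUENCE for update";
    }
  }
}
{code}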



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18352) introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559256#comment-16559256
 ] 

ASF GitHub Bot commented on HIVE-18352:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/286


> introduce a METADATAONLY option while doing REPL DUMP to allow integrations 
> of other tools 
> ---
>
> Key: HIVE-18352
> URL: https://issues.apache.org/jira/browse/HIVE-18352
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18352.0.patch, HIVE-18352.1.patch, 
> HIVE-18352.2.patch
>
>
> * Introduce a METADATAONLY option as part of the REPL DUMP command which will 
> only dump out events for DDL changes; this will be faster as we won't need a 
> scan of files on HDFS for DML changes. 
> * Additionally, since we are only going to dump metadata operations, it might 
> be useful to include ACID tables via an option as well. This option can be 
> removed when ACID support is complete via HIVE-18320.
> It will be good to support the "WITH" clause as part of the REPL DUMP command 
> as well (repl dump already supports it via HIVE-17757) to achieve the above, 
> as that will require fewer changes to the syntax of the statement and provide 
> more flexibility in future to include additional options as well. 
> {code}
> REPL DUMP [db_name] {FROM [event_id]} {TO [event_id]} {WITH 
> (['key'='value'],.)}
> {code}
> This will enable other tools like security / schema registry / metadata 
> discovery to use the replication-related subsystem for their needs as well. 
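
For example, the metadata-only behavior could then be toggled per command; the 
key name below is illustrative only, the actual configuration property may 
differ:

{code}
REPL DUMP sales_db FROM 100 WITH ('hive.repl.dump.metadata.only'='true');
{code}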



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18467) support whole warehouse dump / load + create/drop database events

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559254#comment-16559254
 ] 

ASF GitHub Bot commented on HIVE-18467:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/300


> support whole warehouse dump / load + create/drop database events
> -
>
> Key: HIVE-18467
> URL: https://issues.apache.org/jira/browse/HIVE-18467
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18467.0.patch, HIVE-18467.1.patch, 
> HIVE-18467.2.patch, HIVE-18467.5.patch, HIVE-18467.6.patch
>
>
> A complete Hive warehouse might need to be replicated to a DR site for 
> certain use cases, so rather than allowing only a database name in the REPL 
> DUMP command, we should allow dumping of all databases using the "*" option, 
> as in 
> _REPL DUMP *_ 
> On the repl load side there will not be an option to specify the database 
> name when loading from a location used to dump multiple databases, hence only 
> _REPL LOAD FROM [location]_ would be supported when dumping via _REPL DUMP *_.
> Additionally, incremental dumps will go through all events across databases 
> in a warehouse, hence CREATE / DROP DATABASE events have to be serialized 
> correctly to allow repl load to create them correctly. 
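
As an illustration of the command pair (the dump location below is made up):

{code}
REPL DUMP *;
REPL LOAD FROM '/apps/hive/repl/dump-root';
{code}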



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17830) dbnotification fails to work with rdbms other than postgres

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559258#comment-16559258
 ] 

ASF GitHub Bot commented on HIVE-17830:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/263


> dbnotification fails to work with rdbms other than postgres
> ---
>
> Key: HIVE-17830
> URL: https://issues.apache.org/jira/browse/HIVE-17830
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: anishek
>Assignee: Daniel Dai
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17830.0.patch, HIVE-17830.1.patch
>
>
> As part of HIVE-17721 we had changed the direct SQL to acquire the lock for 
> Postgres as
> {code}
> select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update;
> {code}
> However, this breaks other databases, so we have to use different SQL 
> statements for different databases: 
> for Postgres use
> {code}
> select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update;
> {code}
> for SQLServer 
> {code}
> select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" with (updlock);
> {code}
> for other databases 
> {code}
> select NEXT_EVENT_ID from NOTIFICATION_SEQUENCE for update;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17426) Execution framework in hive to run tasks in parallel

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559261#comment-16559261
 ] 

ASF GitHub Bot commented on HIVE-17426:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/246


> Execution framework in hive to run tasks in parallel
> 
>
> Key: HIVE-17426
> URL: https://issues.apache.org/jira/browse/HIVE-17426
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17426.0.patch, HIVE-17426.1.patch, 
> HIVE-17426.2.patch, HIVE-17426.3.patch, HIVE-17426.4.patch, 
> HIVE-17426.5.patch, HIVE-17426.6.patch
>
>
> The execution framework currently only runs MR / Spark tasks in parallel 
> when {{set hive.exec.parallel=true}}.
> Allow other types of tasks to run in parallel as well to support replication 
> scenarios in Hive. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20152) reset db state, when repl dump fails, so rename table can be done

2018-07-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559253#comment-16559253
 ] 

ASF GitHub Bot commented on HIVE-20152:
---

Github user anishek closed the pull request at:

https://github.com/apache/hive/pull/399


> reset db state, when repl dump fails, so rename table can be done
> -
>
> Key: HIVE-20152
> URL: https://issues.apache.org/jira/browse/HIVE-20152
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20152.1.patch, HIVE-20152.2.patch, 
> HIVE-20152.3.patch, HIVE-20152.4.patch
>
>
> If a repl dump command is run and it fails for some reason while doing table 
> level dumps, the state set on the db parameters is not reset and hence no 
> table / partition renames can be done. 
> the property to be reset is prefixed with the key 
> {code}bootstrap.dump.state{code}
> and it should be unset. Meanwhile, the workaround is: 
> {code}
> describe database extended [db_name]; 
> {code}
> assuming property is 'bootstrap.dump.state.something'
> {code}
> alter database [db_name] set dbproperties 
> ('bootstrap.dump.state.something'='idle');
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19568) Active/Passive HS2 HA: Disallow direct connection to passive HS2 instance

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559248#comment-16559248
 ] 

Hive QA commented on HIVE-19568:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}347m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
37s{color} | {color:red} branch/itests/hive-unit cannot run setBugDatabaseInfo 
from findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
50s{color} | {color:red} branch/itests/util cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
20s{color} | {color:red} branch/service cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
35s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
44s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} itests/hive-unit: The patch generated 3 new + 22 
unchanged - 0 fixed = 25 total (was 22) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
37s{color} | {color:red} itests/util: The patch generated 1 new + 16 unchanged 
- 0 fixed = 17 total (was 16) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
59s{color} | {color:red} service: The patch generated 2 new + 96 unchanged - 6 
fixed = 98 total (was 102) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
35s{color} | {color:red} patch/itests/hive-unit cannot run setBugDatabaseInfo 
from findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
47s{color} | {color:red} patch/itests/util cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
36s{color} | {color:red} patch/service cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}378m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12884/dev-support/hive-personality.sh
 |
| git revision | master / 5a3f12d |
| Default Java | 1.8.0_111 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12884/yetus/branch-findbugs-itests_hive-unit.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12884/yetus/branch-findbugs-itests_util.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12884/yetus/branch-findbugs-service.txt
 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12884/yetus/patch-mvninstall-itests_hive-unit.txt
 |
| checkstyle | 

[jira] [Commented] (HIVE-19694) Create Materialized View statement should check for MV name conflicts before running MV's SQL statement.

2018-07-26 Thread Miklos Gergely (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559246#comment-16559246
 ] 

Miklos Gergely commented on HIVE-19694:
---

Patch ready, also removed almost 100 unused lines from the SemanticAnalyzer.

> Create Materialized View statement should check for MV name conflicts before 
> running MV's SQL statement. 
> -
>
> Key: HIVE-19694
> URL: https://issues.apache.org/jira/browse/HIVE-19694
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Nita Dembla
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.1
>
> Attachments: HIVE-19694.01.patch
>
>
> If the CREATE MATERIALIZED VIEW statement refers to an MV name that already 
> exists, the statement runs the SQL on the cluster and the Move task returns 
> an error at the very end.
> This unnecessarily uses up cluster resources and user time.
>  
> {code:java}
> 0: jdbc:hive2://localhost:10007/tpcds_bin_par> CREATE MATERIALIZED VIEW 
> mv_store_sales_item_store
> . . . . . . . . . . . . . . . . . . . . . . .> ENABLE REWRITE AS (
> . . . . . . . . . . . . . . . . . . . . . . .>  select ss_item_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  ss_store_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_quantity) as 
> ss_quantity,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_ext_wholesale_cost) as 
> ss_ext_wholesale_cost,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_paid) as 
> ss_net_paid,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_profit) as 
> ss_net_profit
> . . . . . . . . . . . . . . . . . . . . . . .>  from store_sales
> . . . . . . . . . . . . . . . . . . . . . . .>  group by 
> ss_item_sk,ss_store_sk
> . . . . . . . . . . . . . . . . . . . . . . .>  );
> INFO  : Compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> |   `ss_store_sk` bigint,    |
> |   `ss_quantity` bigint,    |
> |   `ss_ext_wholesale_cost` double,  |
> |   `ss_net_paid` double,    |
> |   `ss_net_profit` double)  |
> . . . . . . . . . . . . . . . . . . . . . . .>  from store_sales
> . . . . . . . . . . . . . . . . . . . . . . .>  group by 
> ss_item_sk,ss_store_sk
> . . . . . . . . . . . . . . . . . . . . . . .>  );
> INFO  : Compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:ss_item_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_store_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_quantity, type:bigint, comment:null), 
> FieldSchema(name:ss_ext_wholesale_cost, type:double, comment:null), 
> FieldSchema(name:ss_net_paid, type:double, comment:null), 
> FieldSchema(name:ss_net_profit, type:double, comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037); 
> Time taken: 3.652 seconds
> INFO  : Executing 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Query ID = root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Total jobs = 1
> INFO  : Launching Job 1 out of 1
> INFO  : Starting task [Stage-1:MAPRED] in serial mode
> INFO  : Subscribed to counters: [] for queryId: 
> root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Session is already open
> INFO  : Dag name: CREATE MATERIALIZED V...tem_sk,ss_store_sk
> ) (Stage-1)
> INFO  : Status: Running (Executing on YARN cluster with App id 
> application_1525123931791_0151)
> --
>     VERTICES  MODE    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  
> FAILED  KILLED
> 

[jira] [Updated] (HIVE-19694) Create Materialized View statement should check for MV name conflicts before running MV's SQL statement.

2018-07-26 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-19694:
--
Attachment: HIVE-19694.01.patch

> Create Materialized View statement should check for MV name conflicts before 
> running MV's SQL statement. 
> -
>
> Key: HIVE-19694
> URL: https://issues.apache.org/jira/browse/HIVE-19694
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Nita Dembla
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.1
>
> Attachments: HIVE-19694.01.patch
>
>
> If the CREATE MATERIALIZED VIEW statement refers to an MV name that already 
> exists, the statement runs the SQL on the cluster and the Move task returns 
> an error at the very end.
> This unnecessarily uses up cluster resources and user time.
>  
> {code:java}
> 0: jdbc:hive2://localhost:10007/tpcds_bin_par> CREATE MATERIALIZED VIEW 
> mv_store_sales_item_store
> . . . . . . . . . . . . . . . . . . . . . . .> ENABLE REWRITE AS (
> . . . . . . . . . . . . . . . . . . . . . . .>  select ss_item_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  ss_store_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_quantity) as 
> ss_quantity,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_ext_wholesale_cost) as 
> ss_ext_wholesale_cost,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_paid) as 
> ss_net_paid,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_profit) as 
> ss_net_profit
> . . . . . . . . . . . . . . . . . . . . . . .>  from store_sales
> . . . . . . . . . . . . . . . . . . . . . . .>  group by 
> ss_item_sk,ss_store_sk
> . . . . . . . . . . . . . . . . . . . . . . .>  );
> INFO  : Compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:ss_item_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_store_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_quantity, type:bigint, comment:null), 
> FieldSchema(name:ss_ext_wholesale_cost, type:double, comment:null), 
> FieldSchema(name:ss_net_paid, type:double, comment:null), 
> FieldSchema(name:ss_net_profit, type:double, comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037); 
> Time taken: 3.652 seconds
> INFO  : Executing 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Query ID = root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Total jobs = 1
> INFO  : Launching Job 1 out of 1
> INFO  : Starting task [Stage-1:MAPRED] in serial mode
> INFO  : Subscribed to counters: [] for queryId: 
> root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Session is already open
> INFO  : Dag name: CREATE MATERIALIZED V...tem_sk,ss_store_sk
> ) (Stage-1)
> INFO  : Status: Running (Executing on YARN cluster with App id 
> application_1525123931791_0151)
> --
>     VERTICES  MODE    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  
> FAILED  KILLED
> --
> Map 1 ..  llap 

[jira] [Updated] (HIVE-19694) Create Materialized View statement should check for MV name conflicts before running MV's SQL statement.

2018-07-26 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-19694:
--
Status: Patch Available  (was: Open)

> Create Materialized View statement should check for MV name conflicts before 
> running MV's SQL statement. 
> -
>
> Key: HIVE-19694
> URL: https://issues.apache.org/jira/browse/HIVE-19694
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Nita Dembla
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.1
>
> Attachments: HIVE-19694.01.patch
>
>
> If the CREATE MATERIALIZED VIEW statement refers to an MV name that already 
> exists, the statement runs the SQL on the cluster and the Move task returns 
> an error at the very end.
> This unnecessarily uses up cluster resources and user time.
>  
> {code:java}
> 0: jdbc:hive2://localhost:10007/tpcds_bin_par> CREATE MATERIALIZED VIEW 
> mv_store_sales_item_store
> . . . . . . . . . . . . . . . . . . . . . . .> ENABLE REWRITE AS (
> . . . . . . . . . . . . . . . . . . . . . . .>  select ss_item_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  ss_store_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_quantity) as 
> ss_quantity,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_ext_wholesale_cost) as 
> ss_ext_wholesale_cost,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_paid) as 
> ss_net_paid,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_profit) as 
> ss_net_profit
> . . . . . . . . . . . . . . . . . . . . . . .>  from store_sales
> . . . . . . . . . . . . . . . . . . . . . . .>  group by 
> ss_item_sk,ss_store_sk
> . . . . . . . . . . . . . . . . . . . . . . .>  );
> INFO  : Compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:ss_item_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_store_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_quantity, type:bigint, comment:null), 
> FieldSchema(name:ss_ext_wholesale_cost, type:double, comment:null), 
> FieldSchema(name:ss_net_paid, type:double, comment:null), 
> FieldSchema(name:ss_net_profit, type:double, comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037); 
> Time taken: 3.652 seconds
> INFO  : Executing 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Query ID = root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Total jobs = 1
> INFO  : Launching Job 1 out of 1
> INFO  : Starting task [Stage-1:MAPRED] in serial mode
> INFO  : Subscribed to counters: [] for queryId: 
> root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Session is already open
> INFO  : Dag name: CREATE MATERIALIZED V...tem_sk,ss_store_sk
> ) (Stage-1)
> INFO  : Status: Running (Executing on YARN cluster with App id 
> application_1525123931791_0151)
> --
>     VERTICES  MODE    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  
> FAILED  KILLED
> --
> Map 1 ..  llap   

[jira] [Updated] (HIVE-19694) Create Materialized View statement should check for MV name conflicts before running MV's SQL statement.

2018-07-26 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-19694:
--
Attachment: (was: HIVE-19694.patch)

> Create Materialized View statement should check for MV name conflicts before 
> running MV's SQL statement. 
> -
>
> Key: HIVE-19694
> URL: https://issues.apache.org/jira/browse/HIVE-19694
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Nita Dembla
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.1
>
>
> If the CREATE MATERIALIZED VIEW statement refers to an MV name that already 
> exists, the statement runs the SQL on the cluster and the Move task returns 
> an error at the very end.
> This unnecessarily uses up cluster resources and user time.
>  
> {code:java}
> 0: jdbc:hive2://localhost:10007/tpcds_bin_par> CREATE MATERIALIZED VIEW 
> mv_store_sales_item_store
> . . . . . . . . . . . . . . . . . . . . . . .> ENABLE REWRITE AS (
> . . . . . . . . . . . . . . . . . . . . . . .>  select ss_item_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  ss_store_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_quantity) as 
> ss_quantity,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_ext_wholesale_cost) as 
> ss_ext_wholesale_cost,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_paid) as 
> ss_net_paid,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_profit) as 
> ss_net_profit
> . . . . . . . . . . . . . . . . . . . . . . .>  from store_sales
> . . . . . . . . . . . . . . . . . . . . . . .>  group by 
> ss_item_sk,ss_store_sk
> . . . . . . . . . . . . . . . . . . . . . . .>  );
> INFO  : Compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:ss_item_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_store_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_quantity, type:bigint, comment:null), 
> FieldSchema(name:ss_ext_wholesale_cost, type:double, comment:null), 
> FieldSchema(name:ss_net_paid, type:double, comment:null), 
> FieldSchema(name:ss_net_profit, type:double, comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037); 
> Time taken: 3.652 seconds
> INFO  : Executing 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Query ID = root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Total jobs = 1
> INFO  : Launching Job 1 out of 1
> INFO  : Starting task [Stage-1:MAPRED] in serial mode
> INFO  : Subscribed to counters: [] for queryId: 
> root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Session is already open
> INFO  : Dag name: CREATE MATERIALIZED V...tem_sk,ss_store_sk
> ) (Stage-1)
> INFO  : Status: Running (Executing on YARN cluster with App id 
> application_1525123931791_0151)
> --
>     VERTICES  MODE    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  
> FAILED  KILLED
> --
> Map 1 ..  llap SUCCEEDED   1682   1682    0

[jira] [Updated] (HIVE-19694) Create Materialized View statement should check for MV name conflicts before running MV's SQL statement.

2018-07-26 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-19694:
--
Attachment: HIVE-19694.patch

> Create Materialized View statement should check for MV name conflicts before 
> running MV's SQL statement. 
> -
>
> Key: HIVE-19694
> URL: https://issues.apache.org/jira/browse/HIVE-19694
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Nita Dembla
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.1
>
>
> If the CREATE MATERIALIZED VIEW statement refers to an MV name that already 
> exists, the statement runs the SQL on the cluster and the Move task returns 
> an error at the very end.
> This unnecessarily uses up cluster resources and user time.
>  
> {code:java}
> 0: jdbc:hive2://localhost:10007/tpcds_bin_par> CREATE MATERIALIZED VIEW 
> mv_store_sales_item_store
> . . . . . . . . . . . . . . . . . . . . . . .> ENABLE REWRITE AS (
> . . . . . . . . . . . . . . . . . . . . . . .>  select ss_item_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  ss_store_sk,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_quantity) as 
> ss_quantity,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_ext_wholesale_cost) as 
> ss_ext_wholesale_cost,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_paid) as 
> ss_net_paid,
> . . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_profit) as 
> ss_net_profit
> . . . . . . . . . . . . . . . . . . . . . . .>  from store_sales
> . . . . . . . . . . . . . . . . . . . . . . .>  group by 
> ss_item_sk,ss_store_sk
> . . . . . . . . . . . . . . . . . . . . . . .>  );
> INFO  : Compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:ss_item_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_store_sk, type:bigint, comment:null), 
> FieldSchema(name:ss_quantity, type:bigint, comment:null), 
> FieldSchema(name:ss_ext_wholesale_cost, type:double, comment:null), 
> FieldSchema(name:ss_net_paid, type:double, comment:null), 
> FieldSchema(name:ss_net_profit, type:double, comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037); 
> Time taken: 3.652 seconds
> INFO  : Executing 
> command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
> CREATE MATERIALIZED VIEW mv_store_sales_item_store
> ENABLE REWRITE AS (
> select ss_item_sk,
> ss_store_sk,
> sum(ss_quantity) as ss_quantity,
> sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
> sum(ss_net_paid) as ss_net_paid,
> sum(ss_net_profit) as ss_net_profit
> from store_sales
> group by ss_item_sk,ss_store_sk
> )
> INFO  : Query ID = root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Total jobs = 1
> INFO  : Launching Job 1 out of 1
> INFO  : Starting task [Stage-1:MAPRED] in serial mode
> INFO  : Subscribed to counters: [] for queryId: 
> root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
> INFO  : Session is already open
> INFO  : Dag name: CREATE MATERIALIZED V...tem_sk,ss_store_sk
> ) (Stage-1)
> INFO  : Status: Running (Executing on YARN cluster with App id 
> application_1525123931791_0151)
> --
>     VERTICES  MODE    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  
> FAILED  KILLED
> --
> Map 1 ..  llap SUCCEEDED   1682   1682    0    0  
>  
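The check the summary asks for amounts to an up-front existence test during 
analysis. A minimal sketch, assuming Hive's metastore client handle (db) and 
the usual semantic-analyzer exception style; the attached patch may implement 
it differently:

{code:java}
// Hedged sketch -- names and placement are assumed, not the committed fix.
// Fail at compile time if the MV name is already taken, instead of running
// the full query and letting the Move task report the conflict at the end.
private void checkMaterializedViewDoesNotExist(Hive db, String mvName)
    throws SemanticException {
  try {
    if (db.getTable(mvName, false) != null) {
      throw new SemanticException(
          "Table or materialized view " + mvName + " already exists");
    }
  } catch (HiveException e) {
    throw new SemanticException(e);
  }
}
{code}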

[jira] [Commented] (HIVE-20244) forward port HIVE-19704 to master

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559235#comment-16559235
 ] 

Hive QA commented on HIVE-20244:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933262/HIVE-20244.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 14812 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz]
 (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_joins]
 (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_masking]
 (batchId=193)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12888/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12888/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12888/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933262 - PreCommit-HIVE-Build

> forward port HIVE-19704 to master
> -
>
> Key: HIVE-20244
> URL: https://issues.apache.org/jira/browse/HIVE-20244
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20244.01.patch, HIVE-20244.patch
>
>
> Apparently this logic is still there and can be engaged in some cases, like 
> when one file takes the entire cache from a single large read.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20239) Do Not Print StackTraces to STDERR in MapJoinProcessor

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559197#comment-16559197
 ] 

Hive QA commented on HIVE-20239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933264/HIVE-20239.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14812 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12887/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12887/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12887/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933264 - PreCommit-HIVE-Build

> Do Not Print StackTraces to STDERR in MapJoinProcessor
> --
>
> Key: HIVE-20239
> URL: https://issues.apache.org/jira/browse/HIVE-20239
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Fix For: 4.0.0
>
> Attachments: HIVE-20239.1.patch
>
>
> {code:java|title=MapJoinProcessor.java}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new SemanticException("Failed to generate new mapJoin operator " +
>   "by exception : " + e.getMessage());
> }
> {code}
> Please change to... something like...
> {code}
> } catch (Exception e) {
>   throw new SemanticException("Failed to generate new mapJoin operator", 
> e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20239) Do Not Print StackTraces to STDERR in MapJoinProcessor

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559178#comment-16559178
 ] 

Hive QA commented on HIVE-20239:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
52s{color} | {color:blue} ql in master has 2296 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12887/dev-support/hive-personality.sh
 |
| git revision | master / 1ad4882 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12887/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Do Not Print StackTraces to STDERR in MapJoinProcessor
> --
>
> Key: HIVE-20239
> URL: https://issues.apache.org/jira/browse/HIVE-20239
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Fix For: 4.0.0
>
> Attachments: HIVE-20239.1.patch
>
>
> {code:java|title=MapJoinProcessor.java}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new SemanticException("Failed to generate new mapJoin operator " +
>   "by exception : " + e.getMessage());
> }
> {code}
> Please change to... something like...
> {code}
> } catch (Exception e) {
>   throw new SemanticException("Failed to generate new mapJoin operator", 
> e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20156) Printing Stacktrace to STDERR

2018-07-26 Thread Andrew Sherman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559170#comment-16559170
 ] 

Andrew Sherman commented on HIVE-20156:
---

Thanks for looking at this, [~ngangam]. Please push to master at your 
convenience.

> Printing Stacktrace to STDERR
> -
>
> Key: HIVE-20156
> URL: https://issues.apache.org/jira/browse/HIVE-20156
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Andrew Sherman
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20156.1.patch
>
>
> Class {{org.apache.hadoop.hive.ql.exec.JoinOperator}} has the following code:
> {code}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(e);
> }
> {code}
> Do not print the stack trace to STDERR with a call to {{printStackTrace()}}.  
> Please remove that line and let the code catching the {{HiveException}} worry 
> about printing any messages through a logger.
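
For reference, a minimal sketch of the requested change in the JoinOperator 
catch block, in the same spirit as the HIVE-20239 snippet (a sketch, not the 
attached patch):

{code:java}
} catch (Exception e) {
  // Drop the printStackTrace() call and wrap the cause instead, so the code
  // catching the HiveException can log it through a logger.
  throw new HiveException(e);
}
{code}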



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559163#comment-16559163
 ] 

Hive QA commented on HIVE-20153:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933260/HIVE-20153.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14812 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12886/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12886/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12886/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933260 - PreCommit-HIVE-Build

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-20153.1.patch, Screen Shot 2018-07-12 at 6.41.28 
> PM.png
>
>
> While playing with Hive 2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side where they worked 
> before in Hive 1. 
> In many queries we have to double the mapper memory settings (in our 
> particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it not so easy to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.
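
A schematic of why such a field is costly follows; this is a simplified 
illustration, not the actual GenericUDAFSum code:

{code:java}
import java.util.HashSet;
import java.util.Set;

// Simplified sketch. A plain running sum keeps O(1) state per group; a set
// that remembers every distinct input object for windowing support grows
// with the data and pins it on the heap -- which is what the heap dump shows.
class SumAggBufferSketch {
  double sum;                                   // O(1), the Hive 1 behavior
  Set<Object> uniqueObjects = new HashSet<>();  // one reference per distinct row

  void iterate(Object rowId, double value) {
    if (uniqueObjects.add(rowId)) {
      sum += value;
    }
  }
}
{code}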



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20032) Don't serialize hashCode for repartitionAndSortWithinPartitions

2018-07-26 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559162#comment-16559162
 ] 

Rui Li commented on HIVE-20032:
---

+1

> Don't serialize hashCode for repartitionAndSortWithinPartitions
> ---
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch, HIVE-20032.5.patch, 
> HIVE-20032.6.patch, HIVE-20032.7.patch, HIVE-20032.8.patch, 
> HIVE-20032.9.patch, HIVE-20032.91.patch, HIVE-20032.92.patch
>
>
> Follow-up to HIVE-15104: if we don't enable RDD caching or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.
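
A rough sketch of the idea (shape and names assumed, not the actual HoS 
serializer):

{code:java}
// With repartitionAndSortWithinPartitions the partition is fixed before the
// shuffle, so the precomputed hashCode is dead weight on the wire; the key
// bytes alone are enough to sort within a partition.
void writeKey(java.io.DataOutput out, HiveKey key, boolean needsHashCode)
    throws java.io.IOException {
  if (needsHashCode) {            // only RDD caching / groupBy shuffles need it
    out.writeInt(key.hashCode());
  }
  out.writeInt(key.getLength());
  out.write(key.getBytes(), 0, key.getLength());
}
{code}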



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559161#comment-16559161
 ] 

Hive QA commented on HIVE-20153:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 58m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
57s{color} | {color:blue} ql in master has 2296 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12886/dev-support/hive-personality.sh
 |
| git revision | master / 1ad4882 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12886/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-20153.1.patch, Screen Shot 2018-07-12 at 6.41.28 
> PM.png
>
>
> While playing with Hive 2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side where they worked 
> before in Hive 1. 
> In many queries we have to double the mapper memory settings (in our 
> particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it not so easy to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20253) nativetask can't work in hive

2018-07-26 Thread gehaijiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gehaijiang updated HIVE-20253:
--
Description: 
Hadoop 3.0.3 supports nativetask.

mapred-site.xml:

<property>
  <name>mapreduce.job.map.output.collector.class</name>
  <value>org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator</value>
</property>

hive sql: 

set 
mapreduce.job.map.output.collector.class=org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator;
 select count(*) from test_cold;   --test_cold  (orcfile table)

 

URL:
 
[http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1532646043398_0019=task_1532646043398_0019_m_00]

Diagnostic Messages for this Task:
 Error: java.io.IOException: Initialization of all the collectors failed. Error 
in last collector was:java.io.IOException: Cannot find serializer for 
org.apache.hadoop.hive.ql.io.HiveKey
 at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:423)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:454)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
 Caused by: java.io.IOException: Cannot find serializer for 
org.apache.hadoop.hive.ql.io.HiveKey
 at 
org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator.init(NativeMapOutputCollectorDelegator.java:127)
 at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:408)
 ... 7 more

 

2018-07-27 10:08:25,391 ERROR operation.Operation (SQLOperation.java:run(209)) 
- Error running hive query:
 org.apache.hive.service.cli.HiveSQLException: Error while processing 
statement: FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:316)
 at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156)
 at 
org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
 at 
org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
 at 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)

  was:
Hadoop 3.0.3 supports nativetask.

mapred-site.xml:

<property>
  <name>mapreduce.job.map.output.collector.class</name>
  <value>org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator</value>
</property>

hive sql: 

set 
mapreduce.job.map.output.collector.class=org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator;
select count(*) from test_cold;

 

URL:
 
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1532646043398_0019=task_1532646043398_0019_m_00
-
Diagnostic Messages for this Task:
Error: java.io.IOException: Initialization of all the collectors failed. Error 
in last collector was:java.io.IOException: Cannot find serializer for 
org.apache.hadoop.hive.ql.io.HiveKey
 at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:423)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:454)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: java.io.IOException: Cannot find serializer for 
org.apache.hadoop.hive.ql.io.HiveKey
 at 
org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator.init(NativeMapOutputCollectorDelegator.java:127)
 at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:408)
 ... 7 more

 

2018-07-27 10:08:25,391 ERROR operation.Operation (SQLOperation.java:run(209)) 
- Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:316)
 at 

[jira] [Commented] (HIVE-20213) Upgrade Calcite to 1.17.0

2018-07-26 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559153#comment-16559153
 ] 

Ashutosh Chauhan commented on HIVE-20213:
-

Yeah.. RB would be useful. Can you please create an RB request?

> Upgrade Calcite to 1.17.0
> -
>
> Key: HIVE-20213
> URL: https://issues.apache.org/jira/browse/HIVE-20213
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20213.01.patch, HIVE-20213.02.patch, 
> HIVE-20213.03.patch, HIVE-20213.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20213) Upgrade Calcite to 1.17.0

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559148#comment-16559148
 ] 

Hive QA commented on HIVE-20213:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}207m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
55s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
28s{color} | {color:red} branch/storage-api cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
29s{color} | {color:red} branch/druid-handler cannot run setBugDatabaseInfo 
from findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
21s{color} | {color:red} branch/jdbc-handler cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
13s{color} | {color:red} branch/ql cannot run setBugDatabaseInfo from findbugs 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  
4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m  
5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} druid-handler: The patch generated 10 new + 135 
unchanged - 4 fixed = 145 total (was 139) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 3 new + 63 unchanged - 1 fixed 
= 66 total (was 64) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
21s{color} | {color:red} patch/storage-api cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
27s{color} | {color:red} patch/druid-handler cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
19s{color} | {color:red} patch/jdbc-handler cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m  
2s{color} | {color:red} patch/ql cannot run setBugDatabaseInfo from findbugs 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}270m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12883/dev-support/hive-personality.sh
 |
| git revision | master / 2d097dc |
| Default Java | 1.8.0_111 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12883/yetus/branch-findbugs-storage-api.txt
 |
| findbugs | 

[jira] [Commented] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559129#comment-16559129
 ] 

Sergey Shelukhin commented on HIVE-20247:
-

The 01 patch addresses another potential issue, where the consumer's refcount 
for cache buffers is not released for a partially read stream on error. It 
needs to be tested on the cluster.

> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20247.01.patch, HIVE-20247.patch
>
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If the allocate or decompress steps fail, it's hard for the higher-level 
> cleanup to tell what the state of the buffers in the read-related structures 
> is - they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the last case, resulting in bugs.
> Moreover, the allocator currently returns partial results on such an error; 
> the allocation should be all-or-nothing.
> This only happens on one path; the others allocate and use buffers in a 
> single place.
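
A minimal sketch of the all-or-nothing behavior described above (method and 
type names assumed, not the LLAP allocator API):

{code:java}
// Either every buffer is allocated or none are, so higher-level cleanup never
// sees a half-populated result.
MemoryBuffer[] allocateAll(int count, int size) {
  MemoryBuffer[] result = new MemoryBuffer[count];
  int done = 0;
  try {
    for (; done < count; ++done) {
      result[done] = allocateOne(size);  // may throw on cache OOM
    }
  } catch (RuntimeException e) {
    for (int i = 0; i < done; ++i) {
      deallocate(result[i]);             // roll back partial work
    }
    throw e;                             // caller sees all-or-nothing
  }
  return result;
}
{code}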



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20248) clean up some TODOs after txn stats merge

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20248:

Attachment: HIVE-20248.01.patch

> clean up some TODOs after txn stats merge
> -
>
> Key: HIVE-20248
> URL: https://issues.apache.org/jira/browse/HIVE-20248
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20248.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20247:

Attachment: HIVE-20247.01.patch

> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20247.01.patch, HIVE-20247.patch
>
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If the allocate or decompress steps fail, it's hard for the higher-level 
> cleanup to tell what the state of the buffers in the read-related structures 
> is - they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the last case, resulting in bugs.
> Moreover, the allocator currently returns partial results on such an error; 
> the allocation should be all-or-nothing.
> This only happens on one path; the others allocate and use buffers in a 
> single place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20248) clean up some TODOs after txn stats merge

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20248:

Attachment: (was: HIVE-20248.01.patch)

> clean up some TODOs after txn stats merge
> -
>
> Key: HIVE-20248
> URL: https://issues.apache.org/jira/browse/HIVE-20248
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20248.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20240) Semijoin Reduction : Use local variable to check for external table condition

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559095#comment-16559095
 ] 

Hive QA commented on HIVE-20240:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933250/HIVE-20240.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14812 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12885/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12885/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12885/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933250 - PreCommit-HIVE-Build

> Semijoin Reduction : Use local variable to check for external table condition
> -
>
> Key: HIVE-20240
> URL: https://issues.apache.org/jira/browse/HIVE-20240
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20240.1.patch
>
>
> This condition,
>  
> semiJoin = semiJoin && 
> !disableSemiJoinOptDueToExternalTable(parseContext.getConf(), ts, ctx);
>  
> may set semiJoin to false once an external table is encountered, and it will 
> then remain false for all subsequent cases. It should only disable the 
> optimization for that particular case.
>  
> cc [~jdere]
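
The difference, as a sketch (the fix itself may be shaped differently):

{code:java}
// Buggy pattern: once one external table flips semiJoin to false, every later
// table scan in the loop is also denied the optimization.
semiJoin = semiJoin &&
    !disableSemiJoinOptDueToExternalTable(parseContext.getConf(), ts, ctx);

// Fixed pattern (sketch): keep the global flag intact and decide per scan.
boolean semiJoinForThisTS = semiJoin &&
    !disableSemiJoinOptDueToExternalTable(parseContext.getConf(), ts, ctx);
{code}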



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20252) Semijoin Reduction : Cycles due to semi join branch may remain undetected if small table side has a map join upstream.

2018-07-26 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-20252:
---
Description: 
For example,

 {noformat}
 # 2018-07-26T17:22:14,664 DEBUG [51377701-dc98-424f-82e0-bbb5d6c84316 main] 
optimizer.SharedWorkOptimizer: Before SharedWorkOptimizer:
 # 
TS[0]-FIL[96]-SEL[2]-MAPJOIN[156]-MAPJOIN[157]-MAPJOIN[161]-MAPJOIN[162]-FIL[47]-SEL[48]-MAPJOIN[163]-FIL[66]-SEL[67]-TNK[105]-GBY[68]-RS[69]-GBY[70]-SEL[71]-RS[72]-SEL[73]-LIM[74]-FS[75]
 #                                                           
-SEL[142]-GBY[143]-RS[144]-GBY[145]-RS[155]
 # TS[3]-FIL[97]-SEL[5]-RS[34]-MAPJOIN[156]
 # TS[6]-FIL[98]-SEL[8]-RS[37]-MAPJOIN[157]
 # TS[9]-FIL[99]-SEL[11]-MAPJOIN[158]-GBY[40]-RS[42]-MAPJOIN[161]
 # TS[12]-FIL[100]-SEL[14]-RS[16]-MAPJOIN[158]
 #                       -SEL[131]-GBY[132]-EVENT[133]
 # 
TS[19]-FIL[101]-SEL[21]-MAPJOIN[159]-GBY[29]-RS[30]-GBY[31]-SEL[32]-RS[45]-MAPJOIN[162]
 # TS[22]-FIL[102]-SEL[24]-RS[26]-MAPJOIN[159]
 #                       -SEL[139]-GBY[140]-EVENT[141]
 # 
TS[49]-FIL[103]-SEL[51]-MAPJOIN[160]-GBY[59]-RS[60]-GBY[61]-SEL[62]-RS[64]-MAPJOIN[163]
 # TS[52]-FIL[104]-SEL[54]-RS[56]-MAPJOIN[160]
 #                       -SEL[147]-GBY[148]-EVENT[149]
 # 
 # 
 # DPP information stored in the cache: \{TS[19]=[EVENT[141]], 
TS[9]=[EVENT[133]], TS[49]=[RS[155], EVENT[149]]}
{noformat}
 

The semi join branch on line 3 feeds into TS[49] on line 12, which feeds into 
MAPJOIN[163], going back to the parent of the semi join branch on line 2.

The logic to detect the cycle may fail because there is a MAPJOIN[160] on 
line 12, which could cause the logic to look for the wrong TS. The logic to 
find the TS operator upstream must use findOperatorsUpstream() and examine 
each TS op for complete coverage.

 

cc [~jcamachorodriguez]

  was:
For example,

 
 # 2018-07-26T17:22:14,664 DEBUG [51377701-dc98-424f-82e0-bbb5d6c84316 main] 
optimizer.SharedWorkOptimizer: Before SharedWorkOptimizer:
 # 
TS[0]-FIL[96]-SEL[2]-MAPJOIN[156]-MAPJOIN[157]-MAPJOIN[161]-MAPJOIN[162]-FIL[47]-SEL[48]-MAPJOIN[163]-FIL[66]-SEL[67]-TNK[105]-GBY[68]-RS[69]-GBY[70]-SEL[71]-RS[72]-SEL[73]-LIM[74]-FS[75]
 #                                                           
-SEL[142]-GBY[143]-RS[144]-GBY[145]-RS[155]
 # TS[3]-FIL[97]-SEL[5]-RS[34]-MAPJOIN[156]
 # TS[6]-FIL[98]-SEL[8]-RS[37]-MAPJOIN[157]
 # TS[9]-FIL[99]-SEL[11]-MAPJOIN[158]-GBY[40]-RS[42]-MAPJOIN[161]
 # TS[12]-FIL[100]-SEL[14]-RS[16]-MAPJOIN[158]
 #                       -SEL[131]-GBY[132]-EVENT[133]
 # 
TS[19]-FIL[101]-SEL[21]-MAPJOIN[159]-GBY[29]-RS[30]-GBY[31]-SEL[32]-RS[45]-MAPJOIN[162]
 # TS[22]-FIL[102]-SEL[24]-RS[26]-MAPJOIN[159]
 #                       -SEL[139]-GBY[140]-EVENT[141]
 # 
TS[49]-FIL[103]-SEL[51]-MAPJOIN[160]-GBY[59]-RS[60]-GBY[61]-SEL[62]-RS[64]-MAPJOIN[163]
 # TS[52]-FIL[104]-SEL[54]-RS[56]-MAPJOIN[160]
 #                       -SEL[147]-GBY[148]-EVENT[149]
 # 
 # 
 # DPP information stored in the cache: \{TS[19]=[EVENT[141]], 
TS[9]=[EVENT[133]], TS[49]=[RS[155], EVENT[149]]}

 

The semi join branch on line 3 feeds into TS[49] on line 12, which feeds into 
MAPJOIN[163], going back to the parent of the semi join branch on line 2.

The logic to detect the cycle may fail because there is a MAPJOIN[160] on 
line 12, which could cause the logic to look for the wrong TS. The logic to 
find the TS operator upstream must use findOperatorsUpstream() and examine 
each TS op for complete coverage.

 

cc [~jcamachorodriguez]


> Semijoin Reduction : Cycles due to semi join branch may remain undetected if 
> small table side has a map join upstream.
> --
>
> Key: HIVE-20252
> URL: https://issues.apache.org/jira/browse/HIVE-20252
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20252.1.patch
>
>
> For example,
>  {noformat}
>  # 2018-07-26T17:22:14,664 DEBUG [51377701-dc98-424f-82e0-bbb5d6c84316 main] 
> optimizer.SharedWorkOptimizer: Before SharedWorkOptimizer:
>  # 
> TS[0]-FIL[96]-SEL[2]-MAPJOIN[156]-MAPJOIN[157]-MAPJOIN[161]-MAPJOIN[162]-FIL[47]-SEL[48]-MAPJOIN[163]-FIL[66]-SEL[67]-TNK[105]-GBY[68]-RS[69]-GBY[70]-SEL[71]-RS[72]-SEL[73]-LIM[74]-FS[75]
>  #                                                           
> -SEL[142]-GBY[143]-RS[144]-GBY[145]-RS[155]
>  # TS[3]-FIL[97]-SEL[5]-RS[34]-MAPJOIN[156]
>  # TS[6]-FIL[98]-SEL[8]-RS[37]-MAPJOIN[157]
>  # TS[9]-FIL[99]-SEL[11]-MAPJOIN[158]-GBY[40]-RS[42]-MAPJOIN[161]
>  # TS[12]-FIL[100]-SEL[14]-RS[16]-MAPJOIN[158]
>  #                       -SEL[131]-GBY[132]-EVENT[133]
>  # 
> TS[19]-FIL[101]-SEL[21]-MAPJOIN[159]-GBY[29]-RS[30]-GBY[31]-SEL[32]-RS[45]-MAPJOIN[162]
>  # TS[22]-FIL[102]-SEL[24]-RS[26]-MAPJOIN[159]
>  #                       

[jira] [Updated] (HIVE-20252) Semijoin Reduction : Cycles due to semi join branch may remain undetected if small table side has a map join upstream.

2018-07-26 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20252:
--
Attachment: HIVE-20252.1.patch

> Semijoin Reduction : Cycles due to semi join branch may remain undetected if 
> small table side has a map join upstream.
> --
>
> Key: HIVE-20252
> URL: https://issues.apache.org/jira/browse/HIVE-20252
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20252.1.patch
>
>
> For eg,
>  
>  # 2018-07-26T17:22:14,664 DEBUG [51377701-dc98-424f-82e0-bbb5d6c84316 main] 
> optimizer.SharedWorkOptimizer: Before SharedWorkOptimizer:
>  # 
> TS[0]-FIL[96]-SEL[2]-MAPJOIN[156]-MAPJOIN[157]-MAPJOIN[161]-MAPJOIN[162]-FIL[47]-SEL[48]-MAPJOIN[163]-FIL[66]-SEL[67]-TNK[105]-GBY[68]-RS[69]-GBY[70]-SEL[71]-RS[72]-SEL[73]-LIM[74]-FS[75]
>  #                                                           
> -SEL[142]-GBY[143]-RS[144]-GBY[145]-RS[155]
>  # TS[3]-FIL[97]-SEL[5]-RS[34]-MAPJOIN[156]
>  # TS[6]-FIL[98]-SEL[8]-RS[37]-MAPJOIN[157]
>  # TS[9]-FIL[99]-SEL[11]-MAPJOIN[158]-GBY[40]-RS[42]-MAPJOIN[161]
>  # TS[12]-FIL[100]-SEL[14]-RS[16]-MAPJOIN[158]
>  #                       -SEL[131]-GBY[132]-EVENT[133]
>  # 
> TS[19]-FIL[101]-SEL[21]-MAPJOIN[159]-GBY[29]-RS[30]-GBY[31]-SEL[32]-RS[45]-MAPJOIN[162]
>  # TS[22]-FIL[102]-SEL[24]-RS[26]-MAPJOIN[159]
>  #                       -SEL[139]-GBY[140]-EVENT[141]
>  # 
> TS[49]-FIL[103]-SEL[51]-MAPJOIN[160]-GBY[59]-RS[60]-GBY[61]-SEL[62]-RS[64]-MAPJOIN[163]
>  # TS[52]-FIL[104]-SEL[54]-RS[56]-MAPJOIN[160]
>  #                       -SEL[147]-GBY[148]-EVENT[149]
>  # 
>  # 
>  # DPP information stored in the cache: \{TS[19]=[EVENT[141]], 
> TS[9]=[EVENT[133]], TS[49]=[RS[155], EVENT[149]]}
>  
> The semi join branch in line 3 feeds into TS[49] in line 12, which feeds 
> MAPJOIN[163], going back to the parent of the semi join branch at line 2, 
> forming a cycle.
> The cycle-detection logic may fail because there is a MAPJOIN[160] at line 12, 
> which can cause it to look for the wrong TS. The logic that finds TS operators 
> upstream must use findOperatorsUpstream() and examine each TS Op for complete 
> coverage.
>  
> cc [~jcamachorodriguez]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20252) Semijoin Reduction : Cycles due to semi join branch may remain undetected if small table side has a map join upstream.

2018-07-26 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20252:
--
Status: Patch Available  (was: In Progress)

> Semijoin Reduction : Cycles due to semi join branch may remain undetected if 
> small table side has a map join upstream.
> --
>
> Key: HIVE-20252
> URL: https://issues.apache.org/jira/browse/HIVE-20252
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> For eg,
>  
>  # 2018-07-26T17:22:14,664 DEBUG [51377701-dc98-424f-82e0-bbb5d6c84316 main] 
> optimizer.SharedWorkOptimizer: Before SharedWorkOptimizer:
>  # 
> TS[0]-FIL[96]-SEL[2]-MAPJOIN[156]-MAPJOIN[157]-MAPJOIN[161]-MAPJOIN[162]-FIL[47]-SEL[48]-MAPJOIN[163]-FIL[66]-SEL[67]-TNK[105]-GBY[68]-RS[69]-GBY[70]-SEL[71]-RS[72]-SEL[73]-LIM[74]-FS[75]
>  #                                                           
> -SEL[142]-GBY[143]-RS[144]-GBY[145]-RS[155]
>  # TS[3]-FIL[97]-SEL[5]-RS[34]-MAPJOIN[156]
>  # TS[6]-FIL[98]-SEL[8]-RS[37]-MAPJOIN[157]
>  # TS[9]-FIL[99]-SEL[11]-MAPJOIN[158]-GBY[40]-RS[42]-MAPJOIN[161]
>  # TS[12]-FIL[100]-SEL[14]-RS[16]-MAPJOIN[158]
>  #                       -SEL[131]-GBY[132]-EVENT[133]
>  # 
> TS[19]-FIL[101]-SEL[21]-MAPJOIN[159]-GBY[29]-RS[30]-GBY[31]-SEL[32]-RS[45]-MAPJOIN[162]
>  # TS[22]-FIL[102]-SEL[24]-RS[26]-MAPJOIN[159]
>  #                       -SEL[139]-GBY[140]-EVENT[141]
>  # 
> TS[49]-FIL[103]-SEL[51]-MAPJOIN[160]-GBY[59]-RS[60]-GBY[61]-SEL[62]-RS[64]-MAPJOIN[163]
>  # TS[52]-FIL[104]-SEL[54]-RS[56]-MAPJOIN[160]
>  #                       -SEL[147]-GBY[148]-EVENT[149]
>  # 
>  # 
>  # DPP information stored in the cache: \{TS[19]=[EVENT[141]], 
> TS[9]=[EVENT[133]], TS[49]=[RS[155], EVENT[149]]}
>  
> The semi join branch in line 3 feeds into TS[49] in line 12, which feeds 
> MAPJOIN[163], going back to the parent of the semi join branch at line 2, 
> forming a cycle.
> The cycle-detection logic may fail because there is a MAPJOIN[160] at line 12, 
> which can cause it to look for the wrong TS. The logic that finds TS operators 
> upstream must use findOperatorsUpstream() and examine each TS Op for complete 
> coverage.
>  
> cc [~jcamachorodriguez]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20252) Semijoin Reduction : Cycles due to semi join branch may remain undetected if small table side has a map join upstream.

2018-07-26 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-20252:
-


> Semijoin Reduction : Cycles due to semi join branch may remain undetected if 
> small table side has a map join upstream.
> --
>
> Key: HIVE-20252
> URL: https://issues.apache.org/jira/browse/HIVE-20252
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> For eg,
>  
>  # 2018-07-26T17:22:14,664 DEBUG [51377701-dc98-424f-82e0-bbb5d6c84316 main] 
> optimizer.SharedWorkOptimizer: Before SharedWorkOptimizer:
>  # 
> TS[0]-FIL[96]-SEL[2]-MAPJOIN[156]-MAPJOIN[157]-MAPJOIN[161]-MAPJOIN[162]-FIL[47]-SEL[48]-MAPJOIN[163]-FIL[66]-SEL[67]-TNK[105]-GBY[68]-RS[69]-GBY[70]-SEL[71]-RS[72]-SEL[73]-LIM[74]-FS[75]
>  #                                                           
> -SEL[142]-GBY[143]-RS[144]-GBY[145]-RS[155]
>  # TS[3]-FIL[97]-SEL[5]-RS[34]-MAPJOIN[156]
>  # TS[6]-FIL[98]-SEL[8]-RS[37]-MAPJOIN[157]
>  # TS[9]-FIL[99]-SEL[11]-MAPJOIN[158]-GBY[40]-RS[42]-MAPJOIN[161]
>  # TS[12]-FIL[100]-SEL[14]-RS[16]-MAPJOIN[158]
>  #                       -SEL[131]-GBY[132]-EVENT[133]
>  # 
> TS[19]-FIL[101]-SEL[21]-MAPJOIN[159]-GBY[29]-RS[30]-GBY[31]-SEL[32]-RS[45]-MAPJOIN[162]
>  # TS[22]-FIL[102]-SEL[24]-RS[26]-MAPJOIN[159]
>  #                       -SEL[139]-GBY[140]-EVENT[141]
>  # 
> TS[49]-FIL[103]-SEL[51]-MAPJOIN[160]-GBY[59]-RS[60]-GBY[61]-SEL[62]-RS[64]-MAPJOIN[163]
>  # TS[52]-FIL[104]-SEL[54]-RS[56]-MAPJOIN[160]
>  #                       -SEL[147]-GBY[148]-EVENT[149]
>  # 
>  # 
>  # DPP information stored in the cache: \{TS[19]=[EVENT[141]], 
> TS[9]=[EVENT[133]], TS[49]=[RS[155], EVENT[149]]}
>  
> The semi join branch in line 3 feeds into TS[49] in line 12, which feeds 
> MAPJOIN[163], going back to the parent of the semi join branch at line 2, 
> forming a cycle.
> The cycle-detection logic may fail because there is a MAPJOIN[160] at line 12, 
> which can cause it to look for the wrong TS. The logic that finds TS operators 
> upstream must use findOperatorsUpstream() and examine each TS Op for complete 
> coverage.
>  
> cc [~jcamachorodriguez]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HIVE-20252) Semijoin Reduction : Cycles due to semi join branch may remain undetected if small table side has a map join upstream.

2018-07-26 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-20252 started by Deepak Jaiswal.
-
> Semijoin Reduction : Cycles due to semi join branch may remain undetected if 
> small table side has a map join upstream.
> --
>
> Key: HIVE-20252
> URL: https://issues.apache.org/jira/browse/HIVE-20252
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> For eg,
>  
>  # 2018-07-26T17:22:14,664 DEBUG [51377701-dc98-424f-82e0-bbb5d6c84316 main] 
> optimizer.SharedWorkOptimizer: Before SharedWorkOptimizer:
>  # 
> TS[0]-FIL[96]-SEL[2]-MAPJOIN[156]-MAPJOIN[157]-MAPJOIN[161]-MAPJOIN[162]-FIL[47]-SEL[48]-MAPJOIN[163]-FIL[66]-SEL[67]-TNK[105]-GBY[68]-RS[69]-GBY[70]-SEL[71]-RS[72]-SEL[73]-LIM[74]-FS[75]
>  #                                                           
> -SEL[142]-GBY[143]-RS[144]-GBY[145]-RS[155]
>  # TS[3]-FIL[97]-SEL[5]-RS[34]-MAPJOIN[156]
>  # TS[6]-FIL[98]-SEL[8]-RS[37]-MAPJOIN[157]
>  # TS[9]-FIL[99]-SEL[11]-MAPJOIN[158]-GBY[40]-RS[42]-MAPJOIN[161]
>  # TS[12]-FIL[100]-SEL[14]-RS[16]-MAPJOIN[158]
>  #                       -SEL[131]-GBY[132]-EVENT[133]
>  # 
> TS[19]-FIL[101]-SEL[21]-MAPJOIN[159]-GBY[29]-RS[30]-GBY[31]-SEL[32]-RS[45]-MAPJOIN[162]
>  # TS[22]-FIL[102]-SEL[24]-RS[26]-MAPJOIN[159]
>  #                       -SEL[139]-GBY[140]-EVENT[141]
>  # 
> TS[49]-FIL[103]-SEL[51]-MAPJOIN[160]-GBY[59]-RS[60]-GBY[61]-SEL[62]-RS[64]-MAPJOIN[163]
>  # TS[52]-FIL[104]-SEL[54]-RS[56]-MAPJOIN[160]
>  #                       -SEL[147]-GBY[148]-EVENT[149]
>  # 
>  # 
>  # DPP information stored in the cache: \{TS[19]=[EVENT[141]], 
> TS[9]=[EVENT[133]], TS[49]=[RS[155], EVENT[149]]}
>  
> The semi join branch in line 3 feeds into TS[49] in line 12, which feeds 
> MAPJOIN[163], going back to the parent of the semi join branch at line 2, 
> forming a cycle.
> The cycle-detection logic may fail because there is a MAPJOIN[160] at line 12, 
> which can cause it to look for the wrong TS. The logic that finds TS operators 
> upstream must use findOperatorsUpstream() and examine each TS Op for complete 
> coverage.
>  
> cc [~jcamachorodriguez]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20240) Semijoin Reduction : Use local variable to check for external table condition

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559074#comment-16559074
 ] 

Hive QA commented on HIVE-20240:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 19s{color} | {color:blue} ql in master has 2296 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12885/dev-support/hive-personality.sh |
| git revision | master / 1ad4882 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12885/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Semijoin Reduction : Use local variable to check for external table condition
> -
>
> Key: HIVE-20240
> URL: https://issues.apache.org/jira/browse/HIVE-20240
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20240.1.patch
>
>
> This condition,
>  
> semiJoin = semiJoin && 
> !disableSemiJoinOptDueToExternalTable(parseContext.getConf(), ts, ctx);
>  
> may set semiJoin to false once an external table is encountered, and it will 
> then remain false for all subsequent cases. It should only disable the 
> optimization for that particular case.
>  
> cc [~jdere]
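
For illustration, a minimal sketch of the fix the summary describes, computing 
the decision into a local variable so the shared flag is never clobbered (the 
local's name is illustrative):
{code:java}
// Before (buggy): one external table clears 'semiJoin' for everything after it.
// semiJoin = semiJoin
//     && !disableSemiJoinOptDueToExternalTable(parseContext.getConf(), ts, ctx);

// After (sketch): a per-table-scan decision; 'semiJoin' itself is not mutated.
boolean semiJoinForThisTs = semiJoin
    && !disableSemiJoinOptDueToExternalTable(parseContext.getConf(), ts, ctx);
if (semiJoinForThisTs) {
  // apply semijoin reduction for this table scan only
}
{code}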



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20249) LLAP IO: NPE during refCount decrement

2018-07-26 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20249:
-
Attachment: HIVE-20249.2.patch

> LLAP IO: NPE during refCount decrement
> --
>
> Key: HIVE-20249
> URL: https://issues.apache.org/jira/browse/HIVE-20249
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20249.1.patch, HIVE-20249.2.patch
>
>
> NPE on deallocating buffers
> {code:java}
> Ignoring exception when closing input calls(cleanup). Exception 
> class=java.lang.NullPointerException
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
> ~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20249) LLAP IO: NPE during refCount decrement

2018-07-26 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559066#comment-16559066
 ] 

Sergey Shelukhin commented on HIVE-20249:
-

Hmm.. I'm not sure lockedBufs logging should be added to iomem by default. It 
can have 1000s of buffers when queries are running. Looks good otherwise.

> LLAP IO: NPE during refCount decrement
> --
>
> Key: HIVE-20249
> URL: https://issues.apache.org/jira/browse/HIVE-20249
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20249.1.patch
>
>
> NPE on deallocating buffers
> {code:java}
> Ignoring exception when closing input calls(cleanup). Exception 
> class=java.lang.NullPointerException
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
> ~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20251) Improve message in SharedWorkOptimizer when cycles are found in the plan

2018-07-26 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-20251:
--

Assignee: (was: Jesus Camacho Rodriguez)

> Improve message in SharedWorkOptimizer when cycles are found in the plan
> 
>
> Key: HIVE-20251
> URL: https://issues.apache.org/jira/browse/HIVE-20251
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, if there is a cycle in the plan (e.g., due to a semijoin branch), 
> which should not happen, SharedWorkOptimizer will just loop infinitely. It 
> would be better to throw an Exception in those cases.
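
A minimal sketch of the suggested behavior, bounding the optimizer's fix-point 
loop (the constant and the mergeEquivalentWork() helper are illustrative, not 
the actual SharedWorkOptimizer internals):
{code:java}
// Fail loudly instead of spinning forever when a cycle keeps retriggering merges.
final int maxIterations = 10000; // illustrative safety bound
int iterations = 0;
boolean merged = true;
while (merged) {
  if (++iterations > maxIterations) {
    throw new SemanticException("SharedWorkOptimizer did not reach a fix point; "
        + "the plan likely contains a cycle (e.g., from a semijoin branch)");
  }
  merged = mergeEquivalentWork(pctx); // stands in for the existing merge pass
}
{code}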



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20251) Improve message in SharedWorkOptimizer when cycles are found in the plan

2018-07-26 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-20251:
--


> Improve message in SharedWorkOptimizer when cycles are found in the plan
> 
>
> Key: HIVE-20251
> URL: https://issues.apache.org/jira/browse/HIVE-20251
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, if there is a cycle in the plan (e.g., due to a semijoin branch), 
> which should not happen, SharedWorkOptimizer will just loop infinitely. It 
> would be better to throw an Exception in those cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20250) Option to allow external tables to use query results cache

2018-07-26 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559062#comment-16559062
 ] 

Jason Dere commented on HIVE-20250:
---

Patch to enable caching of external tables by setting both 
hive.query.results.cache.nontransactional.tables.enabled and 
hive.query.results.cache.external.tables.enabled. 
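
For reference, with the patch applied a session would opt in roughly like this 
(hive.query.results.cache.external.tables.enabled is the new flag introduced 
here; the other two already exist):
{code:sql}
set hive.query.results.cache.enabled=true;
set hive.query.results.cache.nontransactional.tables.enabled=true;
-- new flag from this patch; both table flags must be true for external tables
set hive.query.results.cache.external.tables.enabled=true;
{code}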

> Option to allow external tables to use query results cache
> --
>
> Key: HIVE-20250
> URL: https://issues.apache.org/jira/browse/HIVE-20250
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jason Dere
>Priority: Major
> Attachments: HIVE-20250.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19814) RPC Server port is always random for spark

2018-07-26 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559061#comment-16559061
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-19814:
-

Uploading patch for this issue.

Added back the code that puts the HIVE_SPARK_RSC_CONFIGS from HiveConf into the 
Spark conf in HiveSparkClientFactory.createHiveSparkClient. This part was 
removed in HIVE-18958, but it looks like it is needed for the configs to be 
available to the RpcServer.

Also added SPARK_RPC_SERVER_PORT to HIVE_SPARK_RSC_CONFIGS.
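
A rough sketch of the shape of the change, based on the description above (the 
property strings and surrounding code are illustrative; the real constants live 
in RpcConfiguration/HiveConf):
{code:java}
import com.google.common.collect.ImmutableSet;

// 1. Include the RPC server port in the set of configs propagated to the
//    remote Spark context - it was previously missing, hence the random port.
static final ImmutableSet<String> HIVE_SPARK_RSC_CONFIGS = ImmutableSet.of(
    "hive.spark.client.rpc.server.address",
    "hive.spark.client.rpc.server.port" // SPARK_RPC_SERVER_PORT
    /* ... remaining RSC configs ... */);

// 2. When creating the HiveSparkClient, copy those configs from HiveConf into
//    the Spark conf so the RpcServer can see them.
for (String key : HIVE_SPARK_RSC_CONFIGS) {
  String value = hiveConf.get(key);
  if (value != null) {
    sparkConf.put(key, value);
  }
}
{code}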

> RPC Server port is always random for spark
> --
>
> Key: HIVE-19814
> URL: https://issues.apache.org/jira/browse/HIVE-19814
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.3.0, 3.0.0, 2.4.0, 4.0.0
>Reporter: bounkong khamphousone
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-19814.1.patch
>
>
> The RPC server port is always a random one. In fact, the problem is in 
> RpcConfiguration.HIVE_SPARK_RSC_CONFIGS, which doesn't include 
> SPARK_RPC_SERVER_PORT.
>  
> I found this issue while trying to run hive-on-spark inside docker.
>  
> HIVE_SPARK_RSC_CONFIGS is read by HiveSparkClientFactory.initiateSparkConf 
> > SparkSessionManagerImpl.setup, and the latter calls 
> SparkClientFactory.initialize(conf), which initializes the rpc server. This 
> RpcServer is then used to create the sparkClient, which uses the rpc server 
> port as the --remote-port arg. Since initiateSparkConf ignores 
> SPARK_RPC_SERVER_PORT, it will always be a random port.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20250) Option to allow external tables to use query results cache

2018-07-26 Thread Jason Dere (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-20250:
--
Attachment: HIVE-20250.1.patch

> Option to allow external tables to use query results cache
> --
>
> Key: HIVE-20250
> URL: https://issues.apache.org/jira/browse/HIVE-20250
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jason Dere
>Priority: Major
> Attachments: HIVE-20250.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20248) clean up some TODOs after txn stats merge

2018-07-26 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559059#comment-16559059
 ] 

Eugene Koifman commented on HIVE-20248:
---

+1

> clean up some TODOs after txn stats merge
> -
>
> Key: HIVE-20248
> URL: https://issues.apache.org/jira/browse/HIVE-20248
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20248.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19814) RPC Server port is always random for spark

2018-07-26 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-19814:

Attachment: HIVE-19814.1.patch
Status: Patch Available  (was: Open)

> RPC Server port is always random for spark
> --
>
> Key: HIVE-19814
> URL: https://issues.apache.org/jira/browse/HIVE-19814
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.0.0, 2.3.0, 2.4.0, 4.0.0
>Reporter: bounkong khamphousone
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-19814.1.patch
>
>
> The RPC server port is always a random one. In fact, the problem is in 
> RpcConfiguration.HIVE_SPARK_RSC_CONFIGS, which doesn't include 
> SPARK_RPC_SERVER_PORT.
>  
> I found this issue while trying to run hive-on-spark inside docker.
>  
> HIVE_SPARK_RSC_CONFIGS is read by HiveSparkClientFactory.initiateSparkConf 
> > SparkSessionManagerImpl.setup, and the latter calls 
> SparkClientFactory.initialize(conf), which initializes the rpc server. This 
> RpcServer is then used to create the sparkClient, which uses the rpc server 
> port as the --remote-port arg. Since initiateSparkConf ignores 
> SPARK_RPC_SERVER_PORT, it will always be a random port.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19568) Active/Passive HS2 HA: Disallow direct connection to passive HS2 instance

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559056#comment-16559056
 ] 

Hive QA commented on HIVE-19568:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933252/HIVE-19568.04.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14813 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12884/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12884/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12884/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933252 - PreCommit-HIVE-Build

> Active/Passive HS2 HA: Disallow direct connection to passive HS2 instance
> -
>
> Key: HIVE-19568
> URL: https://issues.apache.org/jira/browse/HIVE-19568
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19568.01.patch, HIVE-19568.02.patch, 
> HIVE-19568.03.patch, HIVE-19568.04.patch, HIVE-19568.patch
>
>
> The recommended way for clients to connect to HS2 in an Active/Passive HA 
> configuration is via the ZK service discovery URL. But some applications do 
> not support ZK service discovery, in which case they use a direct URL to 
> connect to an HS2 instance. If the direct connection is to a passive HS2 
> instance, the connection should be dropped with a proper error message.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18929) The method humanReadableInt in HiveStringUtils.java has a race condition.

2018-07-26 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18929:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~asherman]

> The method humanReadableInt in HiveStringUtils.java has a race condition.
> -
>
> Key: HIVE-18929
> URL: https://issues.apache.org/jira/browse/HIVE-18929
> Project: Hive
>  Issue Type: Bug
>  Components: API
>Affects Versions: 2.3.2
>Reporter: Chaiyong Ragkhitwetsagul
>Assignee: Andrew Sherman
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-18929.1.patch
>
>
> I found that the {{humanReadableInt(long number)}} method in the 
> hive/common/src/java/org/apache/hive/common/util/HiveStringUtils.java file 
> contains code which has a race condition as shown in Hadoop (issue tracking 
> ID HADOOP-9252: https://issues.apache.org/jira/browse/HADOOP-9252). The fix 
> can also be seen in the Hadoop code base.
> I couldn't find a call to the method anywhere else in the code, but it might 
> be worth fixing.
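
For context, a minimal illustration of the kind of race HADOOP-9252 describes, 
plus one conventional fix (field names are illustrative): 
java.text.DecimalFormat keeps internal mutable state and is not thread-safe, so 
a shared static instance can corrupt output under concurrent calls.
{code:java}
import java.text.DecimalFormat;

// Racy: concurrent callers share this formatter's internal buffers.
private static final DecimalFormat ONE_DECIMAL = new DecimalFormat("0.0");

// Safe: one formatter per thread, no shared mutable state.
private static final ThreadLocal<DecimalFormat> ONE_DECIMAL_TL =
    ThreadLocal.withInitial(() -> new DecimalFormat("0.0"));

private static String oneDecimal(double d) {
  return ONE_DECIMAL_TL.get().format(d);
}
{code}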



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20249) LLAP IO: NPE during refCount decrement

2018-07-26 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559052#comment-16559052
 ] 

Prasanth Jayachandran commented on HIVE-20249:
--

[~sershe] can you please take a look?

> LLAP IO: NPE during refCount decrement
> --
>
> Key: HIVE-20249
> URL: https://issues.apache.org/jira/browse/HIVE-20249
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20249.1.patch
>
>
> NPE on deallocating buffers
> {code:java}
> Ignoring exception when closing input calls(cleanup). Exception 
> class=java.lang.NullPointerException
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
> ~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20249) LLAP IO: NPE during refCount decrement

2018-07-26 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20249:
-
Description: 
NPE on deallocating buffers
{code:java}
Ignoring exception when closing input calls(cleanup). Exception 
class=java.lang.NullPointerException

java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0_112]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0_112]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]{code}

  was:
This was observed on one of the old builds, which was swallowing the exception 
root cause.
{code:java}
Ignoring exception when closing input calls(cleanup). Exception 
class=java.lang.NullPointerException

java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
 ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
 

[jira] [Updated] (HIVE-20249) LLAP IO: NPE during refCount decrement

2018-07-26 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20249:
-
Status: Patch Available  (was: Open)

> LLAP IO: NPE during refCount decrement
> --
>
> Key: HIVE-20249
> URL: https://issues.apache.org/jira/browse/HIVE-20249
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20249.1.patch
>
>
> This was observed on one of the old builds, which was swallowing the exception 
> root cause.
> {code:java}
> Ignoring exception when closing input calls(cleanup). Exception 
> class=java.lang.NullPointerException
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
> ~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20249) LLAP IO: NPE during refCount decrement

2018-07-26 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20249:
-
Attachment: HIVE-20249.1.patch

> LLAP IO: NPE during refCount decrement
> --
>
> Key: HIVE-20249
> URL: https://issues.apache.org/jira/browse/HIVE-20249
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20249.1.patch
>
>
> This was observed on one of the old builds, which was swallowing the exception 
> root cause.
> {code:java}
> Ignoring exception when closing input calls(cleanup). Exception 
> class=java.lang.NullPointerException
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
> ~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20249) LLAP IO: NPE during refCount decrement

2018-07-26 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20249:



> LLAP IO: NPE during refCount decrement
> --
>
> Key: HIVE-20249
> URL: https://issues.apache.org/jira/browse/HIVE-20249
> Project: Hive
>  Issue Type: New Feature
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> This was observed on one of the old builds, which was swallowing the exception 
> root cause.
> {code:java}
> Ignoring exception when closing input calls(cleanup). Exception 
> class=java.lang.NullPointerException
> java.lang.NullPointerException: null
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.deallocate(BuddyAllocator.java:1355)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.deallocate(BuddyAllocator.java:685)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.releaseInitialRefcounts(EncodedReaderImpl.java:676)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:543)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:404)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:263)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:260)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:109)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
> ~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  ~[hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[?:1.8.0_112]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20248) clean up some TODOs after txn stats merge

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20248:

Attachment: HIVE-20248.patch

> clean up some TODOs after txn stats merge
> -
>
> Key: HIVE-20248
> URL: https://issues.apache.org/jira/browse/HIVE-20248
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20248.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20248) clean up some TODOs after txn stats merge

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20248:

Status: Patch Available  (was: Open)

[~ekoifman] a trivial patch; can you take a look?

> clean up some TODOs after txn stats merge
> -
>
> Key: HIVE-20248
> URL: https://issues.apache.org/jira/browse/HIVE-20248
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20248.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20248) clean up some TODOs after txn stats merge

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-20248:
---


> clean up some TODOs after txn stats merge
> -
>
> Key: HIVE-20248
> URL: https://issues.apache.org/jira/browse/HIVE-20248
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20240) Semijoin Reduction : Use local variable to check for external table condition

2018-07-26 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559036#comment-16559036
 ] 

Jason Dere commented on HIVE-20240:
---

+1

> Semijoin Reduction : Use local variable to check for external table condition
> -
>
> Key: HIVE-20240
> URL: https://issues.apache.org/jira/browse/HIVE-20240
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20240.1.patch
>
>
> This condition,
>  
> semiJoin = semiJoin && 
> !disableSemiJoinOptDueToExternalTable(parseContext.getConf(), ts, ctx);
>  
> may set semiJoin to false once an external table is encountered, and it will 
> then remain false for all subsequent cases. It should only disable the 
> optimization for that particular case.
>  
> cc [~jdere]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-14162) Allow disabling of long running job on Hive On Spark On YARN

2018-07-26 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-14162:

Attachment: HIVE-14162.4.patch

> Allow disabling of long running job on Hive On Spark On YARN
> 
>
> Key: HIVE-14162
> URL: https://issues.apache.org/jira/browse/HIVE-14162
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Thomas Scott
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-14162.1.patch, HIVE-14162.2.patch, 
> HIVE-14162.3.patch, HIVE-14162.4.patch
>
>
> Hive On Spark launches a long-running process on the first query to handle 
> all queries for that user session. In some use cases this is not desired, for 
> instance when using Hue with large intervals between query executions.
> Could we have a property that would cause long-running spark jobs to be 
> terminated after each query execution and started again for the next one?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20247:

Description: 
LLAP IO creates unallocated buffer objects inside the read-related data 
structures, then allocates them in bulk, then decompresses into them and 
increfs them.
If the allocate or decompress step fails, it's hard for the higher-level cleanup 
to tell what state the buffers in the read-related structures are in - they may 
be unallocated, allocated but not incref-ed, or incref-ed.
Some cleanup paths only deal with the latter case, resulting in bugs.
Moreover, the allocator currently returns partial results on such an error. The 
allocation should be all-or-nothing.

This only happens on one path; others allocate and use buffers in a single 
place.
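
A minimal sketch of the all-or-nothing contract (allocateOne()/deallocate() 
stand in for the real BuddyAllocator internals):
{code:java}
// Either every buffer in the batch ends up allocated, or none do - callers
// never see a half-allocated array after a cache OOM.
void allocateMultiple(MemoryBuffer[] dest) throws AllocatorOutOfMemoryException {
  int allocated = 0;
  try {
    for (; allocated < dest.length; ++allocated) {
      allocateOne(dest[allocated]);
    }
  } catch (AllocatorOutOfMemoryException e) {
    for (int i = 0; i < allocated; ++i) {
      deallocate(dest[i]); // roll back the partial result
    }
    throw e; // higher-level cleanup now sees a consistent state
  }
}
{code}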


  was:
LLAP IO creates unallocated buffer objects inside the read-related data 
structures, then allocates them in bulk, then decompresses into them and 
increfs them.
If allocate or decompress steps fail, it's hard for the higher-level cleanup to 
tell what the state of the buffers in the read-related structures is - they may 
be unallocated, allocated but not incref-ed, or incref-ed.
Some cleanup paths only deal with the latter case, resulting in bugs.
Moreover, the allocator currently returns partial results on such an error. The 
allocation should be all-or-nothing.

This only happens on one path; others allocate and use buffers in a single 
place.



> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20247.patch
>
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If allocate or decompress steps fail, it's hard for the higher-level cleanup 
> to tell what the state of the buffers in the read-related structures is - 
> they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the latter case, resulting in bugs.
> Moreover, the allocator currently returns partial results on such an error. 
> The allocation should be all-or-nothing.
> This only happens on one path; others allocate and use buffers in a single 
> place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559021#comment-16559021
 ] 

Sergey Shelukhin commented on HIVE-20247:
-

[~prasanth_j] can you take a look?

> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20247.patch
>
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If allocate or decompress steps fail, it's hard for the higher-level cleanup 
> to tell what the state of the buffers in the read-related structures is - 
> they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the latter case, resulting in bugs.
> Moreover, the allocator currently returns partial results on such an error. The 
> allocation should be all-or-nothing.
> This only happens on one path; other paths allocate and use buffers in a single 
> place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20158) Do Not Print StackTraces to STDERR in Base64TextOutputFormat

2018-07-26 Thread Andrew Sherman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559020#comment-16559020
 ] 

Andrew Sherman commented on HIVE-20158:
---

Thanks [~vihangk1]

> Do Not Print StackTraces to STDERR in Base64TextOutputFormat
> 
>
> Key: HIVE-20158
> URL: https://issues.apache.org/jira/browse/HIVE-20158
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Andrew Sherman
>Priority: Trivial
>  Labels: newbie, noob
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20158.1.patch, HIVE-20158.2.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/base64/Base64TextOutputFormat.java
> {code}
>   try {
> String signatureString = 
> job.get("base64.text.output.format.signature");
> if (signatureString != null) {
>   signature = signatureString.getBytes("UTF-8");
> } else {
>   signature = new byte[0];
> }
>   } catch (UnsupportedEncodingException e) {
> e.printStackTrace();
>   }
> {code}
> The {{UnsupportedEncodingException}} is coming from the {{getBytes}} method 
> call.  Instead, use the {{Charset}} version of the method; it doesn't 
> throw this checked exception, so the 'try' block can simply be removed.  
> Every JVM is required to support UTF-8.
> https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#getBytes(java.nio.charset.Charset)
> https://docs.oracle.com/javase/7/docs/api/java/nio/charset/StandardCharsets.html#UTF_8
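
A minimal sketch of the suggested change, assuming the same {{job}} and {{signature}} variables as the snippet above; the committed patch may differ slightly:

{code:java}
import java.nio.charset.StandardCharsets;

// String.getBytes(Charset) declares no checked exception, so the
// whole try/catch around the UTF-8 lookup can simply be dropped.
String signatureString = job.get("base64.text.output.format.signature");
signature = (signatureString != null)
    ? signatureString.getBytes(StandardCharsets.UTF_8)
    : new byte[0];
{code}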



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20247:

Attachment: (was: HIVE-20247.patch)

> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20247.patch
>
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If allocate or decompress steps fail, it's hard for the higher-level cleanup 
> to tell what the state of the buffers in the read-related structures is - 
> they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the latter case, resulting in bugs.
> Moreover, the allocator currently returns partial results on such an error. The 
> allocation should be all-or-nothing.
> This only happens on one path; other paths allocate and use buffers in a single 
> place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20247:

Status: Patch Available  (was: Open)

> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20247.patch
>
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If allocate or decompress steps fail, it's hard for the higher-level cleanup 
> to tell what the state of the buffers in the read-related structures is - 
> they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the latter case, resulting in bugs.
> Moreover, the allocator currently returns partial results on such an error. The 
> allocation should be all-or-nothing.
> This only happens on one path; other paths allocate and use buffers in a single 
> place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20247:

Attachment: HIVE-20247.patch

> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20247.patch
>
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If allocate or decompress steps fail, it's hard for the higher-level cleanup 
> to tell what the state of the buffers in the read-related structures is - 
> they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the latter case, resulting in bugs.
> Moreover, the allocator currently returns partial results on such an error. The 
> allocation should be all-or-nothing.
> This only happens on one path; other paths allocate and use buffers in a single 
> place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20247:

Attachment: HIVE-20247.patch

> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20247.patch
>
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If allocate or decompress steps fail, it's hard for the higher-level cleanup 
> to tell what the state of the buffers in the read-related structures is - 
> they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the latter case, resulting in bugs.
> Moreover, currently allocator returns partial results on such error. The 
> allocation should be all-or-nothing.
> This only happens on one paths, others allocate and use buffers in a single 
> place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-20247:
---


> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If allocate or decompress steps fail, it's hard for the higher-level cleanup 
> to tell what the state of the buffers in the read-related structures is - 
> they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the latter case, resulting in bugs.
> This only happens on one path; other paths allocate and use buffers in a single 
> place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20247) cleanup issues in LLAP IO after cache OOM

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20247:

Description: 
LLAP IO creates unallocated buffer objects inside the read-related data 
structures, then allocates them in bulk, then decompresses into them and 
increfs them.
If allocate or decompress steps fail, it's hard for the higher-level cleanup to 
tell what the state of the buffers in the read-related structures is - they may 
be unallocated, allocated but not incref-ed, or incref-ed.
Some cleanup paths only deal with the latter case, resulting in bugs.
Moreover, currently allocator returns partial results on such error. The 
allocation should be all-or-nothing.

This only happens on one paths, others allocate and use buffers in a single 
place.


  was:
LLAP IO creates unallocated buffer objects inside the read-related data 
structures, then allocates them in bulk, then decompresses into them and 
increfs them.
If allocate or decompress steps fail, it's hard for the higher-level cleanup to 
tell what the state of the buffers in the read-related structures is - they may 
be unallocated, allocated but not incref-ed, or incref-ed.
Some cleanup paths only deal with the latter case, resulting in bugs.

This only happens on one paths, others allocate and use buffers in a single 
place.



> cleanup issues in LLAP IO after cache OOM
> -
>
> Key: HIVE-20247
> URL: https://issues.apache.org/jira/browse/HIVE-20247
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
>Priority: Major
>
> LLAP IO creates unallocated buffer objects inside the read-related data 
> structures, then allocates them in bulk, then decompresses into them and 
> increfs them.
> If allocate or decompress steps fail, it's hard for the higher-level cleanup 
> to tell what the state of the buffers in the read-related structures is - 
> they may be unallocated, allocated but not incref-ed, or incref-ed.
> Some cleanup paths only deal with the latter case, resulting in bugs.
> Moreover, the allocator currently returns partial results on such an error. The 
> allocation should be all-or-nothing.
> This only happens on one path; other paths allocate and use buffers in a single 
> place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20169) Print Final Rows Processed in MapOperator

2018-07-26 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559010#comment-16559010
 ] 

Vihang Karajgaonkar commented on HIVE-20169:


+1

> Print Final Rows Processed in MapOperator
> -
>
> Key: HIVE-20169
> URL: https://issues.apache.org/jira/browse/HIVE-20169
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20169.1.patch, HIVE-20169.2.patch, 
> HIVE-20169.3.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java#L573-L582
> This class emits a log message every time a certain number of records has been 
> processed, but it does not print a final count.
> Override the {{MapOperator}} class's {{closeOp}} method to print a final log 
> message providing the total number of rows read by this mapper.
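
A hedged sketch of that override; the field and logger names follow the snippet linked above and are assumptions, not the committed patch:

{code:java}
// Emit one final record count when the map operator closes, so the
// periodic "records read" messages are capped off with a total.
@Override
protected void closeOp(boolean abort) throws HiveException {
  LOG.info("{}: Total records read - {}. abort - {}", this, numRows, abort);
  super.closeOp(abort);
}
{code}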



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20213) Upgrade Calcite to 1.17.0

2018-07-26 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559008#comment-16559008
 ] 

Jesus Camacho Rodriguez commented on HIVE-20213:


[~ashutoshc], could you take a look? I can upload the part of the patch that 
corresponds to HIVE-20213 to RB if you prefer. Thanks

> Upgrade Calcite to 1.17.0
> -
>
> Key: HIVE-20213
> URL: https://issues.apache.org/jira/browse/HIVE-20213
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20213.01.patch, HIVE-20213.02.patch, 
> HIVE-20213.03.patch, HIVE-20213.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20168) ReduceSinkOperator Logging Hidden

2018-07-26 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559007#comment-16559007
 ] 

Vihang Karajgaonkar commented on HIVE-20168:


+1

> ReduceSinkOperator Logging Hidden
> -
>
> Key: HIVE-20168
> URL: https://issues.apache.org/jira/browse/HIVE-20168
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20168.1.patch, HIVE-20168.2.patch, 
> HIVE-20168.3.patch
>
>
> [https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java]
>  
> {code:java}
> if (LOG.isTraceEnabled()) {
>   if (numRows == cntr) {
> cntr = logEveryNRows == 0 ? cntr * 10 : numRows + logEveryNRows;
> if (cntr < 0 || numRows < 0) {
>   cntr = 0;
>   numRows = 1;
> }
> LOG.info(toString() + ": records written - " + numRows);
>   }
> }
> ...
> if (LOG.isTraceEnabled()) {
>   LOG.info(toString() + ": records written - " + numRows);
> }
> {code}
> There are logging guards here checking for the TRACE level, but the 
> logging itself is at INFO.  This is important logging for detecting data skew. 
>  Please change the guards to check for INFO... or, better, remove the guards 
> altogether, since it's very rare that a service runs with 
> only WARN-level logging.
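
For clarity, a sketch of the first suggested fix, keeping the original logging style; the guard level is simply aligned with the call it protects:

{code:java}
// Guard level now matches the log level, so the skew-detection
// message actually appears under a default INFO configuration.
if (LOG.isInfoEnabled()) {
  LOG.info(toString() + ": records written - " + numRows);
}
{code}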



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20213) Upgrade Calcite to 1.17.0

2018-07-26 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20213:
---
Attachment: HIVE-20213.03.patch

> Upgrade Calcite to 1.17.0
> -
>
> Key: HIVE-20213
> URL: https://issues.apache.org/jira/browse/HIVE-20213
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20213.01.patch, HIVE-20213.02.patch, 
> HIVE-20213.03.patch, HIVE-20213.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20213) Upgrade Calcite to 1.17.0

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559001#comment-16559001
 ] 

Hive QA commented on HIVE-20213:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933245/HIVE-20213.02.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14813 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout (batchId=316)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12883/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12883/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12883/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933245 - PreCommit-HIVE-Build

> Upgrade Calcite to 1.17.0
> -
>
> Key: HIVE-20213
> URL: https://issues.apache.org/jira/browse/HIVE-20213
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20213.01.patch, HIVE-20213.02.patch, 
> HIVE-20213.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20246) Make some collect stats flags be user configurable

2018-07-26 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan reassigned HIVE-20246:



> Make some collect stats flags be user configurable
> --
>
> Key: HIVE-20246
> URL: https://issues.apache.org/jira/browse/HIVE-20246
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Alice Fan
>Assignee: Alice Fan
>Priority: Minor
> Fix For: 4.0.0
>
>
> When hive.stats.autogather=true, the Metastore lists all files under the 
> table directory to populate basic stats like file counts and sizes. This file 
> listing operation can be very expensive, particularly on filesystems like S3.
> One way to address this issue is to reconfigure hive.stats.autogather=false.
> However, set metaconf:hive.stats.autogather=false will not be honored by the 
> HiveMetaStore when a user sets it in a session.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20174) Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation Functions

2018-07-26 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558991#comment-16558991
 ] 

Vihang Karajgaonkar commented on HIVE-20174:


I see. The randomRowSource tests have been really great at flushing out these 
bugs. It is generally much harder to write queries that exercise such corner cases.

> Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation 
> Functions
> 
>
> Key: HIVE-20174
> URL: https://issues.apache.org/jira/browse/HIVE-20174
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 4.0.0
>
> Attachments: HIVE-20174.01.patch, HIVE-20174.02.patch, 
> HIVE-20174.03.patch, HIVE-20174.04.patch, HIVE-20174.05.patch
>
>
> Write new unit tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results in vectorized aggregation functions.
>  
> BUGs found:
> 1) AVG/VARIANCE (family) in PARTIAL1 mode was returning NULL instead of count 
> = 0, sum = 0 (All data types).  For AVG DECIMAL, only return NULL if there 
> was an overflow.
> 2) AVG/MIN/MAX was not detecting repeated NULL correctly for the TIMESTAMP, 
> INTERVAL_DAY_TIME, and String Family.  Eliminated redundant code.
> 3) Fix incorrect calculation  for VARIANCE (family) in PARTIAL2 and FINAL 
> modes (HIVE-18758).
> 4) Fix row-mode AVG DECIMAL to enforce output type precision and scale in 
> COMPLETE and FINAL modes.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20158) Do Not Print StackTraces to STDERR in Base64TextOutputFormat

2018-07-26 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-20158:
---
   Resolution: Fixed
Fix Version/s: 3.2.0
   4.0.0
   Status: Resolved  (was: Patch Available)

Patch merged into branch-3 and master. Thanks for your contribution [~asherman]

> Do Not Print StackTraces to STDERR in Base64TextOutputFormat
> 
>
> Key: HIVE-20158
> URL: https://issues.apache.org/jira/browse/HIVE-20158
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Andrew Sherman
>Priority: Trivial
>  Labels: newbie, noob
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-20158.1.patch, HIVE-20158.2.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/base64/Base64TextOutputFormat.java
> {code}
>   try {
> String signatureString = 
> job.get("base64.text.output.format.signature");
> if (signatureString != null) {
>   signature = signatureString.getBytes("UTF-8");
> } else {
>   signature = new byte[0];
> }
>   } catch (UnsupportedEncodingException e) {
> e.printStackTrace();
>   }
> {code}
> The {{UnsupportedEncodingException}} is coming from the {{getBytes}} method 
> call.  Instead, use the {{Charset}} version of the method; it doesn't 
> throw this checked exception, so the 'try' block can simply be removed.  
> Every JVM is required to support UTF-8.
> https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#getBytes(java.nio.charset.Charset)
> https://docs.oracle.com/javase/7/docs/api/java/nio/charset/StandardCharsets.html#UTF_8



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20174) Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation Functions

2018-07-26 Thread Matt McCline (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558956#comment-16558956
 ] 

Matt McCline commented on HIVE-20174:
-

The new Aggregation Unit Tests drive a set of random rows through row-mode 
GenericUDAF*Evaluator* classes and through the vector-mode VectorUDAF* classes. 
 The random rows are fixed up to make sure interesting batches are created with 
repeating values and repeating NULLs (I think I mentioned this in an earlier 
JIRA).

So, I haven't been looking at formulating queries.  I have found driving random 
data against all data types and all aggregation functions to be so much more 
fruitful than trying to write queries.  I discovered to my surprise that some 
of the VectorUDAF* classes were maintaining an isNull flag and using it to output 
NULLs when the GenericUDAF*Evaluator* classes were not doing that.
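
A rough sketch of the comparison loop described above; {{runRowMode}}, {{runVectorMode}}, {{randomBatches}}, and the evaluator variables are illustrative stand-ins, not the real test helpers:

{code:java}
// Drive identical random batches (including isRepeating and repeating-NULL
// ones) through both execution modes and demand identical aggregates.
for (VectorizedRowBatch batch : randomBatches) {
  Object rowResult    = runRowMode(rowModeEvaluator, batch);    // GenericUDAF*Evaluator* path
  Object vectorResult = runVectorMode(vectorAggregate, batch);  // VectorUDAF* path
  assertEquals("row-mode vs vector-mode mismatch", rowResult, vectorResult);
}
{code}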

> Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation 
> Functions
> 
>
> Key: HIVE-20174
> URL: https://issues.apache.org/jira/browse/HIVE-20174
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 4.0.0
>
> Attachments: HIVE-20174.01.patch, HIVE-20174.02.patch, 
> HIVE-20174.03.patch, HIVE-20174.04.patch, HIVE-20174.05.patch
>
>
> Write new unit tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results in vectorized aggregation functions.
>  
> BUGs found:
> 1) AVG/VARIANCE (family) in PARTIAL1 mode was returning NULL instead of count 
> = 0, sum = 0 (All data types).  For AVG DECIMAL, only return NULL if there 
> was an overflow.
> 2) AVG/MIN/MAX was not detecting repeated NULL correctly for the TIMESTAMP, 
> INTERVAL_DAY_TIME, and String Family.  Eliminated redundant code.
> 3) Fix incorrect calculation  for VARIANCE (family) in PARTIAL2 and FINAL 
> modes (HIVE-18758).
> 4) Fix row-mode AVG DECIMAL to enforce output type precision and scale in 
> COMPLETE and FINAL modes.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-26 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558917#comment-16558917
 ] 

Gopal V commented on HIVE-20153:


LGTM - +1 tests pending.

This extra field is still taking up meaningful amounts of memory for the 
objects in the heap. 

From JOL:

{code}
***** 64-bit VM: *****
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFSum$GenericUDAFSumEvaluator$SumAgg object internals:
 OFFSET  SIZE               TYPE  DESCRIPTION              VALUE
      0    16                     (object header)          N/A
     16     1            boolean  SumAgg.empty             N/A
     17     7                     (alignment/padding gap)
     24     8   java.lang.Object  SumAgg.sum               N/A
     32     8  java.util.HashSet  SumAgg.uniqueObjects     N/A
Instance size: 40 bytes
Space losses: 7 bytes internal + 0 bytes external = 7 bytes total
...
***** 64-bit VM, compressed references enabled: *****
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFSum$GenericUDAFSumEvaluator$SumAgg object internals:
 OFFSET  SIZE               TYPE  DESCRIPTION              VALUE
      0    12                     (object header)          N/A
     12     1            boolean  SumAgg.empty             N/A
     13     3                     (alignment/padding gap)
     16     4   java.lang.Object  SumAgg.sum               N/A
     20     4  java.util.HashSet  SumAgg.uniqueObjects     N/A
Instance size: 24 bytes
Space losses: 3 bytes internal + 0 bytes external = 3 bytes total
{code}

A PTF-specific sub-class would remove that part; let me also think of a way of 
having a SumAggEmpty class (the "which class is it" information goes into the 
12-byte object header).
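
A hedged sketch of that idea; {{WindowingSumAgg}} is a hypothetical name, and the layout numbers come from the JOL dump above:

{code:java}
// Plain GROUP BY aggregations get a lean buffer with no uniqueObjects slot;
// only the windowing/PTF path pays for the extra HashSet reference.
static class SumAgg extends GenericUDAFEvaluator.AbstractAggregationBuffer {
  boolean empty = true;
  Object sum;
}

static class WindowingSumAgg extends SumAgg {  // hypothetical PTF-only subclass
  java.util.HashSet<Object> uniqueObjects;     // allocated lazily, windowing only
}
{code}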

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-20153.1.patch, Screen Shot 2018-07-12 at 6.41.28 
> PM.png
>
>
> While playing with Hive 2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side, where they worked 
> before in Hive 1. 
> In many queries, we have to double the Mapper Memory settings (in our 
> particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it not so easy to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20212) Hiveserver2 in http mode emitting metric default.General.open_connections incorrectly

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558911#comment-16558911
 ] 

Hive QA commented on HIVE-20212:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933246/HIVE-20212.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 14812 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz]
 (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_joins]
 (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_masking]
 (batchId=193)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12882/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12882/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12882/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933246 - PreCommit-HIVE-Build

> Hiveserver2 in http mode emitting metric default.General.open_connections 
> incorrectly
> -
>
> Key: HIVE-20212
> URL: https://issues.apache.org/jira/browse/HIVE-20212
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-20212.01.patch, HIVE-20212.01.patch, 
> HIVE-20212.02.patch, HIVE-20212.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20212) Hiveserver2 in http mode emitting metric default.General.open_connections incorrectly

2018-07-26 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558908#comment-16558908
 ] 

Hive QA commented on HIVE-20212:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 51m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
40s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12882/dev-support/hive-personality.sh
 |
| git revision | master / 2d097dc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: service U: service |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12882/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Hiveserver2 in http mode emitting metric default.General.open_connections 
> incorrectly
> -
>
> Key: HIVE-20212
> URL: https://issues.apache.org/jira/browse/HIVE-20212
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-20212.01.patch, HIVE-20212.01.patch, 
> HIVE-20212.02.patch, HIVE-20212.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20174) Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation Functions

2018-07-26 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558890#comment-16558890
 ] 

Vihang Karajgaonkar commented on HIVE-20174:


Hi [~mmccline], thanks for reporting and fixing this issue. Can you provide an 
example query for which the groupby operator mode would be partial1, so we can 
reproduce this problem? I tried a few queries, but the mode always seems to be 
{{hash}}, and the results of vectorized vs. non-vectorized executions matched.

> Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation 
> Functions
> 
>
> Key: HIVE-20174
> URL: https://issues.apache.org/jira/browse/HIVE-20174
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 4.0.0
>
> Attachments: HIVE-20174.01.patch, HIVE-20174.02.patch, 
> HIVE-20174.03.patch, HIVE-20174.04.patch, HIVE-20174.05.patch
>
>
> Write new unit tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results in vectorized aggregation functions.
>  
> BUGs found:
> 1) AVG/VARIANCE (family) in PARTIAL1 mode was returning NULL instead of count 
> = 0, sum = 0 (All data types).  For AVG DECIMAL, only return NULL if there 
> was an overflow.
> 2) AVG/MIN/MAX was not detecting repeated NULL correctly for the TIMESTAMP, 
> INTERVAL_DAY_TIME, and String Family.  Eliminated redundant code.
> 3) Fix incorrect calculation  for VARIANCE (family) in PARTIAL2 and FINAL 
> modes (HIVE-18758).
> 4) Fix row-mode AVG DECIMAL to enforce output type precision and scale in 
> COMPLETE and FINAL modes.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-12530) Merge join in mutiple subsquence join and a mapjoin in it in mr model

2018-07-26 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-12530:
---
Description: 
sample hql:
{code:sql}
select  A.state_date, 
   A.customer, 
   A.channel_2,
   A.id,
   A.pid,
   A.type,
   A.pv,
   A.uv,
   A.visits,
   if(C.stay_visits is null,0,C.stay_visits) as stay_visits,
   A.stay_time,
   if(B.bounce is null,0,B.bounce) as bounce
 from
 (select a.state_date, 
a.customer, 
b.url as channel_2,
b.id,
b.pid,
b.type,
count(1) as pv,
count(distinct a.gid) uv,
count(distinct a.session_id) as visits,
sum(a.stay_time) as stay_time
   from   
   ( select state_date, 
   customer, 
   gid,
   session_id,
   ep,
   stay_time
from bdi_fact.mid_pageview_dt0
where l_date ='$v_date'
  )a
  join
  (select l_date as state_date ,
  url,
  id,
  pid,
  type,
  cid
   from bdi_fact.frequency_channel
   where l_date ='$v_date'
   and type ='2'
   and dr='0'
  )b
   on  a.customer=b.cid  
   where a.ep  rlike b.url
   group by a.state_date, a.customer, b.url,b.id,b.pid,b.type
   )A
  
left outer join
   (   select 
   c.state_date ,
   c.customer ,
   d.url as channel_2,
   d.id,
   sum(pagedepth) as bounce
from
  ( select 
  t1.state_date ,
  t1.customer ,
  t1.session_id,
  t1.ep,
  t2.pagedepth
from   
 ( select 
 state_date ,
 customer ,
 session_id,
 exit_url as ep
  from ods.mid_session_enter_exit_dt0
  where l_date ='$v_date'
  )t1
 join
  ( select 
state_date ,
customer ,
session_id,
pagedepth
from ods.mid_session_action_dt0
where l_date ='$v_date'
and  pagedepth='1'
  )t2
 on t1.customer=t2.customer
 and t1.session_id=t2.session_id
   )c
   join
   (select *
   from bdi_fact.frequency_channel
   where l_date ='$v_date'
   and type ='2'
   and dr='0'
   )d
   on c.customer=d.cid
   where c.ep  rlike d.url
   group by  c.state_date,c.customer,d.url,d.id
 )B
 on 
 A.customer=B.customer
 and A.channel_2=B.channel_2 
 and A.id=B.id
  left outer join
 ( 
 select e.state_date, 
e.customer, 
f.url as channel_2,
f.id,
f.pid,
f.type,
count(distinct e.session_id) as stay_visits
   from   
   ( select state_date, 
   customer, 
   gid,
   session_id,
   ep,
   stay_time
from bdi_fact.mid_pageview_dt0
where l_date ='$v_date'
  )e
  join
  (select l_date as state_date,
  url,
  id,
  pid,
  type,
  cid
   from bdi_fact.frequency_channel
   where l_date ='$v_date'
   and type ='2'
   and dr='0'
  )f
   on  e.customer=f.cid  
   where e.ep  rlike f.url
   and e.stay_time is not null
   and e.stay_time <>'0'
   group by e.state_date, e.customer, 

[jira] [Commented] (HIVE-20226) HMS getNextNotification will throw exception when request maxEvents exceed table's max_rows

2018-07-26 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558877#comment-16558877
 ] 

Yongzhi Chen commented on HIVE-20226:
-

Patch 3 looks good. +1

> HMS getNextNotification will throw exception when request maxEvents exceed 
> table's max_rows
> ---
>
> Key: HIVE-20226
> URL: https://issues.apache.org/jira/browse/HIVE-20226
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Alice Fan
>Assignee: Alice Fan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20226.1.patch, HIVE-20226.2.patch, 
> HIVE-20226.3.patch
>
>
> When Sentry calls getNextNotification(), its request max_events will always be 
> Integer.MAX_VALUE.
> ObjectStore's query.setRange(0, maxEvents) will throw an exception when 
> maxEvents exceeds the table's max_rows.
> Example: assuming max_rows = 50,000,000 in a MySQL table:
> java.sql.SQLException: setMaxRows() out of range. 2147483647 > 50000000.
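
A minimal sketch of one way to fix it; {{MAX_EVENTS_BATCH}} is a hypothetical constant, and the committed patch may cap the value differently:

{code:java}
// Clamp the caller's request before it reaches JDO, so a Sentry request
// of Integer.MAX_VALUE can no longer trip the JDBC driver's setMaxRows()
// range check.
int batch = (maxEvents < 0 || maxEvents > MAX_EVENTS_BATCH)
    ? MAX_EVENTS_BATCH : maxEvents;
query.setRange(0, batch);
{code}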



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20239) Do Not Print StackTraces to STDERR in MapJoinProcessor

2018-07-26 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated HIVE-20239:

Status: Patch Available  (was: Open)

> Do Not Print StackTraces to STDERR in MapJoinProcessor
> --
>
> Key: HIVE-20239
> URL: https://issues.apache.org/jira/browse/HIVE-20239
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Fix For: 4.0.0
>
> Attachments: HIVE-20239.1.patch
>
>
> {code:java|title=MapJoinProcessor.java}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new SemanticException("Failed to generate new mapJoin operator " +
>   "by exception : " + e.getMessage());
> }
> {code}
> Please change to... something like...
> {code}
> } catch (Exception e) {
>   throw new SemanticException("Failed to generate new mapJoin operator", 
> e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20239) Do Not Print StackTraces to STDERR in MapJoinProcessor

2018-07-26 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated HIVE-20239:

Attachment: HIVE-20239.1.patch

> Do Not Print StackTraces to STDERR in MapJoinProcessor
> --
>
> Key: HIVE-20239
> URL: https://issues.apache.org/jira/browse/HIVE-20239
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Fix For: 4.0.0
>
> Attachments: HIVE-20239.1.patch
>
>
> {code:java|title=MapJoinProcessor.java}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new SemanticException("Failed to generate new mapJoin operator " +
>   "by exception : " + e.getMessage());
> }
> {code}
> Please change to... something like...
> {code}
> } catch (Exception e) {
>   throw new SemanticException("Failed to generate new mapJoin operator", 
> e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20239) Do Not Print StackTraces to STDERR in MapJoinProcessor

2018-07-26 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated HIVE-20239:

Status: Open  (was: Patch Available)

> Do Not Print StackTraces to STDERR in MapJoinProcessor
> --
>
> Key: HIVE-20239
> URL: https://issues.apache.org/jira/browse/HIVE-20239
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Fix For: 4.0.0
>
> Attachments: HIVE-20239.1.patch
>
>
> {code:java|title=MapJoinProcessor.java}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new SemanticException("Failed to generate new mapJoin operator " +
>   "by exception : " + e.getMessage());
> }
> {code}
> Please change to... something like...
> {code}
> } catch (Exception e) {
>   throw new SemanticException("Failed to generate new mapJoin operator", 
> e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20239) Do Not Print StackTraces to STDERR in MapJoinProcessor

2018-07-26 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated HIVE-20239:

Attachment: (was: HIVE-20239.1.patch)

> Do Not Print StackTraces to STDERR in MapJoinProcessor
> --
>
> Key: HIVE-20239
> URL: https://issues.apache.org/jira/browse/HIVE-20239
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Fix For: 4.0.0
>
> Attachments: HIVE-20239.1.patch
>
>
> {code:java|title=MapJoinProcessor.java}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new SemanticException("Failed to generate new mapJoin operator " +
>   "by exception : " + e.getMessage());
> }
> {code}
> Please change to... something like...
> {code}
> } catch (Exception e) {
>   throw new SemanticException("Failed to generate new mapJoin operator", 
> e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20244) forward port HIVE-19704 to master

2018-07-26 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20244:

Attachment: HIVE-20244.01.patch

> forward port HIVE-19704 to master
> -
>
> Key: HIVE-20244
> URL: https://issues.apache.org/jira/browse/HIVE-20244
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20244.01.patch, HIVE-20244.patch
>
>
> Apparently this logic is still there and can be engaged in some cases, like 
> when one file takes the entire cache from a single large read.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

