[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-09-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=480525&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480525
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 09/Sep/20 00:47
Start Date: 09/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1012:
URL: https://github.com/apache/hive/pull/1012


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 480525)
Time Spent: 1.5h  (was: 1h 20m)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch, 
> HIVE-23443.3.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I think that after HIVE-23210 we are getting a stable sort order, and it is 
> causing pre-emption to not work in certain cases.
> {code:java}
> "attempt_1589167813851__119_01_08_0 
> (hive_20200511055921_89598f09-19f1-4969-ab7a-82e2dd796273-119/Map 1, started 
> at 2020-05-11 05:59:22, in preemption queue, can finish)", 
> "attempt_1589167813851_0008_84_01_08_1 
> (hive_20200511055928_7ae29ca3-e67d-4d1f-b193-05651023b503-84/Map 1, started 
> at 2020-05-11 06:00:23, in preemption queue, can finish)" {code}
> The scheduler only peeks at the head of the pre-emption queue and checks 
> whether that task is non-finishable. 
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java#L420]
> In the above case, all tasks are speculative, but a finishable state change 
> does not trigger re-ordering of the pre-emption queue, so peek() always 
> returns a canFinish task even though non-finishable tasks are in the queue. 
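
The stale-ordering effect described above can be reproduced with a small standalone sketch. It is only an illustration under stated assumptions: it uses java.util.PriorityQueue as a stand-in for the scheduler's pre-emption queue, and the Task class, the comparator and the method names are hypothetical, not Hive's TaskWrapper or TaskExecutorService code.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class PreemptionReorderSketch {
    // Hypothetical stand-in for a task fragment; not Hive's TaskWrapper.
    static final class Task {
        final String name;
        boolean canFinish;

        Task(String name, boolean canFinish) {
            this.name = name;
            this.canFinish = canFinish;
        }
    }

    // Assumed ordering: non-finishable tasks (canFinish == false) sort first,
    // so they are the first candidates for pre-emption.
    static final Comparator<Task> NON_FINISHABLE_FIRST =
        (x, y) -> Boolean.compare(x.canFinish, y.canFinish);

    // Changes the sort key while the task is still in the queue. The heap is
    // never told about the change, so peek() keeps returning the old head.
    static String headAfterInPlaceUpdate() {
        PriorityQueue<Task> queue = new PriorityQueue<>(NON_FINISHABLE_FIRST);
        Task a = new Task("a", true);
        Task b = new Task("b", true);
        queue.add(a);
        queue.add(b);
        b.canFinish = false; // state change without re-ordering
        return queue.peek().name;
    }

    // Removes the task, updates the key, then re-adds it, so the queue
    // places the changed task according to its new state.
    static String headAfterReinsert() {
        PriorityQueue<Task> queue = new PriorityQueue<>(NON_FINISHABLE_FIRST);
        Task a = new Task("a", true);
        Task b = new Task("b", true);
        queue.add(a);
        queue.add(b);
        queue.remove(b);
        b.canFinish = false;
        queue.add(b);
        return queue.peek().name;
    }

    public static void main(String[] args) {
        System.out.println("in-place update, head = " + headAfterInPlaceUpdate());
        System.out.println("remove/update/re-add, head = " + headAfterReinsert());
    }
}
```

In this sketch the in-place update leaves the finishable task "a" at the head, while the remove/update/re-add sequence brings the non-finishable task "b" to the head, which mirrors the re-ordering the patch introduces for non-guaranteed tasks.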



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=476945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476945
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1012:
URL: https://github.com/apache/hive/pull/1012#issuecomment-684124264


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476945)
Time Spent: 1h 20m  (was: 1h 10m)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch, 
> HIVE-23443.3.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=439141&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-439141
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 30/May/20 19:40
Start Date: 30/May/20 19:40
Worklog Time Spent: 10m 
  Work Description: pgaref closed pull request #1013:
URL: https://github.com/apache/hive/pull/1013


   





Issue Time Tracking
---

Worklog Id: (was: 439141)
Time Spent: 1h 10m  (was: 1h)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch, 
> HIVE-23443.3.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=436875&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436875
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 24/May/20 11:24
Start Date: 24/May/20 11:24
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #1012:
URL: https://github.com/apache/hive/pull/1012#discussion_r426288840



##
File path: llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java
##
@@ -884,10 +885,20 @@ private void finishableStateUpdated(TaskWrapper taskWrapper, boolean newFinishab
         taskWrapper.updateCanFinishForPriority(newFinishableState);
         forceReinsertIntoQueue(taskWrapper, isRemoved);
       } else {
-        taskWrapper.updateCanFinishForPriority(newFinishableState);
-        if (!newFinishableState && !taskWrapper.isInPreemptionQueue()) {
-          // No need to check guaranteed here; if it was false we would already be in the queue.
+        // if speculative task, any finishable state change should re-order the queue as speculative tasks are always
+        // not-guaranteed (re-order helps put non-finishable's ahead of finishable)
+        if (!taskWrapper.isGuaranteed()) {
+          removeFromPreemptionQueue(taskWrapper);
+          taskWrapper.updateCanFinishForPriority(newFinishableState);
           addToPreemptionQueue(taskWrapper);
+        } else {
+          // if guaranteed task, if the finishable state changed to non-finishable and if the task doesn't exist
+          // pre-emption queue, then add it so that it becomes candidate to kill
+          taskWrapper.updateCanFinishForPriority(newFinishableState);

Review comment:
   @prasanthj thanks for fixing this! The patch looks good and is now committed!
   At some point I would also change the first comment of the method to clarify 
that a task that is both Guaranteed and Finishable should never be in the 
preemption queue. 
   
https://github.com/apache/hive/pull/1012/files#diff-16658bf15468ecd089c4fd32e75fa8b2R876
   
   I believe there is value in documenting and describing how the scheduler 
works (I have started keeping some notes and am happy to do more work on this). 
I personally find the information in this area of the project very limited.
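
The invariant mentioned in this comment can be written down as a small predicate. This is a minimal sketch only: the class and method names are hypothetical and do not exist in Hive, and the rule encoded is just the one stated above (a task that is both guaranteed and finishable should never be in the preemption queue).

```java
class PreemptionInvariantSketch {
    // Hypothetical helper, not part of Hive: returns true when a task with the
    // given flags is allowed to sit in the pre-emption queue. Only the
    // combination "guaranteed and finishable" is excluded.
    static boolean mayBeInPreemptionQueue(boolean guaranteed, boolean canFinish) {
        return !(guaranteed && canFinish);
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        // Print the full truth table for the two flags.
        for (boolean guaranteed : flags) {
            for (boolean canFinish : flags) {
                System.out.println("guaranteed=" + guaranteed
                    + ", canFinish=" + canFinish
                    + " -> " + mayBeInPreemptionQueue(guaranteed, canFinish));
            }
        }
    }
}
```

A check of this shape could back an assertion or a method-level comment, so that the expected queue membership is stated in one place.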







Issue Time Tracking
---

Worklog Id: (was: 436875)
Time Spent: 1h  (was: 50m)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch, 
> HIVE-23443.3.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=434227&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-434227
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 17/May/20 18:09
Start Date: 17/May/20 18:09
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #1012:
URL: https://github.com/apache/hive/pull/1012#discussion_r426288840



##
File path: llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java
##
@@ -884,10 +885,20 @@ private void finishableStateUpdated(TaskWrapper taskWrapper, boolean newFinishab
         taskWrapper.updateCanFinishForPriority(newFinishableState);
         forceReinsertIntoQueue(taskWrapper, isRemoved);
       } else {
-        taskWrapper.updateCanFinishForPriority(newFinishableState);
-        if (!newFinishableState && !taskWrapper.isInPreemptionQueue()) {
-          // No need to check guaranteed here; if it was false we would already be in the queue.
+        // if speculative task, any finishable state change should re-order the queue as speculative tasks are always
+        // not-guaranteed (re-order helps put non-finishable's ahead of finishable)
+        if (!taskWrapper.isGuaranteed()) {
+          removeFromPreemptionQueue(taskWrapper);
+          taskWrapper.updateCanFinishForPriority(newFinishableState);
           addToPreemptionQueue(taskWrapper);
+        } else {
+          // if guaranteed task, if the finishable state changed to non-finishable and if the task doesn't exist
+          // pre-emption queue, then add it so that it becomes candidate to kill
+          taskWrapper.updateCanFinishForPriority(newFinishableState);

Review comment:
   @prasanthj thanks for fixing this! The patch looks good and is now committed!
   At some point I would also change the first comment of the method to clarify 
that a task that is both Guaranteed and Finishable should never be in the 
preemption queue. 
   
https://github.com/apache/hive/pull/1012/files#diff-16658bf15468ecd089c4fd32e75fa8b2R876
   
   I believe there is value in putting some effort into documenting and 
describing how the scheduler works (I have started keeping some notes and am 
happy to do more work on this). I personally find the information in this area 
of the project very limited.







Issue Time Tracking
---

Worklog Id: (was: 434227)
Time Spent: 50m  (was: 40m)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch, 
> HIVE-23443.3.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=433984&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433984
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 16/May/20 01:19
Start Date: 16/May/20 01:19
Worklog Time Spent: 10m 
  Work Description: prasanthj commented on a change in pull request #1012:
URL: https://github.com/apache/hive/pull/1012#discussion_r426102275



##
File path: llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java
##
@@ -884,10 +885,20 @@ private void finishableStateUpdated(TaskWrapper taskWrapper, boolean newFinishab
         taskWrapper.updateCanFinishForPriority(newFinishableState);
         forceReinsertIntoQueue(taskWrapper, isRemoved);
       } else {
-        taskWrapper.updateCanFinishForPriority(newFinishableState);
-        if (!newFinishableState && !taskWrapper.isInPreemptionQueue()) {
-          // No need to check guaranteed here; if it was false we would already be in the queue.
+        // if speculative task, any finishable state change should re-order the queue as speculative tasks are always
+        // not-guaranteed (re-order helps put non-finishable's ahead of finishable)
+        if (!taskWrapper.isGuaranteed()) {
+          removeFromPreemptionQueue(taskWrapper);
+          taskWrapper.updateCanFinishForPriority(newFinishableState);
           addToPreemptionQueue(taskWrapper);
+        } else {
+          // if guaranteed task, if the finishable state changed to non-finishable and if the task doesn't exist
+          // pre-emption queue, then add it so that it becomes candidate to kill
+          taskWrapper.updateCanFinishForPriority(newFinishableState);

Review comment:
   A task that goes from Non-finishable -> Finishable does not have to be in the 
pre-emption queue. It could be in the wait queue (if there is no capacity) or 
taken by an executor to run, both of which are fine.
   You brought up a good point: we may be adding the same task fragment to the 
pre-emption queue twice. I will add an "if not exists" check when adding to the 
pre-emption queue. 
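
The "if not exists" check mentioned above could look roughly like the following. This is a hedged sketch, not the committed patch: the Task class, its inPreemptionQueue flag, addIfAbsent and the name-based ordering are all hypothetical, with java.util.PriorityQueue standing in for the real pre-emption queue. The flag is loosely modelled on the isInPreemptionQueue() call visible in the diff.

```java
import java.util.PriorityQueue;

class IdempotentAddSketch {
    // Hypothetical stand-in for a task fragment; not Hive's TaskWrapper.
    static final class Task {
        final String name;
        boolean inPreemptionQueue;

        Task(String name) {
            this.name = name;
        }
    }

    // Ordering by name is arbitrary here; it only makes the queue usable.
    private final PriorityQueue<Task> preemptionQueue =
        new PriorityQueue<>((x, y) -> x.name.compareTo(y.name));

    // Adds the task only if it is not already queued, so repeated calls for
    // the same fragment cannot create duplicate entries.
    boolean addIfAbsent(Task task) {
        if (task.inPreemptionQueue) {
            return false;
        }
        preemptionQueue.add(task);
        task.inPreemptionQueue = true;
        return true;
    }

    int size() {
        return preemptionQueue.size();
    }

    public static void main(String[] args) {
        IdempotentAddSketch sketch = new IdempotentAddSketch();
        Task task = new Task("fragment-1");
        System.out.println("first add accepted: " + sketch.addIfAbsent(task));
        System.out.println("second add accepted: " + sketch.addIfAbsent(task));
        System.out.println("queue size: " + sketch.size());
    }
}
```

Tracking membership with a flag avoids a linear contains() scan on every add; a real implementation would also need to clear the flag on removal and guard both steps with the scheduler's lock.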
   







Issue Time Tracking
---

Worklog Id: (was: 433984)
Time Spent: 40m  (was: 0.5h)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=433983&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433983
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 16/May/20 01:18
Start Date: 16/May/20 01:18
Worklog Time Spent: 10m 
  Work Description: prasanthj commented on a change in pull request #1012:
URL: https://github.com/apache/hive/pull/1012#discussion_r426102275



##
File path: llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java
##
@@ -884,10 +885,20 @@ private void finishableStateUpdated(TaskWrapper taskWrapper, boolean newFinishab
         taskWrapper.updateCanFinishForPriority(newFinishableState);
         forceReinsertIntoQueue(taskWrapper, isRemoved);
       } else {
-        taskWrapper.updateCanFinishForPriority(newFinishableState);
-        if (!newFinishableState && !taskWrapper.isInPreemptionQueue()) {
-          // No need to check guaranteed here; if it was false we would already be in the queue.
+        // if speculative task, any finishable state change should re-order the queue as speculative tasks are always
+        // not-guaranteed (re-order helps put non-finishable's ahead of finishable)
+        if (!taskWrapper.isGuaranteed()) {
+          removeFromPreemptionQueue(taskWrapper);
+          taskWrapper.updateCanFinishForPriority(newFinishableState);
           addToPreemptionQueue(taskWrapper);
+        } else {
+          // if guaranteed task, if the finishable state changed to non-finishable and if the task doesn't exist
+          // pre-emption queue, then add it so that it becomes candidate to kill
+          taskWrapper.updateCanFinishForPriority(newFinishableState);

Review comment:
   A task that goes from Non-finishable -> Finishable does not have to be in the 
pre-emption queue. It could be in the wait queue (if there is no capacity) or 
taken by an executor to run, both of which are fine.
   You brought up a good point: we may be adding the same task fragment to the 
pre-emption queue twice. I will add an "if not exists" check when adding to the 
pre-emption queue. 
   







Issue Time Tracking
---

Worklog Id: (was: 433983)
Time Spent: 0.5h  (was: 20m)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=433982&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433982
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 16/May/20 01:10
Start Date: 16/May/20 01:10
Worklog Time Spent: 10m 
  Work Description: prasanthj commented on a change in pull request #1012:
URL: https://github.com/apache/hive/pull/1012#discussion_r426102275



##
File path: llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java
##
@@ -884,10 +885,20 @@ private void finishableStateUpdated(TaskWrapper taskWrapper, boolean newFinishab
         taskWrapper.updateCanFinishForPriority(newFinishableState);
         forceReinsertIntoQueue(taskWrapper, isRemoved);
       } else {
-        taskWrapper.updateCanFinishForPriority(newFinishableState);
-        if (!newFinishableState && !taskWrapper.isInPreemptionQueue()) {
-          // No need to check guaranteed here; if it was false we would already be in the queue.
+        // if speculative task, any finishable state change should re-order the queue as speculative tasks are always
+        // not-guaranteed (re-order helps put non-finishable's ahead of finishable)
+        if (!taskWrapper.isGuaranteed()) {
+          removeFromPreemptionQueue(taskWrapper);
+          taskWrapper.updateCanFinishForPriority(newFinishableState);
           addToPreemptionQueue(taskWrapper);
+        } else {
+          // if guaranteed task, if the finishable state changed to non-finishable and if the task doesn't exist
+          // pre-emption queue, then add it so that it becomes candidate to kill
+          taskWrapper.updateCanFinishForPriority(newFinishableState);

Review comment:
   A task that goes from Non-finishable -> Finishable does not have to be in the 
pre-emption queue. It could be in the wait queue (if there is no capacity) or 
taken by an executor to run, both of which are fine.
   You brought up a good point: we may be adding the same task fragment to the 
pre-emption queue twice. I will add an "if not exists" check when adding to the 
pre-emption queue. 
   







Issue Time Tracking
---

Worklog Id: (was: 433982)
Time Spent: 20m  (was: 10m)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h


[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=433819&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433819
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 15/May/20 18:11
Start Date: 15/May/20 18:11
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #1012:
URL: https://github.com/apache/hive/pull/1012#discussion_r425967771



##
File path: llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java
##
@@ -884,10 +885,20 @@ private void finishableStateUpdated(TaskWrapper taskWrapper, boolean newFinishab
         taskWrapper.updateCanFinishForPriority(newFinishableState);
         forceReinsertIntoQueue(taskWrapper, isRemoved);
       } else {
-        taskWrapper.updateCanFinishForPriority(newFinishableState);
-        if (!newFinishableState && !taskWrapper.isInPreemptionQueue()) {
-          // No need to check guaranteed here; if it was false we would already be in the queue.
+        // if speculative task, any finishable state change should re-order the queue as speculative tasks are always
+        // not-guaranteed (re-order helps put non-finishable's ahead of finishable)
+        if (!taskWrapper.isGuaranteed()) {
+          removeFromPreemptionQueue(taskWrapper);
+          taskWrapper.updateCanFinishForPriority(newFinishableState);
           addToPreemptionQueue(taskWrapper);
+        } else {
+          // if guaranteed task, if the finishable state changed to non-finishable and if the task doesn't exist
+          // pre-emption queue, then add it so that it becomes candidate to kill
+          taskWrapper.updateCanFinishForPriority(newFinishableState);

Review comment:
   Can there be a case where we have a Guaranteed task that changes from 
non-finishable to finishable and is only part of the preemptionQueue?
   In that scenario our code (and the old code) would remove it from the 
preemptionQueue, and it would not be part of any other queue.
   
   From the code below it seems that this can indeed happen:
   
https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java#L776







Issue Time Tracking
---

Worklog Id: (was: 433819)
Remaining Estimate: 0h
Time Spent: 10m

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h