[jira] [Updated] (HIVE-11660) LLAP: TestTaskExecutorService is flaky

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-11660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11660:

Summary: LLAP: TestTaskExecutorService is flaky  (was: LLAP: TestTaskExecutorService is flaky again)

> LLAP: TestTaskExecutorService is flaky
> --
>
> Key: HIVE-11660
> URL: https://issues.apache.org/jira/browse/HIVE-11660
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Siddharth Seth
>
> {noformat}
> java.lang.Exception: test timed out after 1 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService$TaskExecutorServiceForTest$InternalCompletionListenerForTest.awaitCompletion(TestTaskExecutorService.java:244)
>   at org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService$TaskExecutorServiceForTest$InternalCompletionListenerForTest.access$000(TestTaskExecutorService.java:208)
>   at org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption(TestTaskExecutorService.java:168)
> {noformat}
> Cannot repro locally. See HIVE-11642



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11660) LLAP: TestTaskExecutorService is flaky

2015-08-28 Thread Siddharth Seth (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-11660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated HIVE-11660:
--
Attachment: HIVE-11660.1.txt

Attaching a patch to fix the tests. I ran 100 iterations of both on a Linux
box, where the failures are normally seen, and all of them passed.

There were some real bugs that were causing TestLlapTaskSchedulerService to
fail. The last allocateTaskRequest for a dag could have ended up being ignored.
Also, the waitQueue in TaskScheduler can be improved; I filed a separate jira
for this.
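
For readers unfamiliar with this class of bug, here is a minimal sketch of how
a "last request ignored" problem is commonly avoided. It is not the code from
the patch; the class name ScheduleTrigger and its methods are hypothetical. The
idea is that a request is recorded in a flag under the lock before signalling,
so a request that arrives while the scheduler thread is busy (and not waiting)
is still picked up on its next pass.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch; not the actual LLAP scheduler code. */
public class ScheduleTrigger {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition hasWork = lock.newCondition();
  // Guarded by lock. Remembers a request even if no thread is waiting yet.
  private boolean pendingRequest = false;

  /** Called by a producer after it has queued a request. */
  public void requestSchedule() {
    lock.lock();
    try {
      pendingRequest = true; // record first, so the signal cannot be lost
      hasWork.signal();
    } finally {
      lock.unlock();
    }
  }

  /**
   * Waits up to timeoutMillis for a request.
   * Returns true if a request was consumed, false on timeout or interrupt.
   */
  public boolean awaitRequest(long timeoutMillis) {
    long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    lock.lock();
    try {
      // Loop on the flag, not on the signal: handles spurious wakeups and
      // requests that were made before this thread started waiting.
      while (!pendingRequest) {
        if (nanos <= 0L) {
          return false;
        }
        nanos = hasWork.awaitNanos(nanos);
      }
      pendingRequest = false;
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve interrupt status
      return false;
    } finally {
      lock.unlock();
    }
  }
}
```

A bounded wait like awaitRequest also makes a test fail quickly with a clear
result instead of parking until the test-level timeout fires.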

[~sershe] - please review.






[jira] [Updated] (HIVE-11660) LLAP: TestTaskExecutorService is flaky

2015-08-31 Thread Siddharth Seth (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-11660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated HIVE-11660:
--
Attachment: HIVE-11660.2.txt

Updated patch to convert some of the AtomicBooleans to booleans.
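
The comment does not say why the conversion was made. One common reason, shown
here as a sketch under that assumption, is that the flags are only ever read
and written while a lock is held, so plain booleans are sufficient and keep
related flags consistent with each other. The class GuardedFlags and its
fields are hypothetical and are not taken from the patch.

```java
/** Hypothetical sketch; not the code from HIVE-11660.2.txt. */
public class GuardedFlags {
  // Both flags are guarded by this object's monitor, so they do not need to
  // be AtomicBooleans: every access happens inside a synchronized method.
  private boolean shutdown = false;
  private boolean running = false;

  /** Starts only if not shut down and not already running. */
  public synchronized boolean tryStart() {
    // The check and the update happen atomically under one lock, which two
    // independent AtomicBooleans would not guarantee.
    if (shutdown || running) {
      return false;
    }
    running = true;
    return true;
  }

  public synchronized void finish() {
    running = false;
  }

  public synchronized void shutDown() {
    shutdown = true;
  }

  public synchronized boolean isRunning() {
    return running;
  }
}
```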



