[jira] [Commented] (HIVE-15693) LLAP: cached threadpool in AMReporter creates too many threads leading to OOM

2017-01-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843369#comment-15843369
 ] 

Hive QA commented on HIVE-15693:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12849667/HIVE-15693.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11003 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=93)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3220/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3220/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3220/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12849667 - PreCommit-HIVE-Build

> LLAP: cached threadpool in AMReporter creates too many threads leading to OOM
> -----------------------------------------------------------------------------
>
> Key: HIVE-15693
> URL: https://issues.apache.org/jira/browse/HIVE-15693
> Project: Hive
> Issue Type: Bug
> Components: llap
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Critical
> Attachments: HIVE-15693.1.patch, HIVE-15693.2.patch, HIVE-15693.3.patch, HIVE-15693.4.patch, HIVE-15693.5.patch
>
>
> branch: master
> {noformat}
> 2017-01-22T19:52:42,470 WARN  [IPC Server handler 3 on 34642 ()] org.apache.hadoop.ipc.Server: IPC Server handler 3 on 34642, call org.apache.hadoop.hive.llap.protocol.LlapProtocolBlockingPB.submitWork ...Call#17257 Retry#0
> java.lang.OutOfMemoryError: unable to create new native thread
>     at java.lang.Thread.start0(Native Method) ~[?:1.8.0_77]
>     at java.lang.Thread.start(Thread.java:714) [?:1.8.0_77]
>     at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950) ~[?:1.8.0_77]
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1368) ~[?:1.8.0_77]
>     at com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:480) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61) ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.llap.daemon.impl.AMReporter.taskKilled(AMReporter.java:231) ~[hive-llap-server-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
>     at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl$KilledTaskHandlerImpl.taskKilled(ContainerRunnerImpl.java:501) ~[hive-llap-server-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15693) LLAP: cached threadpool in AMReporter creates too many threads leading to OOM

2017-01-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15842613#comment-15842613
 ] 

Hive QA commented on HIVE-15693:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12849667/HIVE-15693.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11003 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=93)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3212/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3212/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3212/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12849667 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15693) LLAP: cached threadpool in AMReporter creates too many threads leading to OOM

2017-01-27 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15842416#comment-15842416
 ] 

Lefty Leverenz commented on HIVE-15693:
---

Config review:  The parameter description should have newlines (\n) just like 
the previous parameter's description, to avoid overlong lines in the generated 
template file hive-default.xml.template.

Also a couple of nits:  The second line of the description doesn't need extra 
indentation.  And you could add a period at the end.
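
A minimal sketch of the formatting being asked for, assuming the description
is an ordinary Java string literal. The class name and the description wording
below are hypothetical and are not taken from the patch; the point is only the
explicit "\n" break, the unindented second line, and the closing period.

```java
// Hypothetical example; the real property description lives in the patch.
final class DescriptionWrapDemo {
    private DescriptionWrapDemo() {}

    // Each source line ends with an explicit "\n" so that the generated
    // template does not contain one overlong line. The second line has no
    // extra leading indentation and the text ends with a period.
    static final String DESCRIPTION =
        "Maximum number of threads used by the AM reporter.\n" +
        "A value of 0 or less lets the daemon choose a size automatically.";

    // Returns the length of the longest line in the given text.
    static int longestLine(String text) {
        int longest = 0;
        for (String line : text.split("\n")) {
            longest = Math.max(longest, line.length());
        }
        return longest;
    }

    public static void main(String[] args) {
        System.out.println(DESCRIPTION);
        System.out.println("longest line: " + longestLine(DESCRIPTION));
    }
}
```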



[jira] [Commented] (HIVE-15693) LLAP: cached threadpool in AMReporter creates too many threads leading to OOM

2017-01-26 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840758#comment-15840758
 ] 

Siddharth Seth commented on HIVE-15693:
---

+1. If the maxThreads value (derived from numExecutors) can be overridden, I 
think that should be mentioned in the property description before committing.



[jira] [Commented] (HIVE-15693) LLAP: cached threadpool in AMReporter creates too many threads leading to OOM

2017-01-26 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840626#comment-15840626
 ] 

Siddharth Seth commented on HIVE-15693:
---

Maybe we can use -1 or 0 as the value that means the thread count is determined 
automatically, with any other value acting as an override.
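
A minimal sketch of that convention, not the code from the patch. The class
and method names are hypothetical, and the automatic rule used here (one
thread per executor) is only an assumption made for illustration.

```java
// Hypothetical helper; names and the "auto" rule are illustrative only.
final class ReporterThreadCount {
    private ReporterThreadCount() {}

    /**
     * Resolves the pool size from a configured value.
     * A value of 0 or less (for example -1 or 0) means "auto".
     * Any positive value is treated as an explicit override.
     */
    static int resolve(int configured, int numExecutors) {
        if (numExecutors <= 0) {
            throw new IllegalArgumentException("numExecutors must be positive");
        }
        if (configured > 0) {
            return configured; // explicit override
        }
        return numExecutors;   // assumed automatic rule: one thread per executor
    }

    public static void main(String[] args) {
        System.out.println(resolve(-1, 8));
        System.out.println(resolve(16, 8));
    }
}
```

Treating every non-positive value as "auto" keeps -1 and 0 equivalent, so
either spelling of "unset" in a site configuration behaves the same way.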



[jira] [Commented] (HIVE-15693) LLAP: cached threadpool in AMReporter creates too many threads leading to OOM

2017-01-26 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839399#comment-15839399
 ] 

Siddharth Seth commented on HIVE-15693:
---

Instead of 2x executors, I think this needs to be based on the concurrency. 
Maybe a new config parameter to set an upper bound, with the lower bound set 
to the number of executors? The number of killed attempts is more likely to 
depend on the number of AMs communicating than on the number of executors in 
the daemon.

Eventually, I think we need a certain number of threads per AM, and we should 
also ensure that all threads don't end up blocked because of one bad AM. I'll 
create a follow-up JIRA for this.
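
One way to read the bounds suggested above, as a sketch only and not the code
from any attached patch. The class and method names are hypothetical. It
assumes the desired size tracks the observed concurrency, is never below the
number of executors, and never exceeds a configured upper bound (if the upper
bound is smaller than the number of executors, the upper bound wins). The pool
is a plain java.util.concurrent.ThreadPoolExecutor with a fixed maximum, in
contrast to a cached pool that creates a new thread whenever none is idle.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and sizing policy are assumptions.
final class BoundedReporterPool {
    private BoundedReporterPool() {}

    /** Clamps the desired thread count between numExecutors and upperBound. */
    static int clamp(int concurrency, int numExecutors, int upperBound) {
        if (upperBound < 1) {
            throw new IllegalArgumentException("upperBound must be at least 1");
        }
        int lower = Math.min(numExecutors, upperBound);
        return Math.max(lower, Math.min(concurrency, upperBound));
    }

    /**
     * Creates a pool that never runs more than the given number of threads.
     * Extra tasks wait in an unbounded queue instead of spawning new threads.
     */
    static ThreadPoolExecutor create(int threads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            threads, threads, 60L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        // Let idle threads exit, as a cached pool would, but keep the cap.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create(clamp(100, 8, 32));
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown();
        System.out.println("largest pool size: " + pool.getLargestPoolSize());
    }
}
```

The trade-off in this sketch is that a burst of kill notifications queues up
in memory rather than turning into native threads. It does not address the
"one bad AM blocks every thread" concern raised above.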
