[jira] [Created] (YARN-10851) Tez session close does not interrupt yarn's async thread
Qihong Wu created YARN-10851:

Summary: Tez session close does not interrupt yarn's async thread
Key: YARN-10851
URL: https://issues.apache.org/jira/browse/YARN-10851
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.10.1, 2.8.5
Environment: On an HA cluster where RM1 is not the active RM; YARN version 2.8.5, configured with Tez
Reporter: Qihong Wu
Attachments: hive.log

Hi, I would like to ask for expert advice on how YARN behaves when handling `InterruptedIOException`.

The issue occurs on an HA cluster where RM1 is NOT the active RM. If a YARN request made to RM1 fails, RM failover should happen. However, if an interrupted exception is thrown while connecting to RM1, the thread should try to [bail out|https://dzone.com/articles/how-to-handle-the-interruptedexception] as soon as possible to [respect the interrupt request|https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ExecutorService.html#shutdownNow--], rather than moving on to another RM.

But I found that my application (hive), after throwing `InterruptedIOException` when the attempt to connect to RM1 failed, continued on to RM2. I want to know how YARN handles `InterruptedIOException`: shouldn't the async thread be interrupted and shut down when Tez close() triggers the interrupt request?

*The reproduction steps are:*
1. Use an HA cluster which runs YARN version 2.8.5 and is configured with Tez.
2. Make sure RM1 is not the active RM by checking `yarn rmadmin -getAllServiceState`. If it is, manually [transition RM2 to be the active RM|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html#Admin_commands].
3. Apply failover-retry properties to yarn-site.xml
{quote}
yarn.client.failover-retries 4
yarn.client.failover-retries-on-socket-timeouts 4
yarn.client.failover-max-attempts 4
{quote}
4. Run a simple application through yarn-client (for example, a simple hive DDL command)
{quote}hive --hiveconf hive.root.logger=TRACE,console -e "create table tez_test (id int, name string);"
{quote}
5. In the application's log (for example, hive.log), you can see that `RetryInvocationHandler` captured the `InterruptedIOException` while the request was talking to rm1, but the thread didn't bail out immediately and instead continued on to rm2.

*More information:*
The interrupted exception is triggered via [TezSessionState#close|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java#L689] and [Future#cancel|https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Future.html#cancel-boolean-].

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
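The behaviour the reporter expects can be sketched roughly as follows. This is a minimal, self-contained illustration and not YARN's `RetryInvocationHandler`: the `RmCall` interface, the `callWithFailover` method and the way RM ids are passed in are hypothetical stand-ins. The only point it tries to show is that an `InterruptedIOException` (or an already-set interrupt flag) ends the loop instead of triggering failover to the next RM, while an ordinary `IOException` still fails over.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InterruptAwareFailover {

    /** Hypothetical stand-in for one RPC attempt against a named RM. */
    interface RmCall {
        String invoke(String rmId) throws IOException;
    }

    /**
     * Tries each RM in order and records the ones it contacted.
     * Ordinary IOExceptions cause failover to the next RM;
     * an interrupt stops the loop immediately.
     */
    static String callWithFailover(List<String> rmIds, RmCall call,
                                   List<String> attempted) throws IOException {
        IOException last = null;
        for (String rmId : rmIds) {
            if (Thread.currentThread().isInterrupted()) {
                throw new InterruptedIOException("interrupted before contacting " + rmId);
            }
            attempted.add(rmId);
            try {
                return call.invoke(rmId);
            } catch (InterruptedIOException e) {
                // Restore the flag so callers further up can see the interrupt,
                // then bail out instead of failing over.
                Thread.currentThread().interrupt();
                throw e;
            } catch (IOException e) {
                last = e; // ordinary failure: fall through to the next RM
            }
        }
        throw last != null ? last : new IOException("no RM configured");
    }

    public static void main(String[] args) {
        List<String> attempted = new ArrayList<>();
        try {
            callWithFailover(Arrays.asList("rm1", "rm2"), rmId -> {
                throw new InterruptedIOException("simulated interrupt on " + rmId);
            }, attempted);
        } catch (IOException e) {
            System.out.println("stopped after " + attempted + ": " + e.getMessage());
        } finally {
            Thread.interrupted(); // clear the flag set by this sketch
        }
    }
}
```

Whether the real client should behave this way is exactly the question raised in the report; the sketch only makes the expected control flow concrete.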
[jira] [Commented] (YARN-10849) Clarify testcase documentation for TestServiceAM#testContainersReleasedWhenPreLaunchFails
[ https://issues.apache.org/jira/browse/YARN-10849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17376630#comment-17376630 ]

Peter Bacsko commented on YARN-10849:

[~snemeth] patch v2 seems to undo what v1 introduced.

> Clarify testcase documentation for TestServiceAM#testContainersReleasedWhenPreLaunchFails
>
> Key: YARN-10849
> URL: https://issues.apache.org/jira/browse/YARN-10849
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
> Attachments: YARN-10849.001.patch, YARN-10849.002.patch
>
> There's a small comment added to testcase:
> org.apache.hadoop.yarn.service.TestServiceAM#testContainersReleasedWhenPreLaunchFails:
> {code}
> // Test to verify that the containers are released and the
> // component instance is added to the pending queue when building the launch
> // context fails.
> {code}
> However, it was not clear for me why the "launch context" would fail.
> While the test passes, it throws an Exception that tells the story.
> {code}
> 2021-07-06 18:31:04,438 ERROR [pool-275-thread-1] containerlaunch.ContainerLaunchService (ContainerLaunchService.java:run(122)) - [COMPINSTANCE compa-0 : container_1625589063422_0001_01_01]: Failed to launch container.
> java.lang.IllegalArgumentException: Can not create a Path from a null string
>   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:164)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:180)
>   at org.apache.hadoop.yarn.service.provider.tarball.TarballProviderService.processArtifact(TarballProviderService.java:39)
>   at org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:144)
>   at org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:107)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> This exception is thrown because the id of the Artifact object is unset
> (null) and TarballProviderService.processArtifact verifies it and it does not
> allow such artifacts.
> The aim of this jira is to add a clarification comment or javadoc to this
> method.
[jira] [Created] (YARN-10850) TimelineService v2 lists containers for all attempts when filtering for one
Benjamin Teke created YARN-10850:

Summary: TimelineService v2 lists containers for all attempts when filtering for one
Key: YARN-10850
URL: https://issues.apache.org/jira/browse/YARN-10850
Project: Hadoop YARN
Issue Type: Bug
Components: timelinereader
Reporter: Benjamin Teke

When using the command
{code:java}
yarn container -list
{code}
with an application attempt ID, the help text says that only the containers for that attempt should be listed.
{code:java}
-list    List containers for application attempt when application attempt ID is
         provided. When application name is provided, then it finds the instances
         of the application based on app's own implementation, and -appTypes
         option must be specified unless it is the default yarn-service type.
         With app name, it supports optional use of -version to filter instances
         based on app version, -components to filter instances based on component
         names, -states to filter instances based on instance state.
{code}
When TimelineService v2 is enabled, all of the containers for the application are returned.
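The expected filtering can be illustrated with a small sketch. It is not the TimelineService or CLI code: `numericFields`, `belongsTo` and `filterByAttempt` are hypothetical helpers, and the sketch assumes the ID layout visible in this report, namely `container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<containerId>` and `appattempt_<clusterTimestamp>_<appId>_<attemptId>`. The sample IDs are copied from the output in this message as they appear in this archive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AttemptContainerFilter {

    /** Numeric fields of an ID, skipping the prefix and any "e<epoch>" token. */
    static long[] numericFields(String id) {
        String[] tokens = id.split("_");
        List<Long> fields = new ArrayList<>();
        for (int i = 1; i < tokens.length; i++) { // tokens[0] is the prefix
            if (tokens[i].startsWith("e")) {
                continue; // epoch marker such as "e43"
            }
            fields.add(Long.parseLong(tokens[i]));
        }
        long[] out = new long[fields.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = fields.get(i);
        }
        return out;
    }

    /** True when the container's timestamp, app id and attempt id match the attempt. */
    static boolean belongsTo(String containerId, String attemptId) {
        long[] c = numericFields(containerId);
        long[] a = numericFields(attemptId);
        if (c.length != 4 || a.length != 3) {
            return false; // not the layout this sketch assumes
        }
        return c[0] == a[0] && c[1] == a[1] && c[2] == a[2];
    }

    /** Keeps only the containers that belong to the given attempt. */
    static List<String> filterByAttempt(List<String> containerIds, String attemptId) {
        List<String> result = new ArrayList<>();
        for (String id : containerIds) {
            if (belongsTo(id, attemptId)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList(
            "container_e43_1625124233002_0007_01_01",
            "container_e43_1625124233002_0007_02_04",
            "container_e43_1625124233002_0007_02_05");
        System.out.println(filterByAttempt(all, "appattempt_1625124233002_0007_01"));
    }
}
```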
{code:java}
hrt_qa@ctr-e172-1620330694487-146061-01-02:/hwqe/hadoopqe$ yarn applicationattempt -list application_1625124233002_0007
21/07/01 09:32:23 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=http://ctr-e172-1620330694487-146061-01-04.hwx.site:8198/ws/v2/timeline/, clusterId=yarn-cluster
21/07/01 09:32:24 INFO client.AHSProxy: Connecting to Application History server at ctr-e172-1620330694487-146061-01-04.hwx.site/172.27.113.4:10200
21/07/01 09:32:24 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
Total number of application attempts :2
ApplicationAttempt-Id    State    AM-Container-Id    Tracking-URL
appattempt_1625124233002_0007_01    FAILED    container_e43_1625124233002_0007_01_01    http://ctr-e172-1620330694487-146061-01-03.hwx.site:8088/proxy/application_1625124233002_0007/
appattempt_1625124233002_0007_02    KILLED    container_e43_1625124233002_0007_02_01    http://ctr-e172-1620330694487-146061-01-03.hwx.site:8088/proxy/application_1625124233002_0007/
{code}
Querying the 2 app attempts produces the same output:
{code:java}
hrt_qa@ctr-e172-1620330694487-146061-01-02:/hwqe/hadoopqe$ yarn container -list appattempt_1625124233002_0007_01
21/07/01 09:32:35 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=http://ctr-e172-1620330694487-146061-01-04.hwx.site:8198/ws/v2/timeline/, clusterId=yarn-cluster
21/07/01 09:32:35 INFO client.AHSProxy: Connecting to Application History server at ctr-e172-1620330694487-146061-01-04.hwx.site/172.27.113.4:10200
21/07/01 09:32:35 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
21/07/01 09:32:36 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/7.1.7.0-504/0/resource-types.xml
Total number of containers :12
Container-Id    Start Time    Finish Time    State    Host    Node Http Address    LOG-URL
container_e43_1625124233002_0007_02_04    N/A    N/A    COMPLETE    ctr-e172-1620330694487-146061-01-02.hwx.site:25454    ctr-e172-1620330694487-146061-01-02.hwx.site:8042    http://ctr-e172-1620330694487-146061-01-04.hwx.site:19888/jobhistory/logs/logs/ctr-e172-1620330694487-146061-01-02.hwx.site:25454/container_e43_1625124233002_0007_02_04/container_e43_1625124233002_0007_02_04/hrt_qa
container_e43_1625124233002_0007_02_05    N/A    N/A    COMPLETE    ctr-e172-1620330694487-146061-01-07.hwx.site:25454    ctr-e172-1620330694487-146061-01-07.hwx.site:8042    http://ctr-e172-1620330694487-146061-01-04.hwx.site:19888/jobhistory/logs/logs/ctr-e172-1620330694487-146061-01-07.hwx.site:25454/container_e43_1625124233002_0007_02_05/container_e43_1625124233002_0007_02_05/hrt_qa
container_e43_1625124233002_0007_02_03
[jira] [Updated] (YARN-10849) Clarify testcase documentation for TestServiceAM#testContainersReleasedWhenPreLaunchFails
[ https://issues.apache.org/jira/browse/YARN-10849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10849:
    Attachment: YARN-10849.002.patch
[jira] [Updated] (YARN-10849) Clarify testcase documentation for TestServiceAM#testContainersReleasedWhenPreLaunchFails
[ https://issues.apache.org/jira/browse/YARN-10849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10849:
    Attachment: YARN-10849.001.patch
[jira] [Created] (YARN-10849) Clarify testcase documentation for TestServiceAM#testContainersReleasedWhenPreLaunchFails
Szilard Nemeth created YARN-10849:

Summary: Clarify testcase documentation for TestServiceAM#testContainersReleasedWhenPreLaunchFails
Key: YARN-10849
URL: https://issues.apache.org/jira/browse/YARN-10849
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth

There's a small comment added to testcase:
org.apache.hadoop.yarn.service.TestServiceAM#testContainersReleasedWhenPreLaunchFails:
{code}
// Test to verify that the containers are released and the
// component instance is added to the pending queue when building the launch
// context fails.
{code}
However, it was not clear for me why the "launch context" would fail.
While the test passes, it throws an Exception that tells the story.
{code}
2021-07-06 18:31:04,438 ERROR [pool-275-thread-1] containerlaunch.ContainerLaunchService (ContainerLaunchService.java:run(122)) - [COMPINSTANCE compa-0 : container_1625589063422_0001_01_01]: Failed to launch container.
java.lang.IllegalArgumentException: Can not create a Path from a null string
  at org.apache.hadoop.fs.Path.checkPathArg(Path.java:164)
  at org.apache.hadoop.fs.Path.<init>(Path.java:180)
  at org.apache.hadoop.yarn.service.provider.tarball.TarballProviderService.processArtifact(TarballProviderService.java:39)
  at org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:144)
  at org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:107)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
{code}
This exception is thrown because the id of the Artifact object is unset (null) and TarballProviderService.processArtifact verifies it and it does not allow such artifacts.
The aim of this jira is to add a clarification comment or javadoc to this method.
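The failure path described above can be reproduced in miniature. The sketch below does not use Hadoop classes: `Artifact`, `checkPathArg` and `processArtifact` are simplified, hypothetical stand-ins named after the frames in the stack trace, and only the exception message is taken from the log. It also shows the kind of clarifying comment the issue asks for.

```java
public class NullArtifactIdDemo {

    /** Hypothetical stand-in for the service Artifact record; the id may be unset. */
    static final class Artifact {
        private final String id;

        Artifact(String id) {
            this.id = id;
        }

        String getId() {
            return id;
        }
    }

    /** Rejects null paths with the message seen in the stack trace. */
    static String checkPathArg(String path) {
        if (path == null) {
            throw new IllegalArgumentException("Can not create a Path from a null string");
        }
        return path;
    }

    /** Stand-in for the step that turns the artifact id into a path. */
    static String processArtifact(Artifact artifact) {
        return checkPathArg(artifact.getId());
    }

    public static void main(String[] args) {
        // The artifact id is deliberately left unset (null). Turning it into a
        // path fails, so building the launch context fails; that is the
        // pre-launch failure the test relies on.
        Artifact artifact = new Artifact(null);
        try {
            processArtifact(artifact);
        } catch (IllegalArgumentException e) {
            System.out.println("Pre-launch failed as expected: " + e.getMessage());
        }
    }
}
```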
[jira] [Updated] (YARN-10789) RM HA startup can fail due to race conditions in ZKConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tarun Parimi updated YARN-10789:
    Attachment: (was: YARN-10789.branch-3.2.001.patch)

> RM HA startup can fail due to race conditions in ZKConfigurationStore
>
> Key: YARN-10789
> URL: https://issues.apache.org/jira/browse/YARN-10789
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Tarun Parimi
> Assignee: Tarun Parimi
> Priority: Major
> Fix For: 3.4.0
> Attachments: YARN-10789.001.patch, YARN-10789.002.patch, YARN-10789.branch-3.2.001.patch, YARN-10789.branch-3.3.001.patch
>
> We are observing the error below randomly during hadoop install and RM initial
> startup when HA is enabled and yarn.scheduler.configuration.store.class=zk is
> configured. This causes one of the RMs to not start up.
> {code:java}
> 2021-05-26 12:59:18,986 INFO org.apache.hadoop.service.AbstractService: Service RMActiveServices failed in state INITED
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /confstore/CONF_STORE
> {code}
> We are trying to create the znode /confstore/CONF_STORE when we initialize
> the ZKConfigurationStore. But the problem is that the ZKConfigurationStore is
> initialized when CapacityScheduler does a serviceInit. This serviceInit is
> done by both the Active and the Standby RM. So we can run into a race condition
> when both Active and Standby try to create the same znode because both RMs are
> started at the same time.
> ZKRMStateStore, on the other hand, avoids such race conditions by creating the
> znodes only after serviceStart. serviceStart only happens for the active RM
> which won the leader election, unlike serviceInit, which happens irrespective
> of leader election.
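The race described above is a check-then-create pattern executed by two processes. The sketch below reproduces it with an in-memory stand-in rather than ZooKeeper: `FakeStore`, its `create`/`exists` methods and `ensureNode` are hypothetical names. It shows one common mitigation, treating "node already exists" as success, which is a different approach from the one the report attributes to ZKRMStateStore (creating znodes only after serviceStart) and is not claimed to be what the attached patches do.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;

public class IdempotentZnodeCreate {

    /** Hypothetical in-memory stand-in for a ZooKeeper-like store. */
    static final class FakeStore {
        private final ConcurrentMap<String, byte[]> nodes = new ConcurrentHashMap<>();

        /** Fails if the node already exists, like a plain create call. */
        void create(String path, byte[] data) {
            if (nodes.putIfAbsent(path, data) != null) {
                throw new IllegalStateException("NodeExists for " + path);
            }
        }

        boolean exists(String path) {
            return nodes.containsKey(path);
        }
    }

    /** Check-then-create is racy, so "already exists" is treated as success. */
    static void ensureNode(FakeStore store, String path) {
        if (store.exists(path)) {
            return;
        }
        try {
            store.create(path, new byte[0]);
        } catch (IllegalStateException alreadyThere) {
            // Another initializer won the race; the node is present, which is enough.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FakeStore store = new FakeStore();
        CountDownLatch start = new CountDownLatch(1);
        Runnable init = () -> {
            try {
                start.await(); // release both "RMs" at the same moment
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            ensureNode(store, "/confstore/CONF_STORE");
        };
        Thread rm1 = new Thread(init, "rm1");
        Thread rm2 = new Thread(init, "rm2");
        rm1.start();
        rm2.start();
        start.countDown();
        rm1.join();
        rm2.join();
        System.out.println("node exists: " + store.exists("/confstore/CONF_STORE"));
    }
}
```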
[jira] [Updated] (YARN-10789) RM HA startup can fail due to race conditions in ZKConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tarun Parimi updated YARN-10789:
    Attachment: YARN-10789.branch-3.2.001.patch
[jira] [Updated] (YARN-10840) yarn app status fails with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/YARN-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prabhu Joseph updated YARN-10840:
    Attachment: YARN-10840-001.patch

> yarn app status fails with ArrayIndexOutOfBoundsException
>
> Key: YARN-10840
> URL: https://issues.apache.org/jira/browse/YARN-10840
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn-native-services
> Reporter: Abhinaba Sarkar
> Assignee: Abhinaba Sarkar
> Priority: Major
> Attachments: YARN-10840-001.patch
>
> Array index out of bounds exception in the ClientAMService.getStatus() -
> {code:java}
> 2021-07-04 20:00:24,488 [IPC Server handler 0 on 25347] INFO ipc.Server - IPC Server handler 0 on 25347, call Call#163 Retry#0 org.apache.hadoop.yarn.service.ClientAMProtocol.getStatus from 10.0.0.10:42446
> org.codehaus.jackson.map.JsonMappingException: Index: 11, Size: 11 (through reference chain: org.apache.hadoop.yarn.service.api.records.Service["components"]->java.util.ArrayList[0]->org.apache.hadoop.yarn.service.api.records.Component["containers"]->java.util.ArrayList[11])
>   at org.codehaus.jackson.map.JsonMappingException.wrapWithPath(JsonMappingException.java:218)
>   at org.codehaus.jackson.map.JsonMappingException.wrapWithPath(JsonMappingException.java:197)
>   at org.codehaus.jackson.map.ser.std.SerializerBase.wrapAndThrow(SerializerBase.java:166)
>   at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:127)
>   at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:71)
>   at org.codehaus.jackson.map.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:86)
>   at org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446)
>   at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150)
>   at org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112)
>   at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:122)
>   at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:71)
>   at org.codehaus.jackson.map.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:86)
>   at org.codehaus.jackson.map.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:446)
>   at org.codehaus.jackson.map.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:150)
>   at org.codehaus.jackson.map.ser.BeanSerializer.serialize(BeanSerializer.java:112)
>   at org.codehaus.jackson.map.ser.StdSerializerProvider._serializeValue(StdSerializerProvider.java:610)
>   at org.codehaus.jackson.map.ser.StdSerializerProvider.serializeValue(StdSerializerProvider.java:256)
>   at org.codehaus.jackson.map.ObjectMapper._configAndWriteValue(ObjectMapper.java:2575)
>   at org.codehaus.jackson.map.ObjectMapper.writeValueAsString(ObjectMapper.java:2097)
>   at org.apache.hadoop.yarn.service.utils.JsonSerDeser.toJson(JsonSerDeser.java:249)
>   at org.apache.hadoop.yarn.service.ClientAMService.getStatus(ClientAMService.java:125)
>   at org.apache.hadoop.yarn.service.impl.pb.service.ClientAMProtocolPBServiceImpl.getStatus(ClientAMProtocolPBServiceImpl.java:59)
>   at org.apache.hadoop.yarn.proto.ClientAMProtocol$ClientAMProtocolService$2.callBlockingMethod(ClientAMProtocol.java:6159)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 11, Size: 11
>   at java.util.ArrayList.rangeCheck(ArrayList.java:659)
>   at java.util.ArrayList.get(ArrayList.java:435)
>   at org.codehaus.jackson.map.ser.std.StdContainerSerializers$IndexedListSerializer.serializeContents(StdContainerSerializers.java:106)
>   ... 27 more
> {code}
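The report does not state a root cause. One reading that is consistent with "Index: 11, Size: 11" is that the containers list shrank between the moment the serializer read its size and the moment it fetched element 11, for example because another thread removed a container. Under that assumption, a common defence is to serialize a snapshot taken under the same lock that guards mutation. The sketch below uses hypothetical names (`Component`, `snapshot`, `toJson`) and a hand-written index-based serializer in place of Jackson; it is not the content of the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SnapshotBeforeSerialize {

    /** Hypothetical holder for a component's live container list. */
    static final class Component {
        private final List<String> containers = new ArrayList<>();

        synchronized void add(String id) {
            containers.add(id);
        }

        synchronized void remove(String id) {
            containers.remove(id);
        }

        /** Copy taken under the same lock that guards add and remove. */
        synchronized List<String> snapshot() {
            return Collections.unmodifiableList(new ArrayList<>(containers));
        }
    }

    /** Stand-in for a serializer that reads the size once, then reads by index. */
    static String toJson(List<String> items) {
        StringBuilder sb = new StringBuilder("[");
        int len = items.size();
        for (int i = 0; i < len; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(items.get(i)).append('"');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        Component component = new Component();
        for (int i = 0; i < 12; i++) {
            component.add("container-" + i);
        }
        List<String> snap = component.snapshot();
        component.remove("container-11"); // later mutation does not touch the snapshot
        System.out.println(toJson(snap));
    }
}
```

The snapshot costs one list copy per status call, which is usually acceptable for a status endpoint; whether it fits ClientAMService is a judgment for the maintainers.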
[jira] [Updated] (YARN-10840) yarn app status fails with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/YARN-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prabhu Joseph updated YARN-10840:
    Attachment: (was: YARN-10840-001.patch)