[ https://issues.apache.org/jira/browse/HDFS-16732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17636765#comment-17636765 ]
Erik Krogen commented on HDFS-16732:
------------------------------------

Note that a bug with this change was reported, and now fixed, in HDFS-16832.

> [SBN READ] Avoid get location from observer when the block report is delayed.
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-16732
>                 URL: https://issues.apache.org/jira/browse/HDFS-16732
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>   Affects Versions: 3.2.1
>            Reporter: zhengchenyu
>            Assignee: zhengchenyu
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.5
>
>
> Hive on Tez applications fail occasionally after the observer is enabled.
> The log shows the following:
> {code:java}
> 2022-08-18 15:22:06,914 [ERROR] [Dispatcher thread {Central}] |impl.VertexImpl|: Vertex Input: namenodeinfo_stg initializer failed, vertex=vertex_1660618571916_4839_1_00 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException: java.lang.ArrayIndexOutOfBoundsException: 0
>   at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallback.onFailure(RootInputInitializerManager.java:329)
>   at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)
>   at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
>   at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
>   at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
>   at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
>   at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.afterRanInterruptibly(TrustedListenableFutureTask.java:133)
>   at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:80)
>   at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>   at org.apache.hadoop.mapred.FileInputFormat.identifyHosts(FileInputFormat.java:748)
>   at org.apache.hadoop.mapred.FileInputFormat.getSplitHostsAndCachedHosts(FileInputFormat.java:714)
>   at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:378)
>   at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
>   at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
>   at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:159)
>   at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:279)
>   at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:270)
>   at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:254)
>   at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>   at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
>   ... 4 more
> {code}
> As described in MAPREDUCE-7082, this exception is thrown when a block is
> missing, but my cluster had no missing blocks.
> In this example, I found that getListing returns location information. When
> the observer's block report is delayed, it returns blocks without locations.
> HDFS-13924 was introduced to solve this problem, but it only considers
> getBlockLocations.
> On the observer node, every method that may return locations should check
> whether the locations are empty.
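
The check proposed in the issue description can be illustrated with a minimal sketch. It does not reproduce the actual patch: `BlockWithLocations`, `RetryOnActiveException`, and `checkOnObserver` are hypothetical stand-ins invented for this example, not Hadoop classes. The assumption is that any observer-side call returning block locations scans the result and, if a block has no locations yet, signals the caller to retry elsewhere instead of returning the incomplete answer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch only: simplified stand-ins, not the real HDFS types. */
class ObserverLocationCheckSketch {

  /** Hypothetical stand-in for a block and the datanode hosts that hold it. */
  static final class BlockWithLocations {
    final long blockId;
    final List<String> hosts;

    BlockWithLocations(long blockId, List<String> hosts) {
      this.blockId = blockId;
      this.hosts = hosts;
    }
  }

  /** Hypothetical signal asking the caller to retry on another NameNode. */
  static final class RetryOnActiveException extends RuntimeException {
    RetryOnActiveException(String message) {
      super(message);
    }
  }

  /** Returns true if any block in the result has no known location. */
  static boolean hasBlockWithoutLocation(List<BlockWithLocations> blocks) {
    for (BlockWithLocations block : blocks) {
      if (block.hosts == null || block.hosts.isEmpty()) {
        return true;
      }
    }
    return false;
  }

  /**
   * Applies the same check to any call that returns locations
   * (for example getListing as well as getBlockLocations).
   */
  static List<BlockWithLocations> checkOnObserver(
      String method, List<BlockWithLocations> blocks) {
    if (hasBlockWithoutLocation(blocks)) {
      throw new RetryOnActiveException(
          method + " returned a block without locations");
    }
    return blocks;
  }

  public static void main(String[] args) {
    List<BlockWithLocations> delayed = Arrays.asList(
        new BlockWithLocations(1L, Arrays.asList("dn1", "dn2")),
        new BlockWithLocations(2L, Collections.<String>emptyList()));
    try {
      checkOnObserver("getListing", delayed);
    } catch (RetryOnActiveException e) {
      System.out.println("Retry needed: " + e.getMessage());
    }
  }
}
```

Routing every location-returning call through one shared check, rather than guarding getBlockLocations alone, is the point of the sketch: an empty host list never reaches code such as FileInputFormat.identifyHosts, which is where the trace above fails.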