[ https://issues.apache.org/jira/browse/HDDS-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Istvan Fajth closed HDDS-2203.
------------------------------

> Race condition in ByteStringHelper.init()
> -----------------------------------------
>
>                 Key: HDDS-2203
>                 URL: https://issues.apache.org/jira/browse/HDDS-2203
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client, SCM
>            Reporter: Istvan Fajth
>            Assignee: Istvan Fajth
>            Priority: Critical
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The current init method:
> {code}
> public static void init(boolean isUnsafeByteOperation) {
>   final boolean set = INITIALIZED.compareAndSet(false, true);
>   if (set) {
>     ByteStringHelper.isUnsafeByteOperationsEnabled =
>         isUnsafeByteOperation;
>   } else {
>     // already initialized, check values
>     Preconditions.checkState(isUnsafeByteOperationsEnabled
>         == isUnsafeByteOperation);
>   }
> }
> {code}
> If two threads call this method and execution is interleaved as follows,
> the second thread runs into an exception from Preconditions.checkState()
> in the else branch.
> Starting from an uninitialized state:
> - T1 arrives at the method with true as the parameter; at this point the
> class has initialised isUnsafeByteOperationsEnabled to false
> - T1 sets INITIALIZED to true
> - T2 arrives at the method with true as the parameter
> - T2 reads the INITIALIZED value and, as it is not false, goes to the else
> branch
> - T2 checks whether the internal boolean property equals the true it wanted
> to set, and as T1 has yet to set the value, checkState throws an
> IllegalStateException.
> This happens with certain Hive queries; it was found during Hive testing,
> where the exception we see is the following:
> {code}
> Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
> Vertex failed, vertexName=Map 2, vertexId=vertex_1569486223160_0334_1_02, diagnostics=[Vertex vertex_1569486223160_0334_1_02 [Map 2] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed, vertex=vertex_1569486223160_0334_1_02 [Map 2], java.io.IOException: Couldn't create RpcClient protocol
>       at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:263)
>       at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:239)
>       at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:203)
>       at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:165)
>       at org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.<init>(BasicOzoneClientAdapterImpl.java:158)
>       at org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.<init>(OzoneClientAdapterImpl.java:50)
>       at org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:102)
>       at org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:155)
>       at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
>       at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
>       at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
>       at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
>       at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
>       at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1821)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:2002)
>       at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
>       at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781)
>       at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
>       at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
>       at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>       at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
>       at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
>       at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>       at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>       at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException
>       at org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>       at org.apache.hadoop.hdds.scm.ByteStringHelper.init(ByteStringHelper.java:47)
>       at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:241)
>       at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:256)
>       ... 31 more
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
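
The window described in the report exists because "initialized" and the flag value are written in two separate steps. One way to close it is to publish both in a single atomic write. The following is a minimal sketch under that assumption, not the fix applied for HDDS-2203: the class name UnsafeByteOperationsFlag and the isEnabled() accessor are hypothetical, and it throws IllegalStateException directly instead of going through Guava's Preconditions so that it has no dependencies.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical stand-in for the init logic of ByteStringHelper.
 * A null reference means "not initialized yet", so the flag value and the
 * initialized state become visible to other threads in one atomic write.
 */
final class UnsafeByteOperationsFlag {
  private static final AtomicReference<Boolean> ENABLED =
      new AtomicReference<>();

  private UnsafeByteOperationsFlag() { }

  static void init(boolean isUnsafeByteOperation) {
    // Only the first caller wins the compareAndSet from null.
    if (!ENABLED.compareAndSet(null, isUnsafeByteOperation)) {
      // A losing caller always observes a fully published value here,
      // because there is no intermediate "initialized but unset" state.
      boolean current = ENABLED.get();
      if (current != isUnsafeByteOperation) {
        throw new IllegalStateException(
            "Already initialized with " + current);
      }
    }
  }

  static boolean isEnabled() {
    Boolean value = ENABLED.get();
    // Treat "never initialized" as disabled in this sketch.
    return value != null && value;
  }

  public static void main(String[] args) throws InterruptedException {
    // Two threads racing with the same argument, as in the report.
    Thread t1 = new Thread(() -> init(true));
    Thread t2 = new Thread(() -> init(true));
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    System.out.println("unsafe byte operations enabled: " + isEnabled());
  }
}
```

With this shape, T2 in the scenario above either wins the compareAndSet itself or reads the value T1 already stored, so two callers that agree on the argument cannot fail the check. A caller that passes a conflicting value still gets an IllegalStateException, which keeps the intent of the original check.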