[ https://issues.apache.org/jira/browse/HDDS-2359?focusedWorklogId=333473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333473 ]
ASF GitHub Bot logged work on HDDS-2359:
----------------------------------------
                Author: ASF GitHub Bot
            Created on: 24/Oct/19 14:42
            Start Date: 24/Oct/19 14:42
    Worklog Time Spent: 10m
      Work Description: bshashikant commented on pull request #82: HDDS-2359. Seeking randomly in a key with more than 2 blocks of data leads to inconsistent reads
URL: https://github.com/apache/hadoop-ozone/pull/82

## What changes were proposed in this pull request?

The issue shows up when a client seeks to an offset and reads, then seeks to a different offset and reads again, with the two reads covering an overlapping set of chunks. After a seek, the chunkPosition inside each BlockInputStream is not reset. The chunk containing the seek offset is read correctly, but every subsequent chunk still carries the position left over from the earlier read, so the length to be read for it is computed as 0 and every read from those chunks returns 0 bytes. The fix is to reset the position of all chunks following the seek target, in the current and all subsequent blocks, to 0, so that each of them is read from its beginning.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-2359

## How was this patch tested?

The patch was tested by adding unit tests that reliably reproduce the issue. It was also deployed on the real cluster where the issue was first discovered, and the fix was verified there. Thanks @fapifta for discovering the issue and for helping to verify the fix. Thanks @bharatviswa504 and @hanishakoneru for their contributions to the fix.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
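[Editor's note] The failure mode and the fix described in the PR comment can be illustrated with a small self-contained model. This is only a sketch: the names below (`SeekResetSketch`, `ChunkStream`, `chunkIdx`, `resetSubsequent`) are hypothetical stand-ins for Ozone's KeyInputStream/BlockInputStream/ChunkInputStream layering, not the actual code changed by the patch.

```java
public class SeekResetSketch {
    // One stream per chunk; each keeps its own read position (the analogue
    // of chunkPosition in the real per-chunk stream).
    static class ChunkStream {
        final byte[] data;
        int pos;
        ChunkStream(byte[] d) { data = d; }
        int read(byte[] buf, int off, int len) {
            int n = Math.min(len, data.length - pos);
            if (n <= 0) return 0;          // a stale pos at end-of-chunk yields 0 bytes
            System.arraycopy(data, pos, buf, off, n);
            pos += n;
            return n;
        }
    }

    final ChunkStream[] chunks;
    int chunkIdx;                          // chunk the current overall offset falls in

    SeekResetSketch(int chunkSize, int numChunks) {
        chunks = new ChunkStream[numChunks];
        byte v = 0;
        for (int i = 0; i < numChunks; i++) {
            byte[] d = new byte[chunkSize];
            for (int j = 0; j < chunkSize; j++) d[j] = v++;
            chunks[i] = new ChunkStream(d);
        }
    }

    // Seek positions the target chunk. resetSubsequent=false models the bug
    // (later chunks keep stale positions); resetSubsequent=true models the fix.
    void seek(long offset, boolean resetSubsequent) {
        int size = chunks[0].data.length;
        chunkIdx = (int) (offset / size);
        chunks[chunkIdx].pos = (int) (offset % size);
        if (resetSubsequent) {
            for (int i = chunkIdx + 1; i < chunks.length; i++) chunks[i].pos = 0;
        }
    }

    // Fill buf, moving to the next chunk whenever the current one yields 0 bytes.
    int read(byte[] buf) {
        int off = 0;
        while (off < buf.length && chunkIdx < chunks.length) {
            int n = chunks[chunkIdx].read(buf, off, buf.length - off);
            if (n == 0) { chunkIdx++; continue; } // stale positions make whole chunks vanish here
            off += n;
        }
        return off;
    }

    public static void main(String[] args) {
        // 3 chunks of 4 bytes; read bytes 2..9 first, then seek back to 0.
        SeekResetSketch broken = new SeekResetSketch(4, 3);
        broken.seek(2, true);
        broken.read(new byte[8]);           // advances the positions of chunks 0, 1 and 2
        broken.seek(0, false);              // bug: later chunks keep their stale positions
        int n1 = broken.read(new byte[12]); // short read: chunk 1 is skipped entirely

        SeekResetSketch fixed = new SeekResetSketch(4, 3);
        fixed.seek(2, true);
        fixed.read(new byte[8]);
        fixed.seek(0, true);                // fix: reset positions of all later chunks
        int n2 = fixed.read(new byte[12]);

        System.out.println("without reset: " + n1 + " bytes, with reset: " + n2 + " bytes");
        // → without reset: 6 bytes, with reset: 12 bytes
    }
}
```

The short read in the broken case is what surfaces as the `Inconsistent read for blockID=... length=... numBytesRead=...` IOException in the quoted stack trace, where KeyInputStream observes fewer bytes read than the requested length.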
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 333473)
    Remaining Estimate: 0h
    Time Spent: 10m

> Seeking randomly in a key with more than 2 blocks of data leads to
> inconsistent reads
> -------------------------------------------------------------------------------------
>
>                 Key: HDDS-2359
>                 URL: https://issues.apache.org/jira/browse/HDDS-2359
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Istvan Fajth
>            Assignee: Shashikant Banerjee
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Hive testing we found the following exception:
> {code}
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : attempt_1569246922012_0214_1_03_000000_3:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: error iterating
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> 	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> 	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> 	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> 	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
> 	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
> 	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: error iterating
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
> 	... 16 more
> Caused by: java.io.IOException: java.io.IOException: error iterating
> 	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> 	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> 	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:366)
> 	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> 	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> 	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> 	at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> 	at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> 	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> 	... 18 more
> Caused by: java.io.IOException: error iterating
> 	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:835)
> 	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:74)
> 	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:361)
> 	... 24 more
> Caused by: java.io.IOException: Error reading file: o3fs://hive.warehouse.vc0136.halxg.cloudera.com:9862/data/inventory/delta_0000001_0000001_0000/bucket_00000
> 	at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1283)
> 	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:156)
> 	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader$1.next(VectorizedOrcAcidRowBatchReader.java:150)
> 	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader$1.next(VectorizedOrcAcidRowBatchReader.java:146)
> 	at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:831)
> 	... 26 more
> Caused by: java.io.IOException: Inconsistent read for blockID=conID: 2 locID: 102851451236759576 bcsId: 14608 length=26398272 numBytesRead=6084153
> 	at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:176)
> 	at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:52)
> 	at org.apache.hadoop.fs.FSInputStream.read(FSInputStream.java:75)
> 	at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
> 	at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> 	at org.apache.orc.impl.RecordReaderUtils.readDiskRanges(RecordReaderUtils.java:557)
> 	at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.readFileData(RecordReaderUtils.java:276)
> 	at org.apache.orc.impl.RecordReaderImpl.readPartialDataStreams(RecordReaderImpl.java:1189)
> 	at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1057)
> 	at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1208)
> 	at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1243)
> 	at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1279)
> 	... 30 more
> {code}
> Evaluating the code path, the following is the issue:
> given a file with more data than 2 blocks
> when there are random seeks in the file to the end then to the beginning
> then the read fails with the final cause of the exception above.
> [~shashikant] has a solution already for this issue, which we have successfully tested internally with Hive; I am assigning this JIRA to him to post the PR.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
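[Editor's note] The given/when/then reproduction condition quoted in the issue (seek toward the end of a key larger than two blocks, then back to the beginning, then read) can be exercised in miniature against a local file with plain java.nio. This is only an illustration of the access pattern; the real reproduction needs an o3fs key spanning more than two blocks, where the buggy client returned a short read on the second pass. The class name and file sizes here are made up for the demo.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SeekPatternDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for a multi-block key: 1 MiB of deterministic bytes.
        Path p = Files.createTempFile("hdds2359", ".bin");
        byte[] data = new byte[1 << 20];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        Files.write(p, data);

        try (SeekableByteChannel ch = Files.newByteChannel(p, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(4096);
            ch.position(data.length - 4096);   // seek near the end and read...
            while (buf.hasRemaining() && ch.read(buf) >= 0) { /* fill */ }

            buf.clear();
            ch.position(0);                    // ...then seek back to the beginning
            int n = 0;                         // with the Ozone bug, this second pass came up short
            while (buf.hasRemaining()) {
                int r = ch.read(buf);
                if (r < 0) break;
                n += r;
            }
            System.out.println("read " + n + " bytes after seeking back");
        } finally {
            Files.deleteIfExists(p);
        }
    }
}
```

On a correct implementation the second read returns the full requested length; the bug made the equivalent o3fs read return fewer bytes than requested, which KeyInputStream reported as the "Inconsistent read" IOException quoted above.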