[ https://issues.apache.org/jira/browse/HDDS-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharat Viswanadham updated HDDS-1121:
-------------------------------------
    Description: 
When Hive ingests data into Ozone using multiple threads, reads performed
after ingestion completes fail with the error below.

This issue was found by [~t3rmin4t0r] during Hive testing.
{code:java}
caused by: org.apache.hadoop.ozone.common.OzoneChecksumException: Checksum mismatch at index 0
 at org.apache.hadoop.ozone.common.ChecksumData.verifyChecksumDataMatches(ChecksumData.java:143)
 at org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:239)
 at org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:217)
 at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readChunkFromContainer(BlockInputStream.java:227)
 at org.apache.hadoop.hdds.scm.storage.BlockInputStream.seek(BlockInputStream.java:259)
 at org.apache.hadoop.ozone.client.io.KeyInputStream$ChunkInputStreamEntry.seek(KeyInputStream.java:249)
 at org.apache.hadoop.ozone.client.io.KeyInputStream.seek(KeyInputStream.java:180)
 at org.apache.hadoop.fs.ozone.OzoneFSInputStream.seek(OzoneFSInputStream.java:62)
 at org.apache.hadoop.fs.FSInputStream.read(FSInputStream.java:82)
 at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
 at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:111)
 at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:555)
 at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:370)
 at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:61)
 at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:105)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:1647)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.callInternal(OrcInputFormat.java:1533)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.access$2700(OrcInputFormat.java:1329)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1513)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1510)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1510)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1329)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
{code}

  was:
When Hive ingests data into Ozone using multiple threads, reads performed
after ingestion completes fail with the error below.

This issue was found during Hive testing.
{code:java}
caused by: org.apache.hadoop.ozone.common.OzoneChecksumException: Checksum mismatch at index 0
 at org.apache.hadoop.ozone.common.ChecksumData.verifyChecksumDataMatches(ChecksumData.java:143)
 at org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:239)
 at org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:217)
 at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readChunkFromContainer(BlockInputStream.java:227)
 at org.apache.hadoop.hdds.scm.storage.BlockInputStream.seek(BlockInputStream.java:259)
 at org.apache.hadoop.ozone.client.io.KeyInputStream$ChunkInputStreamEntry.seek(KeyInputStream.java:249)
 at org.apache.hadoop.ozone.client.io.KeyInputStream.seek(KeyInputStream.java:180)
 at org.apache.hadoop.fs.ozone.OzoneFSInputStream.seek(OzoneFSInputStream.java:62)
 at org.apache.hadoop.fs.FSInputStream.read(FSInputStream.java:82)
 at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
 at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:111)
 at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:555)
 at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:370)
 at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:61)
 at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:105)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:1647)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.callInternal(OrcInputFormat.java:1533)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.access$2700(OrcInputFormat.java:1329)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1513)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1510)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1510)
 at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1329)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
{code}


> Key read failure when data is written parallel in to Ozone
> ----------------------------------------------------------
>
>                 Key: HDDS-1121
>                 URL: https://issues.apache.org/jira/browse/HDDS-1121
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>         Attachments: HDDS-1121.00.patch
>
>
> When Hive ingests data into Ozone using multiple threads, reads performed
> after ingestion completes fail with the error below.
> This issue was found by [~t3rmin4t0r] during Hive testing.
> {code:java}
> caused by: org.apache.hadoop.ozone.common.OzoneChecksumException: Checksum mismatch at index 0
>  at org.apache.hadoop.ozone.common.ChecksumData.verifyChecksumDataMatches(ChecksumData.java:143)
>  at org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:239)
>  at org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:217)
>  at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readChunkFromContainer(BlockInputStream.java:227)
>  at org.apache.hadoop.hdds.scm.storage.BlockInputStream.seek(BlockInputStream.java:259)
>  at org.apache.hadoop.ozone.client.io.KeyInputStream$ChunkInputStreamEntry.seek(KeyInputStream.java:249)
>  at org.apache.hadoop.ozone.client.io.KeyInputStream.seek(KeyInputStream.java:180)
>  at org.apache.hadoop.fs.ozone.OzoneFSInputStream.seek(OzoneFSInputStream.java:62)
>  at org.apache.hadoop.fs.FSInputStream.read(FSInputStream.java:82)
>  at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
>  at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:111)
>  at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:555)
>  at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:370)
>  at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:61)
>  at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:105)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:1647)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.callInternal(OrcInputFormat.java:1533)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.access$2700(OrcInputFormat.java:1329)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1513)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1510)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1510)
>  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1329)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}
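
For readers unfamiliar with this kind of error, here is a minimal sketch of how a per-slice checksum check can report a mismatch "at index 0". It is not Ozone's implementation, and it makes no claim about the root cause of this issue, which the report does not state. The class name ChunkChecksumSketch, the methods computeChecksums and firstMismatch, and the bytesPerChecksum parameter are hypothetical names chosen for illustration; only java.util.zip.CRC32 is a real API.

```java
import java.util.zip.CRC32;

// Illustrative only: a generic per-slice checksum scheme, NOT Ozone's code.
public class ChunkChecksumSketch {

    // Compute one CRC32 value for every slice of bytesPerChecksum bytes.
    static long[] computeChecksums(byte[] data, int bytesPerChecksum) {
        int slices = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        long[] sums = new long[slices];
        for (int i = 0; i < slices; i++) {
            int offset = i * bytesPerChecksum;
            int length = Math.min(bytesPerChecksum, data.length - offset);
            CRC32 crc = new CRC32();
            crc.update(data, offset, length);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Return the index of the first slice whose recomputed checksum differs
    // from the stored one, or -1 if every slice matches.
    static int firstMismatch(byte[] data, long[] stored, int bytesPerChecksum) {
        long[] actual = computeChecksums(data, bytesPerChecksum);
        if (actual.length != stored.length) {
            throw new IllegalArgumentException("checksum count differs");
        }
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != stored[i]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] written = "hello ozone chunk".getBytes();
        long[] stored = computeChecksums(written, 4);

        // Simulate data that no longer matches the checksums stored at write time.
        byte[] readBack = written.clone();
        readBack[0] ^= 0x01;

        int index = firstMismatch(readBack, stored, 4);
        System.out.println("Checksum mismatch at index " + index);
    }
}
```

In a scheme like this, a mismatch at index 0 means the very first slice of the data read back disagrees with the checksum recorded when it was written.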



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)