[ https://issues.apache.org/jira/browse/HDFS-13642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499852#comment-16499852 ]
SammiChen commented on HDFS-13642:
----------------------------------

Some comments:

1. *private static final int BLOCK_SIZE = 1 << 20; // 16k*
The comment should say 1MB, not 16k.

2. {quote}
if (!shouldReplicate) {
  final ErasureCodingPolicy ecPolicy = FSDirErasureCodingOp
      .getErasureCodingPolicy(this, ecPolicyName, iip);
  if (ecPolicy != null && (!ecPolicy.isReplicationPolicy())) {
    if (blockSize < ecPolicy.getCellSize()) {
      throw new IOException("Specified block size " + blockSize
          + " is less than the cell size (" + ecPolicy.getCellSize()
          + ") of the erasure coding policy on this file.");
    }
  }
}
{quote}
When creating a normal 3-replica file, {{shouldReplicate}} is false; it is true only when the user sets {{CreateFlag.SHOULD_REPLICATE}} explicitly when calling the create API. One suggestion is to add the block size vs. cell size comparison as the else branch of
{quote}
if (shouldReplicate ||
    (org.apache.commons.lang.StringUtils.isEmpty(ecPolicyName)
        && !FSDirErasureCodingOp.hasErasureCodingPolicy(this, iip))) {
  blockManager.verifyReplication(src, replication, clientMachine);
}
{quote}
Thanks for working on it, [~xiaochen].
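The suggested restructuring can be sketched as a minimal, self-contained example. This is not the actual FSNamesystem code: the {{dirHasEcPolicy}} parameter and the {{BlockSizeValidator}} class are hypothetical stand-ins for the {{FSDirErasureCodingOp}}/{{iip}} lookups, and the cell size is hard-coded to the 1 MB of RS-3-2-1024k.

```java
import java.io.IOException;

// Sketch of the suggested control flow: files that end up 3-replica take the
// replication-verification branch; everything else is an EC file, so the
// requested block size is checked against the policy's cell size in the else.
public class BlockSizeValidator {
    static final int CELL_SIZE = 1 << 20; // 1 MB cell, as in RS-3-2-1024k

    static void validate(boolean shouldReplicate, String ecPolicyName,
                         boolean dirHasEcPolicy, long blockSize)
            throws IOException {
        if (shouldReplicate || (isEmpty(ecPolicyName) && !dirHasEcPolicy)) {
            // Plain replicated file: only the replication factor would need
            // verification here (blockManager.verifyReplication in HDFS).
        } else {
            // EC file: a block smaller than one cell cannot hold a full cell,
            // so reject the create up front instead of failing in the streamer.
            if (blockSize < CELL_SIZE) {
                throw new IOException("Specified block size " + blockSize
                    + " is less than the cell size (" + CELL_SIZE
                    + ") of the erasure coding policy on this file.");
            }
        }
    }

    private static boolean isEmpty(String s) {
        return s == null || s.isEmpty();
    }
}
```

With this shape, the check is reached for every create that resolves to an EC file, not only when the caller passes an explicit EC policy name.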
> Creating a file with block size smaller than EC policy's cell size should
> throw
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-13642
>                 URL: https://issues.apache.org/jira/browse/HDFS-13642
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>    Affects Versions: 3.0.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Major
>         Attachments: HDFS-13642.01.patch, HDFS-13642.02.patch, editsStored
>
>
> The following command causes an exception:
> {noformat}
> hadoop fs -Ddfs.block.size=349696 -put -f lineitem_sixblocks.parquet /test-warehouse/tmp123ec
> {noformat}
> {noformat}
> 18/05/25 16:00:59 WARN hdfs.DataStreamer: DataStreamer Exception
> java.io.IOException: BlockSize 349696 < lastByteOffsetInBlock, #0: blk_-9223372036854574256_14634, packet seqno: 7 offsetInBlock: 349696 lastPacketInBlock: false lastByteOffsetInBlock: 350208
>         at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:729)
>         at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:46)
> 18/05/25 16:00:59 WARN hdfs.DFSOutputStream: Failed: offset=4096, length=512, DFSStripedOutputStream:#0: failed, blk_-9223372036854574256_14634
> java.io.IOException: BlockSize 349696 < lastByteOffsetInBlock, #0: blk_-9223372036854574256_14634, packet seqno: 7 offsetInBlock: 349696 lastPacketInBlock: false lastByteOffsetInBlock: 350208
>         at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:729)
>         at org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:46)
> {noformat}
> Then the streamer is confused and hangs.
> The local file is under 6MB; the hdfs file has a RS-3-2-1024k EC policy.
>
> Credit to [~tarasbob] for reporting this issue.
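The numbers in the log above line up as follows: with dfs.block.size=349696 and 512-byte packet chunks, the packet at seqno 7 starts at offset 349696 and ends at 350208, past the end of the block, while the block size itself is below the 1 MB cell size of RS-3-2-1024k. A tiny self-contained check of that arithmetic (values copied from the log):

```java
// Arithmetic behind the reported IOException; all values come from the log.
public class StreamerOffsets {
    public static void main(String[] args) {
        long blockSize = 349696L;               // dfs.block.size from the command
        long packetLen = 512L;                  // length reported by the streamer
        long offsetInBlock = 349696L;           // packet seqno 7
        long lastByteOffsetInBlock = offsetInBlock + packetLen;
        System.out.println(lastByteOffsetInBlock);             // 350208
        System.out.println(blockSize < lastByteOffsetInBlock); // true -> IOException
        System.out.println(blockSize < (1L << 20));            // true: below cell size
    }
}
```

Since the block size can never hold even one full cell, rejecting the create at the NameNode is the only place this can fail cleanly.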
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org