[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292864#comment-14292864 ]
Jing Zhao commented on HDFS-7339:
---------------------------------

bq. Change the hash function so that consecutive IDs will be mapped to the same hash value and implement BlockGroup.equal(..) so that it returns true with any block id in the group.

Had an offline discussion with [~szetszwo] about this just now. This new hash function will cause extra scanning within each bucket, since every 16 contiguous blocks will be mapped to the same bucket. Currently, for a large cluster, the blocksMap can contain several million buckets, which is on the same scale as the total number of blocks. Thus the current implementation does not do much bucket scanning in the normal case. Therefore I guess we may need to revisit this optimization and maybe do a simple benchmark of it.

Back to this jira, maybe we should consider providing a relatively simple implementation first and doing the optimization in a separate jira. Either using only blocksMap or allocating an extra blockgroupsMap looks fine to me. Maybe we should also schedule an offline discussion sometime this week.

> Allocating and persisting block groups in NameNode
> --------------------------------------------------
>
>                 Key: HDFS-7339
>                 URL: https://issues.apache.org/jira/browse/HDFS-7339
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch,
> HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch,
> HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of _block group_; they
> are formed in initial encoding and looked up in recoveries and conversions. A
> lightweight class {{BlockGroup}} is created to record the original and parity
> blocks in a coding group, as well as a pointer to the codec schema (pluggable
> codec schemas will be supported in HDFS-7337). With the striping layout, the
> HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently.
> Therefore we propose to extend a file’s inode to switch between _contiguous_
> and _striping_ modes, with the current mode recorded in a binary flag. An
> array of BlockGroups (or BlockGroup IDs) is added, which remains empty for
> “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new
> {{ECManager}} component; the attached figure has an illustration of the
> architecture. As a simple example, when a {_Striping+EC_} file is created and
> written to, it will serve requests from the client to allocate new
> {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase,
> {{BlockGroups}} are allocated both in initial online encoding and in the
> conversion from replication to EC. {{ECManager}} also facilitates the lookup
> of {{BlockGroup}} information for block recovery work.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
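
The bucket-scan concern raised in the comment above can be illustrated with a small, self-contained Java sketch. This is not the blocksMap code and not the proposed patch. Every name in it (GroupHashSketch, perBlockHash, groupHash, sameGroup, longestChain) is hypothetical. It assumes groups of 16 consecutive IDs, as mentioned in the comment, and that each block is still stored as its own entry in a chained hash table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from HDFS.
class GroupHashSketch {
    // Assumption: 16 consecutive block IDs form one group (2^4 = 16).
    static final int GROUP_BITS = 4;

    /** Per-block hash: every ID is hashed on its own. */
    static int perBlockHash(long blockId) {
        return Long.hashCode(blockId);
    }

    /** Group hash: drop the low 4 bits first, so the 16 IDs of a group share one hash value. */
    static int groupHash(long blockId) {
        return Long.hashCode(blockId >>> GROUP_BITS);
    }

    /** Group-style equality: true for any two IDs that fall in the same 16-ID range. */
    static boolean sameGroup(long a, long b) {
        return (a >>> GROUP_BITS) == (b >>> GROUP_BITS);
    }

    /** Longest bucket chain when `count` consecutive IDs are spread over `numBuckets` buckets. */
    static int longestChain(long firstId, int count, int numBuckets, boolean useGroupHash) {
        Map<Integer, Integer> chainLengths = new HashMap<>();
        int longest = 0;
        for (long id = firstId; id < firstId + count; id++) {
            int h = useGroupHash ? groupHash(id) : perBlockHash(id);
            int bucket = Math.floorMod(h, numBuckets);
            int len = chainLengths.merge(bucket, 1, Integer::sum);
            longest = Math.max(longest, len);
        }
        return longest;
    }

    public static void main(String[] args) {
        // Toy setup: as many buckets as blocks, mirroring "same scale" in the comment.
        int blocks = 1024;
        int buckets = 1024;
        System.out.println("per-block hash, longest chain: "
                + longestChain(4096L, blocks, buckets, false));
        System.out.println("group hash, longest chain:     "
                + longestChain(4096L, blocks, buckets, true));
    }
}
```

With these toy numbers the per-block hash leaves one entry per bucket, while the group hash puts 16 entries in each occupied bucket. That is the extra scanning the comment suggests benchmarking. A real measurement would need the actual blocksMap structure and realistic block ID distributions.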