We had a great turnout today, thanks to Konstantin for leading the discussion of the NameNode Fine-Grained Locking proposal.
At least 16 participants joined the call. Today's summary can be found here:
https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit#

8/19/2019

We are moving the sync to 10AM US PDT!

NameNode Fine-Grained Locking via InMemory Namespace Partitioning

Attendees: Konstantin, Chen, Weichiu, Xiaoyu, Anu, Matt, pljeliazkov, Chao Sun, Clay, Bharat Viswanadham, Matt, Craig Condit, Matthew Sharp, skumpf, Artem Ervits, Mohammad J Khan, Nanda, Alex Moundalexis.

Konstantin led the discussion of HDFS-14703 <https://issues.apache.org/jira/browse/HDFS-14703>.

There are three important parts:
(1) Partition the namespace into multiple GSets, so that different parts of the namespace can be processed in parallel.
(2) INode Key
(3) Latch lock

How to support snapshots -> they should be able to get partitioned similarly.

Balancing partitions: there are several possible strategies.
Dynamic partitioning strategy: start with 1 partition and grow.
Static partitioning strategy -> no need for a higher-level navigation lock.

And: why does the design doc use static partitioning? Determining the size of partitions is hard. What about starting with 1024 partitions?

Hotspot problem

A related task, HDFS-14617 <https://issues.apache.org/jira/browse/HDFS-14617> (Improve fsimage load time by writing sub-sections to the fsimage index), writes multiple inode sections and inode directory sections, and loads the sections in parallel. It sounds like we can combine it with the fine-grained locking work and partition the inode/inode directory sections by the namespace partitions.

Anu: snapshots complicate the design. Renames. Copy on write?
Anu: suggests implementing this feature without snapshot support to simplify design and implementation.

Konstantin: will develop in a feature branch. Feel free to pick up jiras or share thoughts.

FoldedTreeSet, implemented in HDFS-9260 <https://issues.apache.org/jira/browse/HDFS-9260>, is relevant. It needs to be fixed or reverted before developing the namespace partitioning feature.
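
For readers who were not on the call, here is a rough sketch of the static partitioning idea from the notes above: the namespace is split into a fixed number of partitions, each with its own lock, so operations on different partitions do not block each other. This is only an illustration under several assumptions, not the HDFS-14703 design. The class name PartitionedNamespaceSketch, the use of a plain long as the inode key, the modulo mapping from key to partition, and a HashMap standing in for a GSet are all hypothetical, and the per-partition read/write lock is only a stand-in for the latch lock mentioned in the discussion.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch only; names and key scheme are assumptions,
// not taken from the HDFS-14703 design doc or the Hadoop code base.
class PartitionedNamespaceSketch {

    // One partition: a map standing in for a GSet, plus its own lock.
    private static final class Partition {
        final Map<Long, String> inodes = new HashMap<>();
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    }

    private final List<Partition> partitions = new ArrayList<>();

    PartitionedNamespaceSketch(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new Partition());
        }
    }

    // Static partitioning: the key alone decides the partition, so no
    // higher-level lock is needed to locate it. (Assumed modulo mapping.)
    int partitionIndex(long inodeKey) {
        return (int) Math.floorMod(inodeKey, (long) partitions.size());
    }

    // Writers lock only the partition that owns the key.
    void put(long inodeKey, String name) {
        Partition p = partitions.get(partitionIndex(inodeKey));
        p.lock.writeLock().lock();
        try {
            p.inodes.put(inodeKey, name);
        } finally {
            p.lock.writeLock().unlock();
        }
    }

    // Readers of one partition never wait on writers of another.
    String get(long inodeKey) {
        Partition p = partitions.get(partitionIndex(inodeKey));
        p.lock.readLock().lock();
        try {
            return p.inodes.get(inodeKey);
        } finally {
            p.lock.readLock().unlock();
        }
    }
}
```

With a fixed partition count, a skewed key distribution would concentrate load on a few partitions, which is the hotspot concern raised in the discussion; a dynamic strategy would additionally need some lock protecting the key-to-partition mapping while partitions split.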
On Mon, Aug 19, 2019 at 2:55 PM Wei-Chiu Chuang <weic...@cloudera.com> wrote:

> For this week,
> We will have Konstantin and the LinkedIn folks to discuss a recent project
> that's been baking for quite a while. This is an exciting project as it has
> the potential to improve NameNode's throughput by 40%.
>
> HDFS-14703 <https://issues.apache.org/jira/browse/HDFS-14703> NameNode
> Fine-Grained Locking
>
> Access instruction, and the past sync notes are available here:
> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit?usp=sharing
>
> Reminder: We have Bi-weekly Hadoop storage online sync every other
> Wednesday.
> If there are no objections, I'd like to move the time to 10AM US pacific
> time (GMT-8)
>