Re: Hadoop storage community online sync

Matt Foley Thu, 22 Aug 2019 11:11:20 -0700

+1 for publishing notes.  Thanks!

On Aug 21, 2019, at 4:16 PM, Aaron Fabbri <[email protected]> wrote:


Thank you Wei-Chiu for organizing this and sending out notes!

On Wed, Aug 21, 2019 at 1:10 PM Wei-Chiu Chuang <[email protected]> wrote:

> We had a great turnout today, thanks to Konstantin for leading the
> discussion of the NameNode Fine-Grained Locking proposal.
> 
> At least 16 participants joined the call.
> 
> Today's summary can be found here:
> 
> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit#
> 
> 8/19/2019
> 
> We are moving the sync to 10AM US PDT!
> 
> NameNode Fine-Grained Locking via InMemory Namespace Partitioning
> 
> Attendees:
> 
> Konstantin, Chen, Weichiu, Xiaoyu, Anu, Matt, pljeliazkov, Chao Sun, Clay,
> Bharat Viswanadham, Matt, Craig Condit, Matthew Sharp, skumpf, Artem
> Ervits, Mohammad J Khan, Nanda, Alex Moundalexis.
> 
> Konstantin led the discussion of HDFS-14703
> <https://issues.apache.org/jira/browse/HDFS-14703>.
> 
> There are three important parts:
> 
> (1) Partition the namespace into multiple GSets, so that different parts of
> the namespace can be processed in parallel.
> 
> (2) INode Key
> 
> (3) Latch lock
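
A rough illustration of the three parts listed above, not taken from the
design doc. Everything named here is hypothetical: PartitionedNamespace and
partitionOf are made-up names, a plain HashMap stands in for GSet, a long
stands in for the INode key, and a simple per-partition read/write lock
stands in for the latch lock (the real proposal may well differ).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch only; this is not the HDFS-14703 implementation. */
final class PartitionedNamespace {

    /** One partition: a map of inodes guarded by its own lock. */
    private static final class Partition {
        final Map<Long, String> inodes = new HashMap<>(); // stand-in for a GSet
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    }

    private final Partition[] partitions;

    public PartitionedNamespace(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        partitions = new Partition[numPartitions];
        for (int i = 0; i < numPartitions; i++) {
            partitions[i] = new Partition();
        }
    }

    /** Static strategy (assumed): a key always maps to the same partition. */
    public int partitionOf(long inodeKey) {
        return Math.floorMod(Long.hashCode(inodeKey), partitions.length);
    }

    /** Writers lock only the partition that owns the key. */
    public void put(long inodeKey, String name) {
        Partition p = partitions[partitionOf(inodeKey)];
        p.lock.writeLock().lock();
        try {
            p.inodes.put(inodeKey, name);
        } finally {
            p.lock.writeLock().unlock();
        }
    }

    /** Readers of different partitions never contend with each other. */
    public String get(long inodeKey) {
        Partition p = partitions[partitionOf(inodeKey)];
        p.lock.readLock().lock();
        try {
            return p.inodes.get(inodeKey);
        } finally {
            p.lock.readLock().unlock();
        }
    }
}
```

With new PartitionedNamespace(1024), operations on keys that land in
different partitions can proceed in parallel; operations that span
partitions (renames, for example) would need more than this sketch shows.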
> 
> How to support snapshots -> they should be able to be partitioned similarly.
> 
> Partition balancing strategies: there are several possible approaches.
> Dynamic partitioning strategy, static partitioning strategy -> no need for a
> higher-level navigation lock.
> 
> Dynamic strategy: start with 1 partition and grow.
> 
> And: why does the design doc use static partitioning? Determining the size
> of partitions is hard. What about starting with 1024 partitions?
> 
> Hotspot problem
> 
> A related task, HDFS-14617
> <https://issues.apache.org/jira/browse/HDFS-14617> (Improve fsimage load
> time by writing sub-sections to the fsimage index), writes multiple inode
> sections and inode directory sections and loads the sections in parallel. It
> sounds like we can combine it with fine-grained locking and partition the
> inode/inode directory sections by the namespace partitions.
> 
> Anu: snapshot complicates design. Renames. Copy on write?
> 
> Anu: suggests implementing this feature without snapshot support to simplify
> the design and implementation.
> 
> Konstantin: will develop in a feature branch. Feel free to pick up JIRAs or
> share thoughts.
> 
> FoldedTreeSet, implemented in HDFS-9260
> <https://issues.apache.org/jira/browse/HDFS-9260>, is relevant. It needs to be
> fixed or reverted before developing the namespace partitioning feature.
> 
> On Mon, Aug 19, 2019 at 2:55 PM Wei-Chiu Chuang <[email protected]>
> wrote:
> 
>> For this week,
>> we will have Konstantin and the LinkedIn folks discuss a recent project
>> that's been baking for quite a while. This is an exciting project as it has
>> the potential to improve NameNode's throughput by 40%.
>> 
>> HDFS-14703 <https://issues.apache.org/jira/browse/HDFS-14703> NameNode
>> Fine-Grained Locking
>> 
>> Access instructions and the past sync notes are available here:
>> 
>> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit?usp=sharing
>> 
>> Reminder: We have a bi-weekly Hadoop storage online sync every other
>> Wednesday.
>> If there are no objections, I'd like to move the time to 10AM US Pacific
>> time (GMT-8)
