Just to close the loop: I made a branch named HDFS-13572 to match the new non-blocking issue (after some nice encouragement posted on the JIRA).

Thanks,
S

On Tue, May 15, 2018 at 9:30 PM, Stack <st...@duboce.net> wrote:

> On Fri, May 4, 2018 at 5:47 AM, Anu Engineer <aengin...@hortonworks.com>
> wrote:
>
>> Hi Stack,
>>
>> Why don’t we look at the design of what is being proposed? Let us post
>> the design to HDFS-9924 and then if needed, by all means let us open a new
>> Jira.
>>
>> That will make it easy to understand the context if someone is looking at
>> HDFS-9924.
>>
>
> I posted a WIP design-for-discussion up on a new issue, HDFS-13572, after
> spending a bunch of time in HDFS-9924 and HADOOP-12910 (Duo had posted an
> earlier version on HDFS-9924 a while back).
>
> HDFS-9924 is stalled. It is filled with "discussion" that seems mostly to
> be behind where we'd like to take-off (i.e. whether hadoop2 or hadoop3
> first, what is an async api, what is async programming, etc.). We hope to
> 'vault' HDFS-9924 by skipping to an hadoop3/jdk8/CompletableFuture basis
> and by taking on contributor requests in HDFS-9924 -- e.g. a design first,
> dev in a feature branch, and so on -- EXCEPTing the hadoop2 targeting.
>
> Hence the new issue for a new undertaking (and to save folks having to
> wade through reams to get to the new effort).
>
>> I personally believe that it should be the developers of the feature that
>> should decide what goes in, what to call the branch etc. But it would be
>> nice to have some sort of continuity of HDFS-9924.
>>
>
> Agree with the above. I'll take care of tying HDFS-9924 over to the new
> issue.
>
> Thanks,
> St.Ack
>
>> Thanks
>>
>> Anu
>>
>> *From: *<saint....@gmail.com> on behalf of Stack <st...@duboce.net>
>> *Date: *Thursday, May 3, 2018 at 9:04 PM
>> *To: *Anu Engineer <aengin...@hortonworks.com>
>> *Cc: *Wei-Chiu Chuang <weic...@apache.org>, "hdfs-dev@hadoop.apache.org"
>> <hdfs-dev@hadoop.apache.org>
>> *Subject: *Re: [DISCUSSION] Create a branch to work on non-blocking
>> access to HDFS
>>
>> Thanks for support Wei-Chiu and Anu.
>>
>> Thinking more on it, we should just open a new JIRA. HDFS-9924 is an old
>> branch with commits we don't need full of commentary that is, ahem, a mite
>> off-topic. Duo can attach his design to the new issue. We can cite
>> HDFS-9924 as provenance and aggregate the discussion as launching pad for
>> the new effort in new issue.
>>
>> Hopefully this is agreeable,
>>
>> Thanks,
>>
>> S
>>
>> On Thu, May 3, 2018 at 1:54 PM, Anu Engineer <aengin...@hortonworks.com>
>> wrote:
>>
>> Hi St.ack/Wei-Chiu,
>>
>> It is very kind of St.Ack to bring this question to HDFS Dev. I think
>> this is a good feature to have. As for the branch question,
>> HDFS-9924 branch is already open, we could just use that and I am +1 on
>> adding Duo as a branch committer.
>>
>> I am not familiar with HBase code base, I am presuming that there will be
>> some deviation from the current design doc posted in HDFS-9924. Would it
>> make sense to post a new design proposal on HDFS-9924?
>>
>> --Anu
>>
>> On 5/3/18, 9:29 AM, "Wei-Chiu Chuang" <weic...@apache.org> wrote:
>>
>> Given that HBase 2 uses async output by default, the way that code is
>> maintained today in HBase is not sustainable. That piece of code should be
>> maintained in HDFS. I am +1 as a participant in both communities.
>>
>> On Thu, May 3, 2018 at 9:14 AM, Stack <st...@duboce.net> wrote:
>>
>> > Ok with you lot if a few of us open a branch to work on a non-blocking HDFS
>> > client?
>> >
>> > Intent is to finish up the old issue "HDFS-9924 [umbrella] Nonblocking HDFS
>> > Access". On the foot of this umbrella JIRA is a proposal by the
>> > heavy-lifter, Duo Zhang. Over in HBase, we have a limited async DFS client
>> > (written by Duo) that we use making Write-Ahead Logs. We call it
>> > AsyncFSWAL. It was shipped as the default WAL writer in hbase-2.0.0.
>> >
>> > Let me quote Duo from his proposal at the base of HDFS-9924:
>> >
>> > ....We use lots of internal APIs of HDFS to implement the AsyncFSWAL, so it
>> > is expected that things like HBASE-20244
>> > <https://issues.apache.org/jira/browse/HBASE-20244>
>> > ["NoSuchMethodException when retrieving private method
>> > decryptEncryptedDataEncryptionKey from DFSClient"] will happen again and
>> > again.
>> >
>> > To make life easier, we need to move the async output related code into
>> > HDFS. The POC [attached as patch on HDFS-9924] shows that option 3 [1] can
>> > work, so I would like to create a feature branch to implement the async dfs
>> > client. In general I think there are 4 steps:
>> >
>> > 1. Implement an async rpc client with option 3 [1] described above.
>> > 2. Implement the filesystem APIs which only need to connect to NN, such as
>> > 'mkdirs'.
>> > 3. Implement async file read. The problem is the API. For pread I think a
>> > CompletableFuture is enough, the problem is for the streaming read. Need to
>> > discuss later.
>> > 4. Implement async file write. The API will also be a problem, but a more
>> > important problem is that, if we want to support fan-out, the current logic
>> > at DN side will make the semantic broken as we can read uncommitted data
>> > very easily. In HBase it is solved by HBASE-14004
>> > <https://issues.apache.org/jira/browse/HBASE-14004> but I do not think we
>> > should keep the broken behavior in HDFS. We need to find a way to deal with
>> > it.
>> >
>> > Comments welcome.
>> >
>> > Intent is to make a branch named HDFS-9924 (or should we just do a new
>> > JIRA?) and to add Duo as a feature branch committer. If all goes well,
>> > we'll call for a merge VOTE.
>> >
>> > Thanks,
>> > St.Ack
>> >
>> > 1. Option 3: "Use the old protobuf rpc interface and implement a new rpc
>> > framework.
>> > The benefit is that we also do not need port unification service
>> > at server side and do not need to maintain two implementations at server
>> > side. And one more thing is that we do not need to upgrade protobuf to
>> > 3.x."
>> >
>>
>> --
>> A very happy Hadoop contributor
>>