This seems like a really aggressive timeframe for a merge. We still haven't
implemented:

* Checksum skipping on read and write from lazy persisted replicas.
* Allowing mmapped reads from the lazy persisted data.
* Any eviction strategy other than LRU.
* Integration with cache pool limits (how do HDFS-4949 and lazy persist
  replicas share memory?)
* Eviction from RAM disk via truncation (HDFS-6918)
* Metrics
* System testing to find out how useful this is, and what the best eviction
  strategy is.

I see why we might want to defer checksum skipping, metrics, allowing mmap,
eviction via truncation, and so forth until later. But I feel like we need to
figure out how this will integrate with the memory used by HDFS-4949 before
we merge. I also would like to see an eviction strategy other than LRU, which
is a very poor eviction strategy for scanning workloads. I mentioned this a
few times on the JIRA.

I'd also like to get some idea of how much testing this has received in a
multi-node cluster. What makes us confident that this is the right time to
merge, rather than in a week or two?

best,
Colin

On Tue, Sep 23, 2014 at 4:55 PM, Arpit Agarwal <aagar...@hortonworks.com> wrote:

> I have posted write benchmark results to the Jira.
>
> On Tue, Sep 23, 2014 at 3:41 PM, Arpit Agarwal <aagar...@hortonworks.com> wrote:
>
>> Hi Andrew, I said "it is not going to be a substantial fraction of memory
>> bandwidth". That is certainly not the same as saying it won't be good or
>> there won't be any improvement.
>>
>> Any time you have transfers over RPC or the network stack you will not get
>> close to the memory bandwidth even for intra-host transfers.
>>
>> I'll add some micro-benchmark results to the Jira shortly.
>>
>> Thanks,
>> Arpit
>>
>> On Tue, Sep 23, 2014 at 2:33 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
>>
>>> Hi Arpit,
>>>
>>> Here is the comment. It was certainly not my intention to misquote anyone.
>>>
>>> https://issues.apache.org/jira/browse/HDFS-6581?focusedCommentId=14138223&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14138223
>>>
>>> Quote:
>>>
>>> It would be nice to see that we could get a substantial fraction of
>>> memory bandwidth when writing to a single replica in-memory.
>>>
>>> The comparison will be interesting but I can tell you without measurement
>>> it is not going to be a substantial fraction of memory bandwidth. We are
>>> still going through DataTransferProtocol with all the copies and overhead
>>> that involves.
>>>
>>> When the goal is in-memory writes and we are unable to achieve a
>>> substantial fraction of memory bandwidth, to me that is "not good
>>> performance."
>>>
>>> I also looked through the subtasks, and AFAICT the only one related to
>>> improving this is deferring checksum computation. The benchmarking we did
>>> on HDFS-4949 showed that this only really helps when you're down to single
>>> copy or zero copies with SCR/ZCR. DTP reads didn't see much of an
>>> improvement, so I'd guess the same would be true for DTP writes.
>>>
>>> I think my above three questions are still open, as well as my question
>>> about why we're merging now, as opposed to when the performance of the
>>> branch is proven out.
>>>
>>> Thanks,
>>> Andrew
>>>
>>> On Tue, Sep 23, 2014 at 2:10 PM, Arpit Agarwal <aagar...@hortonworks.com> wrote:
>>>
>>> > Andrew, don't misquote me. Can you link the comment where I said
>>> > performance wasn't going to be good?
>>> >
>>> > I will add some preliminary write results to the Jira later today.
>>> >
>>> > > What's the plan to improve write performance?
>>> > I described this in response to your and Colin's comments on the Jira.
>>> >
>>> > For the benefit of folks not following the Jira, the immediate task we'd
>>> > like to get done post-merge is moving checksum computation off the write
>>> > path.
>>> > Also see open subtasks of HDFS-6581 for other planned perf
>>> > improvements.
>>> >
>>> > Thanks,
>>> > Arpit
>>> >
>>> >
>>> > On Tue, Sep 23, 2014 at 1:07 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
>>> >
>>> > > Hi Arpit,
>>> > >
>>> > > On HDFS-6581, I asked for write benchmarks on Sep 19th, and you
>>> > > responded that the performance wasn't going to be good. However, I
>>> > > thought the primary goal of this JIRA was to improve write
>>> > > performance, and write performance is listed as the first feature
>>> > > requirement in the design doc.
>>> > >
>>> > > So, this leads me to a few questions, which I also asked last week on
>>> > > the JIRA (I believe still unanswered):
>>> > >
>>> > > - What's the plan to improve write performance?
>>> > > - What kind of performance can we expect after the plan is completed?
>>> > > - Can this expected performance be validated with a prototype?
>>> > >
>>> > > Even with these questions answered, I don't understand the need to
>>> > > merge this before the write optimization work is completed. Write perf
>>> > > is listed as a feature requirement, so the branch can reasonably be
>>> > > called not feature complete until it's shown to be faster.
>>> > >
>>> > > Thanks,
>>> > > Andrew
>>> > >
>>> > > On Tue, Sep 23, 2014 at 11:47 AM, Jitendra Pandey <jiten...@hortonworks.com> wrote:
>>> > >
>>> > > > +1. I have reviewed most of the code in the branch, and I think it's
>>> > > > ready to be merged to trunk.
>>> > > >
>>> > > >
>>> > > > On Mon, Sep 22, 2014 at 5:24 PM, Arpit Agarwal <aagar...@hortonworks.com> wrote:
>>> > > >
>>> > > > > HDFS Devs,
>>> > > > >
>>> > > > > We propose merging the HDFS-6581 development branch to trunk.
>>> > > > >
>>> > > > > The work adds support to write to HDFS blocks in memory.
>>> > > > > The target use case covers applications writing relatively small,
>>> > > > > intermediate data sets with low latency. We introduce a new
>>> > > > > CreateFlag for the existing CreateFile API. HDFS will subsequently
>>> > > > > attempt to place replicas of file blocks in local memory with disk
>>> > > > > writes occurring off the hot path. The current design is a
>>> > > > > simplification of original ideas from Sanjay Radia on HDFS-5851.
>>> > > > >
>>> > > > > Key goals of the feature were minimal API changes to reduce
>>> > > > > application burden and best effort data durability. The feature is
>>> > > > > optional and requires appropriate DN configuration from
>>> > > > > administrators.
>>> > > > >
>>> > > > > Design doc:
>>> > > > >
>>> > > > > https://issues.apache.org/jira/secure/attachment/12661926/HDFSWriteableReplicasInMemory.pdf
>>> > > > >
>>> > > > > Test plan:
>>> > > > >
>>> > > > > https://issues.apache.org/jira/secure/attachment/12669452/Test-Plan-for-HDFS-6581-Memory-Storage.pdf
>>> > > > >
>>> > > > > There are 28 resolved sub-tasks under HDFS-6581, 3 open tasks for
>>> > > > > tests+Jenkins issues and 7 open subtasks tracking planned
>>> > > > > improvements. The latest merge patch is 3300 lines of changed code
>>> > > > > of which 1300 lines are new and updated tests. Merging the branch
>>> > > > > to trunk will allow HDFS applications to start evaluating the
>>> > > > > feature. We will continue work on documentation, performance
>>> > > > > tuning and metrics in parallel with the vote and post-merge.
>>> > > > >
>>> > > > > Contributors to design and code include Xiaoyu Yao, Sanjay Radia,
>>> > > > > Jitendra Pandey, Tassapol Athiapinya, Gopal V, Bikas Saha, Vikram
>>> > > > > Dixit, Suresh Srinivas and Chris Nauroth.
>>> > > > >
>>> > > > > Thanks to Haohui Mai, Colin Patrick McCabe, Andrew Wang, Todd
>>> > > > > Lipcon, Eric Baldeschwieler and Vinayakumar B for providing useful
>>> > > > > feedback on HDFS-6581, HDFS-5851 and sub-tasks.
>>> > > > >
>>> > > > > The vote runs for the usual 7 days and will expire at 12am PDT on
>>> > > > > Sep 30. Here is my +1 for the merge.
>>> > > > >
>>> > > > > Regards,
>>> > > > > Arpit
>>> > > > >
>>> > > > > --
>>> > > > > CONFIDENTIALITY NOTICE
>>> > > > > NOTICE: This message is intended for the use of the individual or
>>> > > > > entity to which it is addressed and may contain information that
>>> > > > > is confidential, privileged and exempt from disclosure under
>>> > > > > applicable law. If the reader of this message is not the intended
>>> > > > > recipient, you are hereby notified that any printing, copying,
>>> > > > > dissemination, distribution, disclosure or forwarding of this
>>> > > > > communication is strictly prohibited. If you have received this
>>> > > > > communication in error, please contact the sender immediately and
>>> > > > > delete it from your system. Thank You.
>>> > > >
>>> > > > --
>>> > > > <http://hortonworks.com/download/>