Brahma, thank you for the comments.

i) I can send a patch with the diff between branches.
ii) I'm working with Giovanni on the review.
iii) We have some numbers from our cluster.
iv) We could have a Router just for giving a view of all the namespaces without giving RPC access. Another case might be only allowing WebHDFS and not RPC. We could consolidate nevertheless. I will open a JIRA to extend the documentation with the configuration keys.
v) I'm open to doing more tests. I think the guys from LinkedIn wanted to test some more frameworks in their dev setup. In addition, before merging, I'd run the version in trunk for a few days.
v) Good catches, I'll open JIRAs for those.
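As a rough sketch of the first case in iv), a Router that keeps its State Store but does not serve RPC might be configured along these lines. The two key names are the ones quoted below from the branch; the values shown and the placement in hdfs-site.xml are assumptions for illustration, not a tested setup.

```xml
<!-- Sketch only: a Router with the RPC server turned off.
     Key names are the ones mentioned in this thread; the values and the
     placement in hdfs-site.xml are assumptions, not a tested deployment. -->
<configuration>
  <property>
    <name>dfs.federation.router.rpc.enable</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.federation.router.store.enable</name>
    <value>true</value>
  </property>
</configuration>
```

This is why keeping both switches could still be useful even if they default to "true".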

On Mon, Aug 28, 2017 at 6:12 AM, Brahma Reddy Battula <brahmareddy.batt...@huawei.com> wrote:

> Nice feature, great work guys. Looking forward to getting this in, as
> YARN federation is already in.
>
> At first glance I have a few questions:
>
> i) Could we have a consolidated patch for better review?
>
> ii) Hoping "Federation Metrics" and "Federation UI" will be included.
>
> iii) Do we have RPC benchmarks?
>
> iv) As of now "dfs.federation.router.rpc.enable" and
> "dfs.federation.router.store.enable" are set to "true". Do we need to keep
> these configs, since without them the router might not be useful?
>
> iv) bq. The rest of the options are documented in [hdfs-default.xml]
> I feel it is better to document all the configurations. There are so
> many; how about documenting them in tabular format?
>
> v) Downstream projects (Spark, HBase, Hive...) integration testing? It looks
> like you mentioned it; is that enough?
>
> v) mvn install (and package) is failing with the following error:
>
> [INFO] Adding ignore: *
> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
> failed with message:
> Duplicate classes found:
>
> Found in:
> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
> Duplicate classes:
> org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
> org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
>
> I added "hadoop-client-minicluster" to the ignore list to get it to succeed:
>
> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>
> <dependencies>
>   <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-annotations</artifactId>
>     <ignoreClasses>
>       <ignoreClass>*</ignoreClass>
>     </ignoreClasses>
>   </dependency>
>   <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-client-minicluster</artifactId>
>     <ignoreClasses>
>       <ignoreClass>*</ignoreClass>
>     </ignoreClasses>
>   </dependency>
>
> Please correct me if I am wrong.
>
> --Brahma Reddy Battula
>
> -----Original Message-----
> From: Chris Douglas [mailto:cdoug...@apache.org]
> Sent: 25 August 2017 06:37
> To: Andrew Wang
> Cc: Iñigo Goiri; hdfs-dev@hadoop.apache.org; su...@apache.org
> Subject: Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk
>
> On Thu, Aug 24, 2017 at 2:25 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
> > Do you mind holding this until 3.1? Same reasoning as for the other
> > branch merge proposals, we're simply too late in the 3.0.0 release cycle.
>
> That wouldn't be too dire.
>
> That said, this has the same design and impact as YARN federation.
> Specifically, it sits almost entirely outside core HDFS, so it will not
> affect clusters running without R-BF.
>
> Merging would allow the two router implementations to converge on a common
> backend, which has started with HADOOP-14741 [1]. If the HDFS side only
> exists in 3.1, then that work would complicate maintenance of YARN in
> 3.0.x, which may require bug fixes as it stabilizes.
>
> Merging lowers costs for maintenance with a nominal risk to stability.
> The feature is well tested, deployed, and actively developed. The
> modifications to core HDFS [2] (~23k) are trivial.
>
> So I'd still advocate for this particular merge on those merits. -C
>
> [1] https://issues.apache.org/jira/browse/HADOOP-14741
> [2] git diff --diff-filter=M $(git merge-base apache/HDFS-10467 apache/trunk)..apache/HDFS-10467
>
> On Thu, Aug 24, 2017 at 1:39 PM, Chris Douglas <cdoug...@apache.org>
> wrote:
> >>
> >> I'd definitely support merging this to trunk. The implementation is
> >> almost entirely outside of HDFS and, as Inigo detailed, has been
> >> tested at scale. The branch is in a functional state with
> >> documentation and tests.
> >> -C
> >>
> >> On Mon, Aug 21, 2017 at 6:11 PM, Iñigo Goiri <elgo...@gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > We would like to open a discussion on merging the Router-based
> >> > Federation feature to trunk.
> >> >
> >> > Last week, there was a thread about which branches would go into
> >> > 3.0 and, given that YARN federation is going in, this might be a good
> >> > time for this to be merged too.
> >> >
> >> > We have been running "Router-based federation" in production for a year.
> >> >
> >> > Meanwhile, we have been releasing it in a feature branch
> >> > (HDFS-10467 [1]) for a while.
> >> >
> >> > We are reasonably confident that the state of the branch is about
> >> > to meet the criteria to be merged onto trunk.
> >> >
> >> > *Feature*:
> >> >
> >> > This feature aggregates multiple namespaces into a single one
> >> > transparently to the user.
> >> >
> >> > It has a similar architecture to YARN federation (YARN-2915).
> >> >
> >> > It consists of Routers that handle requests from the clients,
> >> > forward them to the right subcluster, and expose the same API as
> >> > the Namenode.
> >> >
> >> > Currently we use a mount table (similar to ViewFs) but it can be
> >> > replaced by other approaches.
> >> >
> >> > The Routers share their state in a State Store.
> >> >
> >> > The main advantage is that clients interact with the Routers as if
> >> > they were Namenodes, so no changes are required in the client
> >> > other than pointing to the right address.
> >> >
> >> > In addition, all the management is moved to the server side so
> >> > changes to the Mount Table can be done without having to sync the
> >> > clients (pull/push).
> >> >
> >> > *Status*:
> >> >
> >> > The branch already contains all the features required to work
> >> > end-to-end.
> >> >
> >> > There are a couple of open JIRAs that would be required for the merge
> >> > (i.e., Web UI) but they should be finished soon.
> >> >
> >> > We have been running it in production for the last year and we have
> >> > a paper with some of the details of our production deployment [2].
> >> >
> >> > We have 4 production deployments with the largest one spanning more
> >> > than 20k servers across 6 subclusters.
> >> >
> >> > In addition, the guys at LinkedIn have started testing Router-based
> >> > federation and they will be adding security to the branch.
> >> >
> >> > The modifications to the rest of HDFS are minimal:
> >> >
> >> > - Changed visibility for some methods (e.g., MiniDFSCluster)
> >> > - Added some utilities to extract addresses
> >> > - Modified hdfs and hdfs.cmd to start the Router and manage the
> >> >   federation
> >> > - Modified hdfs-default.xml
> >> >
> >> > Everything else is self-contained in a federation package.
> >> >
> >> > In addition, all the functionality is in the Router so it’s
> >> > disabled by default.
> >> >
> >> > Even when enabled, there is no impact for regular HDFS and it would
> >> > only require configuring the trust between the Namenode and the
> >> > Router once security is enabled.
> >> >
> >> > I have been continuously rebasing the feature branch (updated up to
> >> > 1 week ago) so the merge should be a straightforward cherry-pick.
> >> >
> >> > *Problems*:
> >> >
> >> > The problems I’m aware of are the following:
> >> >
> >> > - We implement ClientProtocol so anytime a new method is added
> >> >   there, we would need to add it to the Router. However, it’s
> >> >   straightforward to add unimplemented methods.
> >> > - There is some argument about naming the feature as “Router-based
> >> >   federation” but I’m open to better names.
> >> >
> >> > *Credits*:
> >> >
> >> > I’d like to thank the people at Microsoft (especially Jason,
> >> > Ricardo, Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming
> >> > and Gera), and LinkedIn (Zhe, Erik and Konstantin) for the discussion
> >> > and the ideas.
> >> >
> >> > Special thanks to Chris Douglas for the thorough reviews!
> >> >
> >> > Please look through the branch; feedback is welcome. Thanks!
> >> >
> >> > Cheers,
> >> >
> >> > Inigo
> >> >
> >> > [1] https://issues.apache.org/jira/browse/HDFS-10467
> >> >
> >> > [2] https://www.usenix.org/conference/atc17/technical-sessions/presentation/misra
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org