Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-31 Thread Iñigo Goiri
Manoj, thanks for the comments.

I added the consolidated branch patch to HDFS-10467 (CC Brahma).

Regarding the comparison with existing approaches, I'd say that the real
comparison is with ViewFs (already in the docs).
This is complementary to the current HDFS federation; you have multiple
namespaces and you need to aggregate them.

Regarding the best practices for the mount table, I think this is pretty
similar to what one would do in ViewFs.
Internally, we simply have every subcluster follow
the same naming as the federated namespace.
For example, if /data/app1 lives in subcluster0, we mount it at
/data/app1 in the federated namespace.
Additionally, we are testing a Rebalancer that takes into consideration the
size of the mount table (based on the USENIX ATC paper).
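
A minimal sketch of the naming convention described above, assuming a mount
table can be modeled as a plain mapping from a federated path to a
(subcluster, path) pair. The table contents and the function name are
hypothetical and only illustrate the check; this is not the Router's actual
data model.

```python
# Hypothetical mount table: federated path -> (subcluster, path inside it).
MOUNT_TABLE = {
    "/data/app1": ("subcluster0", "/data/app1"),
    "/data/app2": ("subcluster1", "/data/app2"),
    "/user": ("subcluster0", "/users"),  # deliberately breaks the convention
}


def entries_breaking_convention(mount_table):
    """Return federated paths whose subcluster path uses a different name."""
    return sorted(
        src for src, (_subcluster, dest) in mount_table.items() if src != dest
    )


print(entries_breaking_convention(MOUNT_TABLE))
```

Keeping the two names identical means a path reads the same whether a client
goes through the federated namespace or directly to the subcluster.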

I can extend the documentation in HDFS-12381.


On Thu, Aug 31, 2017 at 4:52 PM, Iñigo Goiri  wrote:

> Agreed on this not being the cleanest..
> Just filed it this morning: HDFS-12384.
>
>
> On Thu, Aug 31, 2017 at 4:36 PM, Andrew Wang 
> wrote:
>
>> v) mvn install (and package) is failing with following error
>>>
>>> [INFO]   Adding ignore: *
>>> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
>>> failed with message:
>>> Duplicate classes found:
>>>
>>>   Found in:
>>> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
>>> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
>>>   Duplicate classes:
>>> org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
>>> org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
>>>
>>>
>>> I added "hadoop-client-minicluster" to ignore list to get success
>>>
>>> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>>>
>>>   <dependencies>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-annotations</artifactId>
>>>       <ignoreClasses>
>>>         <ignoreClass>*</ignoreClass>
>>>       </ignoreClasses>
>>>     </dependency>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-client-minicluster</artifactId>
>>>       <ignoreClasses>
>>>         <ignoreClass>*</ignoreClass>
>>>       </ignoreClasses>
>>>     </dependency>
>>>   </dependencies>
>>
>> Is there a JIRA filed for this issue? We should engage with Sean Busbey
>> on the right fix. I don't think it's right to exclude the minicluster from
>> this checking.
>>
>
>


Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-31 Thread Iñigo Goiri
Agreed on this not being the cleanest.
Just filed it this morning: HDFS-12384.


On Thu, Aug 31, 2017 at 4:36 PM, Andrew Wang 
wrote:

> v) mvn install (and package) is failing with following error
>>
>> [INFO]   Adding ignore: *
>> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
>> failed with message:
>> Duplicate classes found:
>>
>>   Found in:
>> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
>> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
>>   Duplicate classes:
>> org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
>> org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
>>
>>
>> I added "hadoop-client-minicluster" to ignore list to get success
>>
>> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>>
>>   <dependencies>
>>     <dependency>
>>       <groupId>org.apache.hadoop</groupId>
>>       <artifactId>hadoop-annotations</artifactId>
>>       <ignoreClasses>
>>         <ignoreClass>*</ignoreClass>
>>       </ignoreClasses>
>>     </dependency>
>>     <dependency>
>>       <groupId>org.apache.hadoop</groupId>
>>       <artifactId>hadoop-client-minicluster</artifactId>
>>       <ignoreClasses>
>>         <ignoreClass>*</ignoreClass>
>>       </ignoreClasses>
>>     </dependency>
>>   </dependencies>
>
> Is there a JIRA filed for this issue? We should engage with Sean Busbey on
> the right fix. I don't think it's right to exclude the minicluster from
> this checking.
>


Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-31 Thread Andrew Wang
>
> v) mvn install (and package) is failing with following error
>
> [INFO]   Adding ignore: *
> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
> failed with message:
> Duplicate classes found:
>
>   Found in:
> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
>   Duplicate classes:
> org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
> org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
>
>
> I added "hadoop-client-minicluster" to ignore list to get success
>
> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>
>   <dependencies>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-annotations</artifactId>
>       <ignoreClasses>
>         <ignoreClass>*</ignoreClass>
>       </ignoreClasses>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-client-minicluster</artifactId>
>       <ignoreClasses>
>         <ignoreClass>*</ignoreClass>
>       </ignoreClasses>
>     </dependency>
>   </dependencies>

Is there a JIRA filed for this issue? We should engage with Sean Busbey on
the right fix. I don't think it's right to exclude the minicluster from
this checking.
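
To see which classes overlap between the two artifacts before deciding on a
fix, something like the following could be used. It is only a sketch: it
compares entry names inside two jar files, which roughly mirrors what the
enforcer rule reports, and it is not the plugin's implementation. The function
name and the jar paths in the usage note are hypothetical.

```python
import zipfile


def duplicate_classes(jar_a, jar_b):
    """Return the sorted .class entries that appear in both jar files."""

    def class_entries(jar_path):
        # A jar is a zip archive, so its entries can be listed with zipfile.
        with zipfile.ZipFile(jar_path) as jar:
            return {name for name in jar.namelist() if name.endswith(".class")}

    return sorted(class_entries(jar_a) & class_entries(jar_b))
```

For example, `duplicate_classes("minicluster.jar", "runtime.jar")` (placeholder
file names) would list every class that both jars ship.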


Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-29 Thread Manoj Govindassamy
Hi Inigo and team,

Great work guys. Good to know that you have this feature already running in 
production at a massive scale. 

1. A consolidated patch will be very useful for looking at the implementation 
details of the feature end-to-end. 
2. The design doc has good details on the feature. Do you have any other 
docs/write-ups detailing pros/cons compared to the existing HDFS Federation 
feature?
3. Any recommended/best-practices mount table configurations for the downstream 
projects?


Thanks,
Manoj G

> On Aug 28, 2017, at 8:02 PM, Iñigo Goiri <elgo...@gmail.com> wrote:
> 
> Brahma, thank you for the comments.
> i) I can send a patch with the diff between branches.
> ii) Working with Giovanni for the review.
> iii) We had some numbers in our cluster.
> iv) We could have a Router just for giving a view of all the namespaces
> without giving RPC accesses. Another case might be only allowing WebHDFS
> and not RPC. We could consolidate nevertheless.
> I will open a JIRA to extend the documentation with the configuration keys.
> v) I'm open to do more tests. I think the guys from LinkedIn wanted to test
> some more frameworks in their dev setup. In addition, before merging, I'd
> run the version in trunk for a few days.
> v) Good catches, I'll open JIRAs for those.
> 
> On Mon, Aug 28, 2017 at 6:12 AM, Brahma Reddy Battula <
> brahmareddy.batt...@huawei.com> wrote:
> 
>> Nice Feature, Great work Guys. Looking forward getting in this, as already
>> YARN federation is in.
>> 
>> At first glance I have few questions
>> 
>> i) Could have a consolidated patch for better review..?
>> 
>> ii) Hoping  "Federation Metrics" and "Federation UI" will be included.
>> 
>> iii) do we've RPC benchmarks ?
>> 
>> iv) As of now "dfs.federation.router.rpc.enable"  and
>> "dfs.federation.router.store.enable" made "true", does we need to keep
>> this configs..? since without this router might not be useful..?
>> 
>> iv) bq. The rest of the options are documented in [hdfs-default.xml]
>> I feel, better to document  all the configurations. I see, there are so
>> many, how about document in tabular format..?
>> 
>> v) Downstream projects (Spark,HBASE,HIVE..) integration testing..? looks
>> you mentioned, is that enough..?
>> 
>> v) mvn install (and package) is failing with following error
>> 
>> [INFO]   Adding ignore: *
>> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
>> failed with message:
>> Duplicate classes found:
>> 
>>   Found in:
>> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
>> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
>>   Duplicate classes:
>> org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
>> org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
>>
>>
>> I added "hadoop-client-minicluster" to ignore list to get success
>>
>> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>>
>>   <dependencies>
>>     <dependency>
>>       <groupId>org.apache.hadoop</groupId>
>>       <artifactId>hadoop-annotations</artifactId>
>>       <ignoreClasses>
>>         <ignoreClass>*</ignoreClass>
>>       </ignoreClasses>
>>     </dependency>
>>     <dependency>
>>       <groupId>org.apache.hadoop</groupId>
>>       <artifactId>hadoop-client-minicluster</artifactId>
>>       <ignoreClasses>
>>         <ignoreClass>*</ignoreClass>
>>       </ignoreClasses>
>>     </dependency>
>>   </dependencies>
>> 
>> Please correct me If I am wrong.
>> 
>> 
>> --Brahma Reddy Battula
>> 
>> -Original Message-
>> From: Chris Douglas [mailto:cdoug...@apache.org]
>> Sent: 25 August 2017 06:37
>> To: Andrew Wang
>> Cc: Iñigo Goiri; hdfs-dev@hadoop.apache.org; su...@apache.org
>> Subject: Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk
>> 
>> On Thu, Aug 24, 2017 at 2:25 PM, Andrew Wang <andrew.w...@cloudera.com>
>> wrote:
>>> Do you mind holding this until 3.1? Same reasoning as for the other
>>> branch merge proposals, we're simply too late in the 3.0.0 release cycle.
>> 
>> That wouldn't be too dire.
>> 
>> That said, this has the same design and impact as YARN federation.
>> Specifically, it sits almost entirely outside core HDFS, so it will not
>> affect clusters running without R-BF.
>> 
>> Merging would allow the two router implementations to conve

Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-28 Thread Iñigo Goiri
Brahma, thank you for the comments.
i) I can send a patch with the diff between branches.
ii) Working with Giovanni for the review.
iii) We had some numbers in our cluster.
iv) We could have a Router just for giving a view of all the namespaces
without giving RPC access. Another case might be only allowing WebHDFS
and not RPC. We could consolidate nevertheless.
I will open a JIRA to extend the documentation with the configuration keys.
v) I'm open to doing more tests. I think the guys from LinkedIn wanted to test
some more frameworks in their dev setup. In addition, before merging, I'd
run the version in trunk for a few days.
v) Good catches, I'll open JIRAs for those.

On Mon, Aug 28, 2017 at 6:12 AM, Brahma Reddy Battula <
brahmareddy.batt...@huawei.com> wrote:

> Nice Feature, Great work Guys. Looking forward getting in this, as already
> YARN federation is in.
>
> At first glance I have few questions
>
> i) Could have a consolidated patch for better review..?
>
> ii) Hoping  "Federation Metrics" and "Federation UI" will be included.
>
> iii) do we've RPC benchmarks ?
>
> iv) As of now "dfs.federation.router.rpc.enable"  and
> "dfs.federation.router.store.enable" made "true", does we need to keep
> this configs..? since without this router might not be useful..?
>
> iv) bq. The rest of the options are documented in [hdfs-default.xml]
>  I feel, better to document  all the configurations. I see, there are so
> many, how about document in tabular format..?
>
> v) Downstream projects (Spark,HBASE,HIVE..) integration testing..? looks
> you mentioned, is that enough..?
>
> v) mvn install (and package) is failing with following error
>
> [INFO]   Adding ignore: *
> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
> failed with message:
> Duplicate classes found:
>
>   Found in:
> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
>   Duplicate classes:
> org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
> org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class
>
>
> I added "hadoop-client-minicluster" to ignore list to get success
>
> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>
>   <dependencies>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-annotations</artifactId>
>       <ignoreClasses>
>         <ignoreClass>*</ignoreClass>
>       </ignoreClasses>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-client-minicluster</artifactId>
>       <ignoreClasses>
>         <ignoreClass>*</ignoreClass>
>       </ignoreClasses>
>     </dependency>
>   </dependencies>
>
> Please correct me If I am wrong.
>
>
> --Brahma Reddy Battula
>
> -Original Message-
> From: Chris Douglas [mailto:cdoug...@apache.org]
> Sent: 25 August 2017 06:37
> To: Andrew Wang
> Cc: Iñigo Goiri; hdfs-dev@hadoop.apache.org; su...@apache.org
> Subject: Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk
>
> On Thu, Aug 24, 2017 at 2:25 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
> > Do you mind holding this until 3.1? Same reasoning as for the other
> > branch merge proposals, we're simply too late in the 3.0.0 release cycle.
>
> That wouldn't be too dire.
>
> That said, this has the same design and impact as YARN federation.
> Specifically, it sits almost entirely outside core HDFS, so it will not
> affect clusters running without R-BF.
>
> Merging would allow the two router implementations to converge on a common
> backend, which has started with HADOOP-14741 [1]. If the HDFS side only
> exists in 3.1, then that work would complicate maintenance of YARN in
> 3.0.x, which may require bug fixes as it stabilizes.
>
> Merging lowers costs for maintenance with a nominal risk to stability.
> The feature is well tested, deployed, and actively developed. The
> modifications to core HDFS [2] (~23k) are trivial.
>
> So I'd still advocate for this particular merge on those merits. -C
>
> [1] https://issues.apache.org/jira/browse/HADOOP-14741
> [2] git diff --diff-filter=M $(git merge-base apache/HDFS-10467
> apache/trunk)..apache/HDFS-10467
>
> > On Thu, Aug 24, 2017 at 1:39 PM, Chris Douglas <cdoug...@apache.org>
> wrote:
> >>
> >> I'd definitely support merging this to trunk. The implementation is
> >> almost entirely outside of HDFS and, as Inigo detailed, has been
> >> tested at scale. The branch is in a functional st

RE: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-28 Thread Brahma Reddy Battula
Nice feature, great work guys. Looking forward to getting this in, as YARN 
federation is already in.

At first glance I have a few questions:

i) Could we have a consolidated patch for easier review?

ii) Hoping "Federation Metrics" and "Federation UI" will be included.

iii) Do we have RPC benchmarks?

iv) As of now "dfs.federation.router.rpc.enable" and 
"dfs.federation.router.store.enable" are set to "true". Do we need to keep these 
configs, since without them the Router might not be useful?

iv) bq. The rest of the options are documented in [hdfs-default.xml]
I feel it is better to document all the configurations. There are so many; 
how about documenting them in tabular format?

v) Downstream projects (Spark, HBase, Hive, ...) integration testing? It looks 
like you mentioned this; is that enough?

v) mvn install (and package) is failing with the following error:

[INFO]   Adding ignore: *
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses failed with message:
Duplicate classes found:

  Found in:
    org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-SNAPSHOT:compile
    org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAPSHOT:compile
  Duplicate classes:
    org/apache/hadoop/shaded/org/apache/curator/framework/api/DeleteBuilder.class
    org/apache/hadoop/shaded/org/apache/curator/framework/CuratorFramework.class


I added "hadoop-client-minicluster" to the ignore list to get the build to succeed:

hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-annotations</artifactId>
      <ignoreClasses>
        <ignoreClass>*</ignoreClass>
      </ignoreClasses>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client-minicluster</artifactId>
      <ignoreClasses>
        <ignoreClass>*</ignoreClass>
      </ignoreClasses>
    </dependency>
  </dependencies>

Please correct me if I am wrong.


--Brahma Reddy Battula

-Original Message-
From: Chris Douglas [mailto:cdoug...@apache.org] 
Sent: 25 August 2017 06:37
To: Andrew Wang
Cc: Iñigo Goiri; hdfs-dev@hadoop.apache.org; su...@apache.org
Subject: Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

On Thu, Aug 24, 2017 at 2:25 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> Do you mind holding this until 3.1? Same reasoning as for the other 
> branch merge proposals, we're simply too late in the 3.0.0 release cycle.

That wouldn't be too dire.

That said, this has the same design and impact as YARN federation.
Specifically, it sits almost entirely outside core HDFS, so it will not affect 
clusters running without R-BF.

Merging would allow the two router implementations to converge on a common 
backend, which has started with HADOOP-14741 [1]. If the HDFS side only exists 
in 3.1, then that work would complicate maintenance of YARN in 3.0.x, which may 
require bug fixes as it stabilizes.

Merging lowers costs for maintenance with a nominal risk to stability.
The feature is well tested, deployed, and actively developed. The modifications 
to core HDFS [2] (~23k) are trivial.

So I'd still advocate for this particular merge on those merits. -C

[1] https://issues.apache.org/jira/browse/HADOOP-14741
[2] git diff --diff-filter=M $(git merge-base apache/HDFS-10467 apache/trunk)..apache/HDFS-10467

> On Thu, Aug 24, 2017 at 1:39 PM, Chris Douglas <cdoug...@apache.org> wrote:
>>
>> I'd definitely support merging this to trunk. The implementation is 
>> almost entirely outside of HDFS and, as Inigo detailed, has been 
>> tested at scale. The branch is in a functional state with 
>> documentation and tests. -C
>>
>> On Mon, Aug 21, 2017 at 6:11 PM, Iñigo Goiri <elgo...@gmail.com> wrote:
>> > Hi all,
>> >
>> >
>> >
>> > We would like to open a discussion on merging the Router-based 
>> > Federation feature to trunk.
>> >
>> > Last week, there was a thread about which branches would go into 
>> > 3.0 and given that YARN federation is going, this might be a good 
>> > time for this to be merged too.
>> >
>> >
>> > We have been running "Router-based federation" in production for a year.
>> >
>> > Meanwhile, we have been releasing it in a feature branch 
>> > (HDFS-10467
>> > [1])
>> > for a while.
>> >
>> > We are reasonably confident that the state of the branch is about 
>> > to meet the criteria to be merged onto trunk.
>> >
>> >
>> > *Feature*:
>> >
>> > This feature aggregates multiple namespaces into a single one 
>> > transparently to the user.
>> >
>> > It has a similar architecture to YARN federation (YARN-291

Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-24 Thread Andrew Wang
Do you mind holding this until 3.1? Same reasoning as for the other branch
merge proposals, we're simply too late in the 3.0.0 release cycle.

On Thu, Aug 24, 2017 at 1:39 PM, Chris Douglas  wrote:

> I'd definitely support merging this to trunk. The implementation is
> almost entirely outside of HDFS and, as Inigo detailed, has been
> tested at scale. The branch is in a functional state with
> documentation and tests. -C
>
> On Mon, Aug 21, 2017 at 6:11 PM, Iñigo Goiri  wrote:
> > Hi all,
> >
> >
> >
> > We would like to open a discussion on merging the Router-based Federation
> > feature to trunk.
> >
> > Last week, there was a thread about which branches would go into 3.0 and
> > given that YARN federation is going, this might be a good time for this
> to
> > be merged too.
> >
> >
> > We have been running "Router-based federation" in production for a year.
> >
> > Meanwhile, we have been releasing it in a feature branch (HDFS-10467 [1])
> > for a while.
> >
> > We are reasonably confident that the state of the branch is about to meet
> > the criteria to be merged onto trunk.
> >
> >
> > *Feature*:
> >
> > This feature aggregates multiple namespaces into a single one
> transparently
> > to the user.
> >
> > It has a similar architecture to YARN federation (YARN-2915).
> >
> > It consists on Routers that handle requests from the clients and forwards
> > them to the right subcluster and exposes the same API as the Namenode.
> >
> > Currently we use a mount table (similar to ViewFs) but can be replaced by
> > other approaches.
> >
> > The Routers share their state in a State Store.
> >
> >
> >
> > The main advantage is that clients interact with the Routers as they were
> > Namenode so there is no changes in the client required other than poiting
> > to the right address.
> >
> > In addition, all the management is moved to the server side so changes to
> > the Mount Table can be done without having to sync the clients
> (pull/push).
> >
> >
> >
> > *Status*:
> >
> > The branch already contains all the features required to work end-to-end.
> >
> > There are a couple open JIRAs that would be required for the merged
> (i.e.,
> > Web UI) but they should be finished soon.
> >
> > We have been running it in production for the last year and we have a
> paper
> > with some of the details of our production deployment [2].
> >
> > We have 4 production deployments with the largest one spanning more than
> > 20k servers across 6 subclusters.
> >
> > In addition, the guys at LinkedIn had started testing Router-based
> > federation and they will be adding security to the branch.
> >
> >
> >
> > The modifications to the rest of HDFS are minimal:
> >
> >- Changed visibility for some methods (e.g., MiniDFSCluster)
> >- Added some utilities to extract addresses
> >- Modified hdfs and hdfs.cmd to start the Router and manager the
> >federation
> >- Modified hdfs-default.xml
> >
> > Everything else is self-contained in a federation package.
> >
> > In addition, all the functionality is in the Router so it’s disabled by
> > default.
> >
> > Even when enabled, there is no impact for regular HDFS and it would only
> > require to configure the trust between the Namenode and the Router once
> > security is enabled.
> >
> >
> >
> > I have been continuously rebasing the feature branch (updated up to 1
> week
> > ago) so the merge should be a straightforward cherry-pick.
> >
> >
> >
> > *Problems*:
> >
> > The problems I’m aware of are the following:
> >
> >- We implement ClientProtocol so anytime a new method is added there,
> we
> >would need to add it to the Router. However, it’s straightforward to
> add
> >unimplemented methods.
> >- There is some argument about naming the feature as “Router-based
> >federation” but I’m open for better names.
> >
> >
> >
> > *Credits*:
> >
> > I’d like to thank the people at Microsoft (specially, Jason, Ricardo,
> > Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming and Gera), and
> > LinkedIn (Zhe, Erik and Konstantin) for the discussion and the ideas.
> >
> > Special thanks to Chris Douglas for the thorough reviews!
> >
> >
> >
> > Please look through the branch; feedback is welcome. Thanks!
> >
> >
> > Cheers,
> >
> > Inigo
> >
> >
> >
> >
> > [1] https://issues.apache.org/jira/browse/HDFS-10467
> >
> > [2] https://www.usenix.org/conference/atc17/technical-
> > sessions/presentation/misra
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-24 Thread Chris Douglas
I'd definitely support merging this to trunk. The implementation is
almost entirely outside of HDFS and, as Inigo detailed, has been
tested at scale. The branch is in a functional state with
documentation and tests. -C

On Mon, Aug 21, 2017 at 6:11 PM, Iñigo Goiri  wrote:
> Hi all,
>
>
>
> We would like to open a discussion on merging the Router-based Federation
> feature to trunk.
>
> Last week, there was a thread about which branches would go into 3.0 and
> given that YARN federation is going, this might be a good time for this to
> be merged too.
>
>
> We have been running "Router-based federation" in production for a year.
>
> Meanwhile, we have been releasing it in a feature branch (HDFS-10467 [1])
> for a while.
>
> We are reasonably confident that the state of the branch is about to meet
> the criteria to be merged onto trunk.
>
>
> *Feature*:
>
> This feature aggregates multiple namespaces into a single one transparently
> to the user.
>
> It has a similar architecture to YARN federation (YARN-2915).
>
> It consists on Routers that handle requests from the clients and forwards
> them to the right subcluster and exposes the same API as the Namenode.
>
> Currently we use a mount table (similar to ViewFs) but can be replaced by
> other approaches.
>
> The Routers share their state in a State Store.
>
>
>
> The main advantage is that clients interact with the Routers as they were
> Namenode so there is no changes in the client required other than poiting
> to the right address.
>
> In addition, all the management is moved to the server side so changes to
> the Mount Table can be done without having to sync the clients (pull/push).
>
>
>
> *Status*:
>
> The branch already contains all the features required to work end-to-end.
>
> There are a couple open JIRAs that would be required for the merged (i.e.,
> Web UI) but they should be finished soon.
>
> We have been running it in production for the last year and we have a paper
> with some of the details of our production deployment [2].
>
> We have 4 production deployments with the largest one spanning more than
> 20k servers across 6 subclusters.
>
> In addition, the guys at LinkedIn had started testing Router-based
> federation and they will be adding security to the branch.
>
>
>
> The modifications to the rest of HDFS are minimal:
>
>- Changed visibility for some methods (e.g., MiniDFSCluster)
>- Added some utilities to extract addresses
>- Modified hdfs and hdfs.cmd to start the Router and manager the
>federation
>- Modified hdfs-default.xml
>
> Everything else is self-contained in a federation package.
>
> In addition, all the functionality is in the Router so it’s disabled by
> default.
>
> Even when enabled, there is no impact for regular HDFS and it would only
> require to configure the trust between the Namenode and the Router once
> security is enabled.
>
>
>
> I have been continuously rebasing the feature branch (updated up to 1 week
> ago) so the merge should be a straightforward cherry-pick.
>
>
>
> *Problems*:
>
> The problems I’m aware of are the following:
>
>- We implement ClientProtocol so anytime a new method is added there, we
>would need to add it to the Router. However, it’s straightforward to add
>unimplemented methods.
>- There is some argument about naming the feature as “Router-based
>federation” but I’m open for better names.
>
>
>
> *Credits*:
>
> I’d like to thank the people at Microsoft (specially, Jason, Ricardo,
> Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming and Gera), and
> LinkedIn (Zhe, Erik and Konstantin) for the discussion and the ideas.
>
> Special thanks to Chris Douglas for the thorough reviews!
>
>
>
> Please look through the branch; feedback is welcome. Thanks!
>
>
> Cheers,
>
> Inigo
>
>
>
>
> [1] https://issues.apache.org/jira/browse/HDFS-10467
>
> [2] https://www.usenix.org/conference/atc17/technical-
> sessions/presentation/misra




[DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-21 Thread Iñigo Goiri
Hi all,



We would like to open a discussion on merging the Router-based Federation
feature to trunk.

Last week, there was a thread about which branches would go into 3.0 and
given that YARN federation is going, this might be a good time for this to
be merged too.


We have been running "Router-based federation" in production for a year.

Meanwhile, we have been releasing it in a feature branch (HDFS-10467 [1])
for a while.

We are reasonably confident that the state of the branch is about to meet
the criteria to be merged onto trunk.


*Feature*:

This feature aggregates multiple namespaces into a single one transparently
to the user.

It has a similar architecture to YARN federation (YARN-2915).

It consists of Routers that handle requests from the clients, forward them
to the right subcluster, and expose the same API as the Namenode.

Currently we use a mount table (similar to ViewFs), but it can be replaced by
other approaches.

The Routers share their state in a State Store.
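
As a rough illustration of how a mount table can route a request to a
subcluster, here is a minimal sketch. It assumes the deepest matching mount
point wins, which is an assumption of this example rather than something
stated in this message; the table contents and the function name are
hypothetical.

```python
# Hypothetical mount table: mount point -> (subcluster, target path).
MOUNT_TABLE = {
    "/": ("subcluster0", "/"),
    "/data": ("subcluster1", "/data"),
    "/data/app1": ("subcluster2", "/data/app1"),
}


def resolve(path, mount_table):
    """Map a federated path to (subcluster, path) via the deepest mount point."""
    parts = [p for p in path.split("/") if p]
    # Try the full path first, then progressively shorter prefixes down to "/".
    for depth in range(len(parts), -1, -1):
        mount_point = "/" + "/".join(parts[:depth])
        if mount_point in mount_table:
            subcluster, target = mount_table[mount_point]
            remainder = "/".join(parts[depth:])
            if not remainder:
                return subcluster, target
            return subcluster, target.rstrip("/") + "/" + remainder
    raise LookupError(f"no mount point covers {path!r}")


print(resolve("/data/app1/logs/x", MOUNT_TABLE))
print(resolve("/tmp", MOUNT_TABLE))
```

Because the lookup happens on the server side, changing an entry in the table
changes where clients land without touching any client configuration.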



The main advantage is that clients interact with the Routers as if they were
Namenodes, so no changes are required in the client other than pointing
to the right address.

In addition, all the management is moved to the server side so changes to
the Mount Table can be done without having to sync the clients (pull/push).



*Status*:

The branch already contains all the features required to work end-to-end.

There are a couple of open JIRAs that would be required for the merge (e.g.,
Web UI) but they should be finished soon.

We have been running it in production for the last year and we have a paper
with some of the details of our production deployment [2].

We have 4 production deployments with the largest one spanning more than
20k servers across 6 subclusters.

In addition, the guys at LinkedIn have started testing Router-based
federation and they will be adding security to the branch.



The modifications to the rest of HDFS are minimal:

   - Changed visibility for some methods (e.g., MiniDFSCluster)
   - Added some utilities to extract addresses
   - Modified hdfs and hdfs.cmd to start the Router and manage the
   federation
   - Modified hdfs-default.xml

Everything else is self-contained in a federation package.

In addition, all the functionality is in the Router so it’s disabled by
default.

Even when enabled, there is no impact on regular HDFS; it would only
require configuring the trust between the Namenode and the Router once
security is enabled.



I have been continuously rebasing the feature branch (updated up to 1 week
ago) so the merge should be a straightforward cherry-pick.



*Problems*:

The problems I’m aware of are the following:

   - We implement ClientProtocol so anytime a new method is added there, we
   would need to add it to the Router. However, it’s straightforward to add
   unimplemented methods.
   - There is some argument about naming the feature “Router-based
   federation” but I’m open to better names.
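
To illustrate the first problem, here is a loose analogy in Python. The real
interface is the Java ClientProtocol; the class names, method names, and the
forwarding mechanism below are all hypothetical and only show the idea of a
Router that forwards the calls it supports and rejects newer ones until they
are implemented.

```python
class Namenode:
    """Stand-in for the protocol a Namenode exposes (hypothetical methods)."""

    def get_file_info(self, path):
        return {"path": path}

    def new_call(self, path):  # imagine this was added to the protocol later
        return "ok"


class Router:
    """Forwards only the calls it knows about; newer ones are rejected."""

    SUPPORTED = {"get_file_info"}

    def __init__(self, namenode):
        self._namenode = namenode

    def __getattr__(self, name):
        # Only reached for attributes that are not defined on Router itself.
        if name in self.SUPPORTED:
            return getattr(self._namenode, name)
        if hasattr(Namenode, name):
            def unimplemented(*args, **kwargs):
                raise NotImplementedError(f"{name} is not implemented in the Router")
            return unimplemented
        raise AttributeError(name)
```

With this shape, a call to `new_call` on the Router fails with a clear
"not implemented" error instead of silently misbehaving, and supporting it
later only means adding its name to `SUPPORTED`.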



*Credits*:

I’d like to thank the people at Microsoft (especially Jason, Ricardo,
Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming and Gera), and
LinkedIn (Zhe, Erik and Konstantin) for the discussion and the ideas.

Special thanks to Chris Douglas for the thorough reviews!



Please look through the branch; feedback is welcome. Thanks!


Cheers,

Inigo




[1] https://issues.apache.org/jira/browse/HDFS-10467

[2] https://www.usenix.org/conference/atc17/technical-sessions/presentation/misra