Re: [VOTE] Hadoop 3.2.x EOL

2023-12-06 Thread Iñigo Goiri
+1

On Wed, Dec 6, 2023 at 9:03 AM Chao Sun  wrote:

> +1
>
> On Wed, Dec 6, 2023 at 8:39 AM Akira Ajisaka  wrote:
> >
> > +1
> >
> >
> >
> > On Wed, Dec 6, 2023 at 1:10 PM Xiaoqiao He  wrote:
> >
> > > Dear Hadoop devs,
> > >
> > > Given the feedback from the discussion thread [1], I'd like to start
> > > an official thread for the community to vote on release line 3.2 EOL.
> > >
> > > It will include:
> > > a. An official announcement that there will be no further regular
> > > Hadoop 3.2.x releases.
> > > b. Issues that target 3.2.5 will not be fixed.
> > >
> > > This vote will run for 7 days and conclude by Dec 13, 2023.
> > >
> > > I’ll start with my +1.
> > >
> > > Best Regards,
> > > - He Xiaoqiao
> > >
> > > [1] https://lists.apache.org/thread/bbf546c6jz0og3xcl9l3qfjo93b65szr
> > >
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


[ANNOUNCE] New Hadoop Committer - Simbarashe Dzinamarira

2023-10-02 Thread Iñigo Goiri
I am pleased to announce that Simbarashe Dzinamarira has been elected as a
committer on the Apache Hadoop project.
We appreciate all of Simbarashe's work, and look forward to his continued
contributions.

Congratulations and welcome!

Best Regards,
Inigo Goiri
(On behalf of the Apache Hadoop PMC)


Re: [DISCUSS] hadoop branch-3.3+ going to java11 only

2023-03-28 Thread Iñigo Goiri
I would also vote for targeting 3.4 and having a long-term version of Java
there.

On Tue, Mar 28, 2023 at 11:52 AM Igor Dvorzhak 
wrote:

> +1 to re-focusing on 3.4 branch and upgrading it to Java 11/17, instead of
> making potentially breaking changes to 3.3.
>
> On Tue, Mar 28, 2023 at 11:17 AM Chris Nauroth  wrote:
>
>> In theory, I like the idea of setting aside Java 8. Unfortunately, I don't
>> know that upgrading within the 3.3 line adheres to our binary
>> compatibility policy [1]. I don't see specific discussion of the Java
>> version there, but it states that you should be able to drop in minor
>> upgrades and have existing apps keep working. Users might find it
>> surprising if they try to upgrade a cluster that has JDK 8.
>>
>> There is also the question of impact on downstream projects [2]. We'd have
>> to check plans with our consumers.
>>
>> What about the idea of shooting for a 3.4 release on JDK 11 (or even 17)?
>> The downside is that we'd probably need to set boundaries on end of
>> life/limited support for 3.2 and 3.3 to keep the workload manageable.
>>
>> [1]
>>
>> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Java_Binary_compatibility_for_end-user_applications_i.e._Apache_Hadoop_ABI
>> [2] https://github.com/apache/spark/blob/v3.3.2/pom.xml#L109
>>
>> Chris Nauroth
>>
>>
>> On Tue, Mar 28, 2023 at 11:10 AM Ayush Saxena  wrote:
>>
>> > >
>> > >  it's already hard to migrate from JDK8 why not retarget JDK17.
>> > >
>> >
>> > +1, makes sense to me; sounds like a win-win situation, though there
>> > would be some additional issues to chase now :)
>> >
>> > -Ayush
>> >
>> >
>> > On Tue, 28 Mar 2023 at 23:29, Wei-Chiu Chuang 
>> wrote:
>> >
>> > > My random thoughts. Probably bad takes:
>> > >
>> > > There are projects experimenting with JDK17 now.
>> > > JDK11 active support will end in 6 months. If it's already hard to
>> > migrate
>> > > from JDK8 why not retarget JDK17.
>> > >
>> > > On Tue, Mar 28, 2023 at 10:30 AM Ayush Saxena 
>> > wrote:
>> > >
>> > >> I know the Jersey upgrade is a blocker. Some folks were chasing that
>> > >> last year around 3.3.4 time. I don’t know where it is now, and I never
>> > >> saw what the problem there was, but I remember there was some initial
>> > >> PR which did it for HDFS at least, so I never looked beyond that…
>> > >>
>> > >> I too had jdk-11 in my mind, but only for trunk. 3.4.x can maybe stay
>> > >> as a java-11-only branch, but that is something to decide later, once
>> > >> we get the code sorted…
>> > >>
>> > >> -Ayush
>> > >>
>> > >> > On 28-Mar-2023, at 9:16 PM, Steve Loughran
>> > 
>> > >> wrote:
>> > >> >
>> > >> > well, how about we flip the switch and get on with it.
>> > >> >
>> > >> > slf4j seems happy on java11,
>> > >> >
>> > >> > side issue: has anyone seen test failures on zulu 1.8? Somehow my
>> > >> > test run is failing and I'm trying to work out whether it's a
>> > >> > mismatch in command line/IDE JVM versions, or whether the 3.3.5 JARs
>> > >> > have been built with an openjdk version which requires IntBuffer to
>> > >> > implement an overridden method IntBuffer rewind().
>> > >> >
>> > >> > java.lang.NoSuchMethodError:
>> > >> > java.nio.IntBuffer.rewind()Ljava/nio/IntBuffer;
>> > >> >
>> > >> > at org.apache.hadoop.fs.FSInputChecker.verifySums(FSInputChecker.java:341)
>> > >> > at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:308)
>> > >> > at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:257)
>> > >> > at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:202)
>> > >> > at java.io.DataInputStream.read(DataInputStream.java:149)
>> > >> >
>> > >> >> On Tue, 28 Mar 2023 at 15:52, Viraj Jasani 
>> > wrote:
>> > >> >> IIRC some of the ongoing major dependency upgrades (log4j 1 to 2,
>> > >> >> jersey 1 to 2, and junit 4 to 5) are blockers for java 11 compile +
>> > >> >> test stability.
>> > >> >> On Tue, Mar 28, 2023 at 4:55 AM Steve Loughran
>> > >> > > >> >> wrote:
>> > >> >>> Now that hadoop 3.3.5 is out, I want to propose something new:
>> > >> >>> we switch branch-3.3 and trunk to being java11 only.
>> > >> >>> 1. java 11 has been out for years
>> > >> >>> 2. oracle java 8 is no longer available under "premier support"; you
>> > >> >>> can't really get upgrades
>> > >> >>> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
>> > >> >>> 3. openJDK 8 releases != oracle ones, and things you compile with
>> > >> >>> them don't always link to oracle java 8 (some classes in java.nio
>> > >> >>> have added more overrides)
>> > >> >>> 4. more and more libraries we want to upgrade to/bundle are java 11
>> > >> >>> only
>> > >> >>> 5. moving to java 11 would cut our yetus build workload in half, and
>> > >> >>> line up for adding java 17 builds instead.
>> > >> >>> I know there are some outstanding issues still in
>> > >
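Point 3 in the list above (bytecode built against newer class libraries not linking on Java 8) can be sketched in a few commands. The class and file names below are illustrative, and the `--release` flag assumes a JDK 9+ javac:

```shell
# Sketch of the java.nio override pitfall: IntBuffer.rewind() gained a
# covariant override (returning IntBuffer instead of Buffer) in Java 9.
# Bytecode compiled against a newer class library without --release 8
# records ()Ljava/nio/IntBuffer; and throws NoSuchMethodError on Java 8.
cat > Rewind.java <<'EOF'
import java.nio.IntBuffer;

public class Rewind {
    public static void main(String[] args) {
        IntBuffer b = IntBuffer.allocate(2);
        b.put(7);
        b.rewind();                  // descriptor is fixed at compile time
        System.out.println(b.get());
    }
}
EOF
# Unsafe: -source/-target 8 still links against the current JDK's API:
#   javac -source 8 -target 8 Rewind.java
# Safe: --release 8 (JDK 9+) compiles against the real Java 8 API:
#   javac --release 8 Rewind.java
echo "wrote Rewind.java"
```

Building with `--release 8` (or on an actual JDK 8) is the usual way to avoid the `NoSuchMethodError` shown above.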

Code coverage report on github PRs

2022-11-23 Thread Iñigo Goiri
Now that we are mostly using GitHub PRs for reviews and we have decent
build integration there, I was wondering about code coverage and reporting.
Is code coverage setup at all?
Does this come from the INFRA team?
What would it take to enable it otherwise?
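As a point of reference, one common way for a Maven project to wire up coverage itself (independent of anything INFRA provides) is the JaCoCo plugin. The fragment below is an illustrative sketch, not Hadoop's actual build configuration, and the version number is an assumption:

```xml
<!-- Illustrative pom.xml fragment: JaCoCo coverage for Maven test runs. -->
<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.8.8</version> <!-- illustrative version -->
  <executions>
    <execution>
      <goals>
        <goal>prepare-agent</goal> <!-- instruments the test JVM -->
      </goals>
    </execution>
    <execution>
      <id>report</id>
      <phase>verify</phase>
      <goals>
        <goal>report</goal> <!-- writes a report under target/site/jacoco -->
      </goals>
    </execution>
  </executions>
</plugin>
```

A coverage service could then pick up the generated report from the PR build.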


Re: [VOTE] Release Apache Hadoop 3.3.0 - RC0

2020-07-10 Thread Iñigo Goiri
+1 (Binding)

Deployed a cluster on Azure VMs with:
* 3 VMs with HDFS Namenodes and Routers
* 2 VMs with YARN Resource Managers
* 5 VMs with HDFS Datanodes and Node Managers

Tests:
* Executed TeraGen+TeraSort+TeraValidate.
* Executed wordcount.
* Browsed through the Web UI.



On Fri, Jul 10, 2020 at 1:06 AM Vinayakumar B 
wrote:

> +1 (Binding)
>
> -Verified all checksums and Signatures.
> -Verified site, Release notes and Change logs
>   + Maybe the changelog and release notes could be grouped by project at
> the second level for a better look (this needs to be supported by Yetus)
> -Tested in x86 local 3-node docker cluster.
>   + Built from source with OpenJdk 8 and Ubuntu 18.04
>   + Deployed 3 node docker cluster
>   + Ran various Jobs (wordcount, Terasort, Pi, etc)
>
> No Issues reported.
>
> -Vinay
>
> On Fri, Jul 10, 2020 at 1:19 PM Sheng Liu  wrote:
>
> > +1 (non-binding)
> >
> > - checked out the "3.3.0-aarch64-RC0" binary packages
> >
> > - started a cluster with 3 VM nodes running Ubuntu 18.04 ARM/aarch64,
> > openjdk-11-jdk
> >
> > - checked some web UIs (NN, DN, RM, NM)
> >
> > - Executed a wordcount, TeraGen, TeraSort and TeraValidate
> >
> > - Executed a TestDFSIO job
> >
> > - Executed a Pi job
> >
> > BR,
> > Liusheng
> >
Zhenyu Zheng wrote on Fri, Jul 10, 2020 at 3:45 PM:
> >
> > > +1 (non-binding)
> > >
> > > - Verified all hashes and checksums
> > > - Tested on ARM platform for the following actions:
> > >   + Built from source on Ubuntu 18.04, OpenJDK 8
> > >   + Deployed a pseudo cluster
> > >   + Ran some example jobs (grep, wordcount, pi)
> > >   + Ran teragen/terasort/teravalidate
> > >   + Ran TestDFSIO job
> > >
> > > BR,
> > >
> > > Zhenyu
> > >
> > > On Fri, Jul 10, 2020 at 2:40 PM Akira Ajisaka 
> > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > - Verified checksums and signatures.
> > > > - Built from the source with CentOS 7 and OpenJDK 8.
> > > > - Successfully upgraded HDFS to 3.3.0-RC0 in our development cluster
> > > (with
> > > > RBF, security, and OpenJDK 11) for end-users. No issues reported.
> > > > - The document looks good.
> > > > - Deployed pseudo cluster and ran some MapReduce jobs.
> > > >
> > > > Thanks,
> > > > Akira
> > > >
> > > >
> > > > On Tue, Jul 7, 2020 at 7:27 AM Brahma Reddy Battula <
> bra...@apache.org
> > >
> > > > wrote:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > This is the first release candidate for the first release of the
> > > > > Apache Hadoop 3.3.0 line.
> > > > >
> > > > > It contains *1644* [1] fixed JIRA issues since 3.2.1, which include
> > > > > a lot of features and improvements (read the full set of release
> > > > > notes).
> > > > >
> > > > > Below feature additions are the highlights of the release.
> > > > >
> > > > > - ARM Support
> > > > > - Enhancements and new features in S3A, S3Guard, and ABFS
> > > > > - Java 11 Runtime support and TLS 1.3.
> > > > > - Support Tencent Cloud COS File System.
> > > > > - Added security to HDFS Router.
> > > > > - Support non-volatile storage class memory (SCM) in HDFS cache
> > > > > directives
> > > > > - Support Interactive Docker Shell for running Containers.
> > > > > - Scheduling of opportunistic containers
> > > > > - A pluggable device plugin framework to ease vendor plugin
> > development
> > > > >
> > > > > *The RC0 artifacts are at*:
> > > > > http://home.apache.org/~brahma/Hadoop-3.3.0-RC0/
> > > > >
> > > > > *This is the first release to include an ARM binary; please check it out.*
> > > > > *RC tag is *release-3.3.0-RC0.
> > > > >
> > > > >
> > > > > *The maven artifacts are hosted here:*
> > > > >
> > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1271/
> > > > >
> > > > > *My public key is available here:*
> > > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > > > >
> > > > > The vote will run for 5 weekdays, until Tuesday, July 13 at 3:50 AM
> > > > > IST.
> > > > >
> > > > >
> > > > > I have done some testing with my pseudo cluster. My +1 to start.
> > > > >
> > > > >
> > > > >
> > > > > Regards,
> > > > > Brahma Reddy Battula
> > > > >
> > > > >
> > > > > 1. project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in
> > (3.3.0)
> > > > AND
> > > > > fixVersion not in (3.2.0, 3.2.1, 3.1.3) AND status = Resolved ORDER
> > BY
> > > > > fixVersion ASC
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] GitHub PRs without JIRA number

2019-08-27 Thread Iñigo Goiri
I wouldn't go for #3; I would always require a JIRA for a PR.

In general, I think we should state the best practices for using GitHub PRs.
There were some guidelines, but they were kind of open.
For example, always adding a link to the JIRA in the description.
I think PRs could have a template as a start.

The other thing I would do is to disable the automatic Jenkins trigger.
I've seen the "retest this" and others:
https://wiki.jenkins.io/display/JENKINS/GitHub+pull+request+builder+plugin
https://github.com/jenkinsci/ghprb-plugin/blob/master/README.md
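The PR template mentioned above could start from something as minimal as this (an illustrative `.github/PULL_REQUEST_TEMPLATE.md` sketch, not an agreed-upon template):

```markdown
### Description of PR

### JIRA
<!-- Link the JIRA here, e.g. https://issues.apache.org/jira/browse/HDFS-XXXXX -->

### How was this patch tested?

### Checklist
- [ ] The title starts with the JIRA id, e.g. `HDFS-XXXXX. Summary`
- [ ] The description links to the JIRA
```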



On Tue, Aug 27, 2019 at 10:47 AM Wei-Chiu Chuang  wrote:

> Hi,
> There are hundreds of GitHub PRs pending review. Many of them just sit
> there wasting Jenkins resources.
>
> I suggest:
> (1) close PRs that went stale (i.e. don't compile), or even close PRs
> that haven't been reviewed for more than a year.
> (1) close PRs that don't have a JIRA number. No one is going to review a
> big PR that doesn't have a JIRA anyway.
> (2) For PRs without a JIRA number, file JIRAs for the PR on behalf of the
> reporter.
> (3) For typo fixes, merge the PRs directly without a JIRA. IMO, this is the
> best use of GitHub PRs.
>
> Thoughts?
>


Re: [VOTE] Force "squash and merge" option for PR merge on github UI

2019-07-17 Thread Iñigo Goiri
+1

On Wed, Jul 17, 2019 at 4:17 AM Steve Loughran 
wrote:

> +1 for squash and merge, with whoever does the merge adding the full commit
> message for the logs, with JIRA, contributor(s) etc
>
> One limit of the github process is that the author of the commit becomes
> whoever hit the squash button, not whoever did the code, so it loses the
> credit they are due. This is why I'm doing local merges (with some help
> from smart-apply-patch). I think I'll have to explore smart-apply-patch to
> see if I can do even more with it.
>
>
>
>
> On Wed, Jul 17, 2019 at 7:07 AM Elek, Marton  wrote:
>
> > Hi,
> >
> > Github UI (ui!) helps to merge Pull Requests to the proposed branch.
> > There are three different ways to do it [1]:
> >
> > 1. Keep all the different commits from the PR branch and create one
> > additional merge commit ("Create a merge commit")
> >
> > 2. Squash all the commits and commit the change as one patch ("Squash
> > and merge")
> >
> > 3. Keep all the different commits from the PR branch but rebase, merge
> > commit will be missing ("Rebase and merge")
> >
> >
> >
> > As only option 2 is compatible with the existing development
> > practices of Hadoop (1 issue = 1 patch = 1 commit), I call for a lazy
> > consensus vote: if there are no objections within 3 days, I will ask INFRA
> > to disable options 1 and 3 to make the process less error prone.
> >
> > Please let me know, what do you think,
> >
> > Thanks a lot
> > Marton
> >
> > ps: Personally I prefer to merge from local as it enables signing the
> > commits and doing a final build before pushing. But this is a different
> > story; this proposal is only about removing the options which are
> > obviously risky...
> >
> > ps2: You can always do any kind of merge / commits from CLI, for example
> > to merge a feature branch together with keeping the history.
> >
> > [1]:
> >
> >
> https://help.github.com/en/articles/merging-a-pull-request#merging-a-pull-request-on-github
> >
> >
> >
>
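The local workflow Steve describes (squashing while keeping the contributor as the commit author) can be sketched as below; the repository, branch, names, and JIRA id are all made up for illustration:

```shell
set -e
# Squash-merge a feature branch locally while crediting the contributor.
workdir=$(mktemp -d) && cd "$workdir"
git init -q repo && cd repo
git config user.name "Committer"
git config user.email "committer@example.com"
git commit -q --allow-empty -m "initial commit"
base=$(git symbolic-ref --short HEAD)            # master or main
git checkout -q -b feature
echo "fix" > fix.txt && git add fix.txt
git -c user.name="Contributor" -c user.email="contrib@example.com" \
    commit -q -m "wip: first attempt"
echo "more" >> fix.txt && git add fix.txt
git -c user.name="Contributor" -c user.email="contrib@example.com" \
    commit -q -m "wip: address review"
git checkout -q "$base"
git merge --squash -q feature        # stage the combined diff, no commit yet
git commit -q --author="Contributor <contrib@example.com>" \
    -m "HDFS-00000. Example fix. Contributed by Contributor."
git log -1 --format="author=%an committer=%cn"
```

The `--author` flag on the final commit is what keeps the credit with the contributor while the committer identity stays with whoever merged.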


Re: [DISCUSS] Merge HDFS-13891(RBF) to trunk

2019-06-10 Thread Iñigo Goiri
+1 from my side.


I also added a comment to HDFS-14268 explaining the reasons to modify
ECBlockGroupStats.
We can follow up into possible changes there if needed.

On Mon, Jun 10, 2019 at 8:10 AM Arpit Agarwal 
wrote:

> I scanned the merge payload for changes to non-RBF code. The changes are
> minimal, which is good.
>
> The only commit that I didn’t understand was:
> https://issues.apache.org/jira/browse/HDFS-14268 <
> https://issues.apache.org/jira/browse/HDFS-14268>
>
> The jira description doesn’t make it clear why ECBlockGroupStats is
> modified.
>
> +0 apart from that.
>
>
> > On Jun 1, 2019, at 8:40 PM, Brahma Reddy Battula 
> wrote:
> >
> > Dear Hadoop Developers
> >
> > I would like to propose merging the RBF branch (HDFS-13891) into trunk.
> > We have been working on this feature for the last several months.
> > This feature work received contributions from different companies. All of
> > the feature development happened smoothly and collaboratively in JIRAs.
> >
> > Kindly do take a look at the branch and raise issues/concerns that need
> to
> > be addressed before the merge.
> >
> > *Highlights of HDFS-13891 Branch:*
> > =
> >
> > Adding Security to RBF(1)
> > Adding Missing Client API's(2)
> > Improvements/Bug Fixing
> >  Critical - HDFS-13637, HDFS-13834
> >
> > *Commits:*
> > 
> >
> > No of JIRAs Resolved: 72
> >
> > All these commits are in the RBF module. No changes in hdfs/common.
> >
> > *Tested Cluster:*
> > =
> >
> > Most of these changes were verified at Uber, Microsoft, Huawei, and some
> > other companies.
> >
> > *Uber*: Most changes are running in production @Uber, including the
> > critical security changes. HDFS clusters are 4000+ nodes with 8 HDFS
> > Routers. ZooKeeper as a state store holding delegation tokens was also
> > stress tested with more than 2 million tokens. --CR Hota
> >
> > *Microsoft*: Most of these changes are currently running in production at
> > Microsoft. The security has also been tested in a 500-server cluster with
> > 4 subclusters. --Inigo Goiri
> >
> > *Huawei*: Deployed all these changes in a 20-node cluster with 3
> > routers. Planning to deploy to a 10K-node production cluster.
> >
> > *Contributors:*
> > ===
> >
> > Many thanks to Akira Ajisaka, Mohammad Arshad, Takanobu Asanuma, Shubham
> > Dewan, CR Hota, Fei Hui, Inigo Goiri, Dibyendu Karmakar, Fengna Li, Gang
> > Li, Surendra Singh Lihore, Ranith Sardar, Ayush Saxena, He Xiaoqiao,
> > Sherwood Zheng, Daryn Sharp, VinayaKumar B, and Anu Engineer for joining
> > the discussions and contributing to this.
> >
> > *Future Tasks:*
> > 
> >
> > We will clean up the JIRAs under this umbrella and continue the work.
> >
> > Reference:
> > 1) https://issues.apache.org/jira/browse/HDFS-13532
> > 2) https://issues.apache.org/jira/browse/HDFS-13655
> >
> >
> >
> >
> > --Brahma Reddy Battula
>
>


Re: [DISCUSS] Merge HDFS-13891(RBF) to trunk

2019-06-03 Thread Iñigo Goiri
Thank you Brahma for pushing this.

As you mentioned, we have already taken most of the changes into production.
I want to highlight that the main contribution is the addition of security.
We have been able to test this at a smaller scale (~500 servers and 4
subclusters) and the performance is great with our current ZooKeeper
deployment.
I would also like to highlight that all the changes are constrained to
hadoop-hdfs-rbf and there are no differences in common or HDFS.

+1 on merging

Inigo

On Sun, Jun 2, 2019 at 10:19 PM Akira Ajisaka  wrote:

> Thanks Brahma for starting the discussion.
> I'm +1 for merging this.
>
> FYI: At Yahoo! JAPAN, we deployed all these changes in a 20-node cluster
> with 2 routers (not in production) and are running several tests.
>
> Regards,
> Akira
>
> On Sun, Jun 2, 2019 at 12:40 PM Brahma Reddy Battula 
> wrote:
> >
> > Dear Hadoop Developers
> >
> > I would like to propose merging the RBF branch (HDFS-13891) into trunk.
> > We have been working on this feature for the last several months.
> > This feature work received contributions from different companies. All of
> > the feature development happened smoothly and collaboratively in JIRAs.
> >
> > Kindly do take a look at the branch and raise issues/concerns that need
> to
> > be addressed before the merge.
> >
> > *Highlights of HDFS-13891 Branch:*
> > =
> >
> > Adding Security to RBF(1)
> > Adding Missing Client API's(2)
> > Improvements/Bug Fixing
> >   Critical - HDFS-13637, HDFS-13834
> >
> > *Commits:*
> > 
> >
> > No of JIRAs Resolved: 72
> >
> > All these commits are in the RBF module. No changes in hdfs/common.
> >
> > *Tested Cluster:*
> > =
> >
> > Most of these changes were verified at Uber, Microsoft, Huawei, and some
> > other companies.
> >
> > *Uber*: Most changes are running in production @Uber, including the
> > critical security changes. HDFS clusters are 4000+ nodes with 8 HDFS
> > Routers. ZooKeeper as a state store holding delegation tokens was also
> > stress tested with more than 2 million tokens. --CR Hota
> >
> > *Microsoft*: Most of these changes are currently running in production at
> > Microsoft. The security has also been tested in a 500-server cluster with
> > 4 subclusters. --Inigo Goiri
> >
> > *Huawei*: Deployed all these changes in a 20-node cluster with 3
> > routers. Planning to deploy to a 10K-node production cluster.
> >
> > *Contributors:*
> > ===
> >
> > Many thanks to Akira Ajisaka, Mohammad Arshad, Takanobu Asanuma, Shubham
> > Dewan, CR Hota, Fei Hui, Inigo Goiri, Dibyendu Karmakar, Fengna Li, Gang
> > Li, Surendra Singh Lihore, Ranith Sardar, Ayush Saxena, He Xiaoqiao,
> > Sherwood Zheng, Daryn Sharp, VinayaKumar B, and Anu Engineer for joining
> > the discussions and contributing to this.
> >
> > *Future Tasks:*
> > 
> >
> > We will clean up the JIRAs under this umbrella and continue the work.
> >
> > Reference:
> > 1) https://issues.apache.org/jira/browse/HDFS-13532
> > 2) https://issues.apache.org/jira/browse/HDFS-13655
> >
> >
> >
> >
> > --Brahma Reddy Battula
>
>
>


Re: [VOTE] Unprotect HDFS-13891 (HDFS RBF Branch)

2019-05-13 Thread Iñigo Goiri
Syncing the branch to trunk should be a fairly standard task.
Is there a way to do this without rebasing and forcing the push?
As far as I know this has been the standard for other branches and I don't
know of any alternative.
We should clarify the process as having to get PMC consensus to rebase a
branch seems a little overkill to me.

+1 from my side to unprotect the branch to do the rebase.

On Mon, May 13, 2019, 22:46 Brahma Reddy Battula  wrote:

> Hi Folks,
>
> INFRA-18181 made all the Hadoop branches protected.
> Unfortunately, the HDFS-13891 branch needs to be rebased as we contribute
> core patches to trunk. So, currently we are stuck with the rebase, as it’s
> not allowed to force push. Hence I raised INFRA-18361.
>
> Can we have a quick vote for INFRA sign-off to proceed, as this is blocking
> all branch commits?
>
> --
>
>
>
> --Brahma Reddy Battula
>
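For context on why the force push comes up at all: rebasing a branch onto trunk rewrites its commit ids, so a plain push is rejected and a force push (which protected branches block) is needed. A minimal, self-contained sketch with illustrative names:

```shell
set -e
# Show that rebasing onto trunk rewrites the feature branch's commits.
workdir=$(mktemp -d) && cd "$workdir"
git init -q repo && cd repo
git config user.name "Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "base"
trunk=$(git symbolic-ref --short HEAD)     # master or main
git checkout -q -b HDFS-13891              # the feature branch
echo "feature" > feature.txt && git add feature.txt
git commit -q -m "feature work"
before=$(git rev-parse HEAD)
git checkout -q "$trunk"
echo "trunk" > trunk.txt && git add trunk.txt
git commit -q -m "trunk moves on"
git checkout -q HDFS-13891
git rebase -q "$trunk"                     # replays "feature work" on the new base
after=$(git rev-parse HEAD)
# The commit id changed, so pushing HDFS-13891 now requires --force
# (or --force-with-lease), which a protected branch rejects.
[ "$before" != "$after" ] && echo "history rewritten by rebase"
```

This is why branch syncs on protected feature branches need either the protection lifted temporarily or a merge-based sync instead of a rebase.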


Re: [VOTE]: Support for RBF data locality Solution

2019-04-11 Thread Iñigo Goiri
Thanks Hexiaoqiao for starting the vote.
As I said in the JIRA, I prefer Approach A.

I wanted to bring in a broader audience, as this has changes in RBF, HDFS,
and Common.
I think adding a new optional field to the RPC header should be lightweight
enough.
The idea of passing a proxied client is already available in places like
UGI, but not at this level.
I haven't been able to figure out other uses, but maybe other applications
could take advantage of this new field.

Please, raise any concerns regarding any of the 3 approaches proposed.

On Wed, Apr 10, 2019 at 11:53 PM Akira Ajisaka  wrote:

> The Approach A looks good to me.
>
> Thanks,
> Akira
>
> On Thu, Apr 11, 2019 at 2:30 PM Xiaoqiao He  wrote:
> >
> > Hi folks,
> >
> > The current implementation of RBF is not aware of data locality,
> > since the NameNode cannot get the real client hostname by invoking
> > Server#getRemoteAddress when the RPC request is forwarded by the Router
> > to the NameNode. This leads to several challenges, for instance:
> >
> >- a. The client may have to do a remote read instead of a local read;
> >short-circuit reads cannot be used in most cases.
> >- b. The block placement policy cannot run as expected based on the
> >defined rack awareness, so local-node writes are lost.
> >
> > After discussion, there are several different solutions to the data
> > locality issue; some of them change the RPC protocol, so we look forward
> > to further suggestions and votes. HDFS-13248 is tracking the issue.
> >
> >- Approach A: Change the IPC/RPC layer protocol
> >(IpcConnectionContextProto or RpcHeader#RpcRequestHeaderProto) and add
> >an extra field for the client hostname. The new field is optional,
> >generally set only by the Router and parsed by the NameNode. This
> >approach is compatible, and the client needs no changes.
> >- Approach B: Change ClientProtocol and add extra
> >create/append/getBlockLocations interfaces with an additional parameter
> >for the client hostname. As with approach A, it is set by the Router
> >and parsed by the NameNode, and it is also compatible.
> >- Approach C: Solve write and read locality separately based on the
> >current interfaces with no protocol changes: for writes, hack the
> >client hostname in as one of the favored nodes for addBlock; for reads,
> >reorder targets at the Router after the NameNode returns the result.
> >
> > Per the discussion and evaluation in HDFS-13248, we prefer to change the
> > IPC/RPC layer protocol to support RPC data locality. We welcome more
> > suggestions and votes, or just give us feedback to push this feature
> > forward. Thanks.
> >
> > Best Regards,
> > Hexiaoqiao
> >
> > reference
> > [1] https://issues.apache.org/jira/browse/HDFS-13248
> > [2] https://issues.apache.org/jira/browse/HDFS-10467
> >
> > [3] https://issues.apache.org/jira/browse/HDFS-12615
>
>
>


Re: [VOTE] Moving branch-2 precommit/nightly test builds to java 8

2019-02-05 Thread Iñigo Goiri
+1

On Tue, Feb 5, 2019 at 8:22 AM Masatake Iwasaki 
wrote:

> +1
>
> Masatake Iwasaki
>
> On 2/4/19 18:13, Jonathan Hung wrote:
> > Hello,
> >
> > Starting a vote based on the discuss thread [1] for moving branch-2
> > precommit/nightly test builds to openjdk8. After this change, the test
> > phase for precommit builds [2] and branch-2 nightly build [3] will run on
> > openjdk8. To maintain source compatibility, these builds will still run
> > their compile phase for branch-2 on openjdk7 as they do now (in addition
> to
> > compiling on openjdk8).
> >
> > Vote will run for three business days until Thursday Feb 7 6:00PM PDT.
> >
> > [1]
> >
> https://lists.apache.org/thread.html/7e6fb28fc67560f83a2eb62752df35a8d58d86b2a3df4cacb5d738ca@%3Ccommon-dev.hadoop.apache.org%3E
> >
> > [2]
> >
> https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HADOOP-Build/
> > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-HDFS-Build/
> > https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-YARN-Build/
> >
> https://builds.apache.org/view/H-L/view/Hadoop/job/PreCommit-MAPREDUCE-Build/
> >
> > [3]
> >
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/
> >
> > Jonathan Hung
> >
>
>
>
>


Re: [VOTE] Release Apache Hadoop 3.2.0 - RC1

2019-01-15 Thread Iñigo Goiri
+1 (binding)

- Deployed a cluster with 3 NNs, 3 RMs and 1 DN/NM on Azure
- Tested the Active probe for the Load Balancer in front of the NNs and the
RMs
- Checked the NN, RBF, and RM Web UIs
- Executed a TeraGen, TeraSort, and TeraValidate
- Executed a YARN service with a TensorFlow app on Docker
- Scaled up the cluster to 5 DNs/NMs and executed the tests again
- Checked the HDFS output folders through the Web UI

On Tue, Jan 15, 2019 at 9:05 AM Wangda Tan  wrote:

> +1 (Binding).
>
> Deployed a local cluster from binary, and ran some sample sanity jobs.
>
> Thanks Sunil for driving the release.
>
> Best,
> Wangda
>
>
> On Mon, Jan 14, 2019 at 11:26 AM Virajith Jalaparti 
> wrote:
>
> > Thanks Sunil and others who have worked on making this release happen!
> >
> > +1 (non-binding)
> >
> > - Built from source
> > - Deployed a pseudo-distributed one node cluster
> > - Ran basic wordcount, sort, pi jobs
> > - Basic HDFS/WebHDFS commands
> > - Ran all the ABFS driver tests against an ADLS Gen 2 account in EAST US
> >
> > Non-blockers (AFAICT): The following tests in ABFS (HADOOP-15407) fail:
> > - For ACLs ({{ITestAzureBlobFilesystemAcl}}) -- However, I believe these
> > have been fixed in trunk.
> > - {{ITestAzureBlobFileSystemE2EScale#testWriteHeavyBytesToFileAcrossThreads}}
> > fails with an OutOfMemoryError exception. I see the same failure on trunk
> > as well.
> >
> >
> > On Mon, Jan 14, 2019 at 6:21 AM Elek, Marton  wrote:
> >
> >> Thanks Sunil to manage this release.
> >>
> >> +1 (non-binding)
> >>
> >> 1. built from the source (with clean local maven repo)
> >> 2. verified signatures + checksum
> >> 3. deployed 3 node cluster to Google Kubernetes Engine with generated
> >> k8s resources [1]
> >> 4. Executed basic HDFS commands
> >> 5. Executed basic yarn example jobs
> >>
> >> Marton
> >>
> >> [1]: FTR: resources:
> >> https://github.com/flokkr/k8s/tree/master/examples/hadoop , generator:
> >> https://github.com/elek/flekszible
> >>
> >>
> >> On 1/8/19 12:42 PM, Sunil G wrote:
> >> > Hi folks,
> >> >
> >> >
> >> > Thanks to all of you who helped in this release [1] and for helping to
> >> vote
> >> > for RC0. I have created second release candidate (RC1) for Apache
> Hadoop
> >> > 3.2.0.
> >> >
> >> >
> >> > Artifacts for this RC are available here:
> >> >
> >> > http://home.apache.org/~sunilg/hadoop-3.2.0-RC1/
> >> >
> >> >
> >> > RC tag in git is release-3.2.0-RC1.
> >> >
> >> >
> >> >
> >> > The maven artifacts are available via repository.apache.org at
> >> >
> >>
> https://repository.apache.org/content/repositories/orgapachehadoop-1178/
> >> >
> >> >
> >> > This vote will run 7 days (5 weekdays), ending on 14th Jan at 11:59 pm
> >> PST.
> >> >
> >> >
> >> >
> >> > 3.2.0 contains 1092 [2] fixed JIRA issues since 3.1.0. Below feature
> >> > additions
> >> >
> >> > are the highlights of this release.
> >> >
> >> > 1. Node Attributes Support in YARN
> >> >
> >> > 2. Hadoop Submarine project for running Deep Learning workloads on
> YARN
> >> >
> >> > 3. Support service upgrade via YARN Service API and CLI
> >> >
> >> > 4. HDFS Storage Policy Satisfier
> >> >
> >> > 5. Support Windows Azure Storage - Blob file system in Hadoop
> >> >
> >> > 6. Phase 3 improvements for S3Guard and Phase 5 improvements S3a
> >> >
> >> > 7. Improvements in Router-based HDFS federation
> >> >
> >> >
> >> >
> >> > Thanks to Wangda, Vinod, Marton for helping me in preparing the
> release.
> >> >
> >> > I have done some testing with my pseudo cluster. My +1 to start.
> >> >
> >> >
> >> >
> >> > Regards,
> >> >
> >> > Sunil
> >> >
> >> >
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://lists.apache.org/thread.html/68c1745dcb65602aecce6f7e6b7f0af3d974b1bf0048e7823e58b06f@%3Cyarn-dev.hadoop.apache.org%3E
> >> >
> >> > [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in
> (3.2.0)
> >> > AND fixVersion not in (3.1.0, 3.0.0, 3.0.0-beta1) AND status =
> Resolved
> >> > ORDER BY fixVersion ASC
> >> >
> >>
> >>
> >>
>


Re: [VOTE] Release Apache Hadoop 3.2.0 - RC0

2018-11-27 Thread Iñigo Goiri
+1 (non-binding)

- Deployed a cluster with 3 NNs, 3 RMs and 1 DN/NM on Azure
- Tested the Active probe for the Load Balancer in front of the NNs and the
RMs
- Checked the NN, RBF, and RM Web UIs
- Executed a wordcount, TeraGen, TeraSort, and TeraValidate
- Executed a YARN service with a TensorFlow app on Docker
- Scaled up the cluster to 5 DNs/NMs and executed the tests again
- Checked the HDFS output folders through the Web UI



On Tue, Nov 27, 2018 at 9:37 AM Gabor Bota 
wrote:

> Thanks for the work Sunil!
>
> +1 (non-binding)
>
> I've done the following:
>
>- checked out git tag release-3.2.0-RC0
>- built from source on Mac OS X 10.14.1, java: 8.0.181-oracle
>- run hadoop-aws tests on AWS S3 against eu-west-1, with no unknown
>issues
>- deployed on a 3 node cluster
>- verified example pi job
>- ran teragen, terasort and teravalidate without any error
>
>
> Regards,
> Gabor Bota
>
> On Tue, Nov 27, 2018 at 5:56 PM Eric Badger 
> wrote:
>
> > +1 (non-binding)
> >
> > - Verified all hashes and checksums
> > - Built from source on macOS 10.14.1, Java 1.8.0u65
> > - Deployed a pseudo cluster
> > - Ran some example jobs
> >
> > Eric
> >
> > On Tue, Nov 27, 2018 at 7:02 AM Ayush Saxena  wrote:
> >
> > > Sunil, Thanks for driving this release!!!
> > >
> > > +1 (non-binding)
> > > -Built from source on Ubuntu 17.04 and JDK 8
> > > -Ran basic Hdfs Commands
> > > -Ran basic EC Commands
> > > -Ran basic RBF commands
> > > -Browsed HDFS and RBF WEB UI
> > > -Ran TeraGen/TeraSort
> > >
> > > Regards
> > > Ayush
> > >
> > > > On 27-Nov-2018, at 4:06 PM, Kitti Nanasi
>  > >
> > > wrote:
> > > >
> > > > Thanks Sunil for the driving the release!
> > > >
> > > > +1 (non-binding)
> > > >
> > > > - checked out git tag release-3.2.0-RC0
> > > > - built from source on Mac OS X 10.13.4, java version 8.0.172-zulu
> > > > - deployed on a 5 node cluster
> > > > - ran terasort, teragen, teravalidate with success
> > > > - executed basic hdfs, dfsadmin and ec commands
> > > >
> > > > Best,
> > > > Kitti
> > > >
> > > >> On Fri, Nov 23, 2018 at 1:07 PM Sunil G  wrote:
> > > >>
> > > >> Hi folks,
> > > >>
> > > >>
> > > >>
> > > >> Thanks to all contributors who helped in this release [1]. I have
> > > created
> > > >>
> > > >> first release candidate (RC0) for Apache Hadoop 3.2.0.
> > > >>
> > > >>
> > > >> Artifacts for this RC are available here:
> > > >>
> > > >> http://home.apache.org/~sunilg/hadoop-3.2.0-RC0/
> > > >>
> > > >>
> > > >>
> > > >> RC tag in git is release-3.2.0-RC0.
> > > >>
> > > >>
> > > >>
> > > >> The maven artifacts are available via repository.apache.org at
> > > >>
> > > >>
> > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1174/
> > > >>
> > > >>
> > > >> This vote will run 7 days (5 weekdays), ending on Nov 30 at 11:59 pm
> > > PST.
> > > >>
> > > >>
> > > >>
> > > >> 3.2.0 contains 1079 [2] fixed JIRA issues since 3.1.0. Below feature
> > > >> additions
> > > >>
> > > >> are the highlights of this release.
> > > >>
> > > >> 1. Node Attributes Support in YARN
> > > >>
> > > >> 2. Hadoop Submarine project for running Deep Learning workloads on
> > YARN
> > > >>
> > > >> 3. Support service upgrade via YARN Service API and CLI
> > > >>
> > > >> 4. HDFS Storage Policy Satisfier
> > > >>
> > > >> 5. Support Windows Azure Storage - Blob file system in Hadoop
> > > >>
> > > >> 6. Phase 3 improvements for S3Guard and Phase 5 improvements S3a
> > > >>
> > > >> 7. Improvements in Router-based HDFS federation
> > > >>
> > > >>
> > > >>
> > > >> Thanks to Wangda, Vinod, Marton for helping me in preparing the
> > release.
> > > >>
> > > >> I have done few testing with my pseudo cluster. My +1 to start.
> > > >>
> > > >>
> > > >>
> > > >> Regards,
> > > >>
> > > >> Sunil
> > > >>
> > > >>
> > > >>
> > > >> [1]
> > > >>
> > > >>
> > > >>
> > >
> >
> https://lists.apache.org/thread.html/68c1745dcb65602aecce6f7e6b7f0af3d974b1bf0048e7823e58b06f@%3Cyarn-dev.hadoop.apache.org%3E
> > > >>
> > > >> [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in
> > (3.2.0)
> > > >> AND fixVersion not in (3.1.0, 3.0.0, 3.0.0-beta1) AND status =
> > Resolved
> > > >> ORDER BY fixVersion ASC
> > > >>
> > >
> > > -
> > > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> > >
> > >
> >
>


Re: [VOTE] Release Apache Hadoop 2.9.2 (RC0)

2018-11-19 Thread Iñigo Goiri
+1 (non-binding)
- Installed a full cluster from the tgz on Azure:
-- 2 NNs in HA.
-- 2 RMs in HA.
-- 2 Routers for RBF.
-- One worker with NM and DN.
- Verified Web UIs.
- Executed Teragen/Terasort/Teravalidate through RBF.
- Scaled up the cluster from 1 to 10 workers and executed the jobs again.


On Mon, Nov 19, 2018 at 9:19 AM Eric Payne 
wrote:

>   +1 (binding)
> -- Built from source
> -- Installed on 6-node pseudo cluster
> -- Tested intra-/inter-queue preemption, user weights
> -- Ran streaming jobs, word count, and teragen/sort tests
> Thanks Akira for all of the hard work.
> -Eric Payne
>
>
>
>
>
> On Tuesday, November 13, 2018, 7:02:51 PM CST, Akira Ajisaka <
> aajis...@apache.org> wrote:
>
>  Hi folks,
>
> I have put together a release candidate (RC0) for Hadoop 2.9.2. It
> includes 204 bug fixes and improvements since 2.9.1. [1]
>
> The RC is available at http://home.apache.org/~aajisaka/hadoop-2.9.2-RC0/
> Git signed tag is release-2.9.2-RC0 and the checksum is
> 826afbeae31ca687bc2f8471dc841b66ed2c6704
> The maven artifacts are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1166/
>
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Please try the release and vote. The vote will run for 5 days.
>
> [1] https://s.apache.org/2.9.2-fixed-jiras
>
> Thanks,
> Akira
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: [VOTE] Release Apache Hadoop 3.1.0 (RC1)

2018-04-03 Thread Iñigo Goiri
+1 (non binding)

* Deployed with 4 subclusters with HDFS Router-based federation.
* Executed DistCp across subclusters through the Router
* Checked documentation and tgz
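A DistCp across subclusters through the Router is just a regular DistCp whose source and destination both resolve against the federated namespace. The router address and paths below are made-up placeholders (8888 is the Router's default RPC port):

```shell
# Hypothetical endpoint: the Router exposes a single ClientProtocol
# endpoint, so DistCp needs no federation-specific flags.
ROUTER=hdfs://router-host:8888

hadoop distcp "$ROUTER/data/app1/input" "$ROUTER/data/app2/input"
```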

On Tue, Apr 3, 2018 at 4:25 PM, Vinod Kumar Vavilapalli 
wrote:

> We vote on the source code. The binaries are convenience artifacts.
>
> This is what I would do - (a) Just replace both the maven jars as well as
> the binaries to be consistent and correct. And then (b) Give a couple more
> days for folks who tested on the binaries to reverify - I count one such
> clear vote as of now.
>
> Thanks
> +Vinod
>
> > On Apr 3, 2018, at 3:30 PM, Wangda Tan  wrote:
> >
> > HI Arpit,
> >
> > I think it won't match if we do rebuild. It should be fine as far as
> they're signed, correct? I don't see any policy doesn't allow this.
> >
> > Thanks,
> > Wangda
> >
> >
> > On Tue, Apr 3, 2018 at 9:33 AM, Arpit Agarwal wrote:
> > Thanks Wangda, I see the shaded jars now.
> >
> > Are the repo jars required to be the same as the binary release? They
> don’t match right now, probably they got rebuilt.
> >
> > +1 (binding), modulo that remaining question.
> >
> > * Verified signatures
> > * Verified checksums for source and binary artefacts
> > * Sanity checked jars on r.a.o.
> > * Built from source
> > * Deployed to 3 node secure cluster with NameNode HA
> > * Verified HDFS web UIs
> > * Tried out HDFS shell commands
> > * Ran sample MapReduce jobs
> >
> > Thanks!
> >
> >
> > --
> > From: Wangda Tan <wheele...@gmail.com>
> > Date: Monday, April 2, 2018 at 9:25 PM
> > To: Arpit Agarwal <aagar...@hortonworks.com>
> > Cc: Gera Shegalov <ger...@gmail.com>, Sunil G <sun...@apache.org>,
> > "yarn-...@hadoop.apache.org" <yarn-...@hadoop.apache.org>, Hdfs-dev
> > <hdfs-dev@hadoop.apache.org>, Hadoop Common <common-...@hadoop.apache.org>,
> > "mapreduce-...@hadoop.apache.org" <mapreduce-...@hadoop.apache.org>,
> > Vinod Kumar Vavilapalli <vino...@apache.org>
> > Subject: Re: [VOTE] Release Apache Hadoop 3.1.0 (RC1)
> >
> > As pointed by Arpit, the previously deployed shared jars are incorrect.
> > Just redeployed jars and staged. @Arpit, could you please check the updated
> > Maven repo? https://repository.apache.org/content/repositories/orgapachehadoop-1092
> >
> > Since the jars inside binary tarballs are correct
> > (http://people.apache.org/~wangda/hadoop-3.1.0-RC1/), I think we don't
> > need to roll another RC; just updating the Maven repo should be sufficient.
> >
> > Best,
> > Wangda
> >
> >
> > On Mon, Apr 2, 2018 at 2:39 PM, Wangda Tan wrote:
> > Hi Arpit,
> >
> > Thanks for pointing out this.
> >
> > I just removed all .md5 files from artifacts. I found md5 checksums
> still exist in .mds files and I didn't remove them from .mds file because
> it is generated by create-release script and Apache guidance is "should
> not" instead of "must not". Please let me know if you think they need to be
> removed as well.
> >
> > - Wangda
> >
> >
> >
> > On Mon, Apr 2, 2018 at 1:37 PM, Arpit Agarwal <aagar...@hortonworks.com> wrote:
> > Thanks for putting together this RC, Wangda.
> >
> > The guidance from Apache is to omit MD5s, specifically:
> >   > SHOULD NOT supply a MD5 checksum file (because MD5 is too broken).
> >
> > https://www.apache.org/dev/release-distribution#sigs-and-sums
> >
> >
> >
> >
> > On Apr 2, 2018, at 7:03 AM, Wangda Tan wrote:
> >
> > Hi Gera,
> >
> > It's my bad, I thought only src/bin tarball is enough.
> >
> > I just uploaded all other things under artifact/ to
> > http://people.apache.org/~wangda/hadoop-3.1.0-RC1/
> >
> > Please let me know if you have any other comments.
> >
> > Thanks,
> > Wangda
> >
> >
> > On Mon, Apr 2, 2018 at 12:50 AM, Gera Shegalov wrote:
> >
> >
> > Thanks, Wangda!
> >
> > There are many more artifacts in previous votes, e.g., see
> > http://home.apache.org/~junping_du/hadoop-2.8.3-RC0/ . Among others, the
> > site tarball is missing.
> >
> > On Sun, Apr 1, 2018 at 11:54 PM, Sunil G wrote:
> >
> >
> > Thanks Wangda for initiating the release.
> >
> > I tested this RC built from source file.
> >
> >
> >   - Tested MR apps (sleep, wc) and verified both new YARN UI and old RM
> > UI.
> >   - Below fea

Re: Apache Hadoop qbt Report: trunk+JDK8 on Windows/x64

2018-03-15 Thread Iñigo Goiri
Thank you very much Allen for making the Windows build work again.
We are going through the unit tests and fixing them for Windows (as you
said mostly paths).
We got a couple related patches in already; this will help us track
progress.
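Most of these Windows failures come down to tests that hard-code "/" when building paths (and, as Allen notes below, to winutils overhead). A minimal, illustrative sketch, not actual Hadoop test code, of the portable alternative using the Java platform APIs:

```java
import java.io.File;
import java.nio.file.Paths;

// Illustrative only: build test paths with java.nio.file instead of
// concatenating a hard-coded "/", which breaks path handling on Windows.
public class PortableTestPaths {

    // Joins path components using the platform's separator.
    public static String dataDir(String base, String... parts) {
        return Paths.get(base, parts).toString();
    }

    public static void main(String[] args) {
        // Prints target/test/data on Linux and target\test\data on Windows.
        System.out.println(dataDir("target", "test", "data"));
        System.out.println("separator: " + File.separator);
    }
}
```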

On Thu, Mar 15, 2018 at 10:15 AM, Allen Wittenauer  wrote:

>
> For my part of the HDFS bug bash, I’ve gotten the ASF Windows
> build working again. Starting tomorrow, results will be sent to the *-dev
> lists.
>
> A few notes:
>
> * It only runs the unit tests.  There’s not much point in running the
> other Yetus plugins since those are covered by the Linux one and this build
> is slow enough as it is.
>
> * There are two types of ASF build nodes: Windows Server 2012 and Windows
> Server 2016. This job can run on both and will use whichever one has a free
> slot.
>
> * It ALWAYS applies HADOOP-14667.05.patch prior to running.  As a result,
> this is only set up for trunk with no parameterization to run other
> branches.
>
> * The URI handling for file paths in hadoop-common and elsewhere is pretty
> broken on Windows, so many many many unit tests are failing and I wouldn't
> be surprised if Windows hadoop installs are horked as a result.
>
> * Runtime is about 12-13 hours with many tests taking significantly longer
> than their UNIX counterparts.  My guess is that this caused by winutils.
> Changing from winutils to Java 7 API calls would get this more in line and
> be a significant performance boost for Windows clients/servers as well.
>
> Have fun.
>
> =
>
> For more details, see https://builds.apache.org/job/hadoop-trunk-win/406/
> 
>
> [Mar 14, 2018 6:26:58 PM] (xyao) HDFS-13251. Avoid using hard coded
> datanode data dirs in unit tests.
> [Mar 14, 2018 8:05:24 PM] (jlowe) MAPREDUCE-7064. Flaky test
> [Mar 14, 2018 8:14:36 PM] (inigoiri) HDFS-13198. RBF:
> RouterHeartbeatService throws out CachedStateStore
> [Mar 14, 2018 8:36:53 PM] (wangda) Revert "HADOOP-13707. If kerberos is
> enabled while HTTP SPNEGO is not
> [Mar 14, 2018 10:47:56 PM] (fabbri) HADOOP-15278 log s3a at info.
> Contributed by Steve Loughran.
>
>
>
>
> -1 overall
>
>
> The following subsystems voted -1:
>unit
>
>
> The following subsystems are considered long running:
> (runtime bigger than 1h 00m 00s)
>unit
>
>
> Specific tests:
>
>Failed CTEST tests :
>
>   test_test_libhdfs_threaded_hdfs_static
>
>Failed junit tests :
>
>   hadoop.crypto.TestCryptoStreamsWithOpensslAesCtrCryptoCodec
>   hadoop.fs.contract.rawlocal.TestRawlocalContractAppend
>   hadoop.fs.TestFsShellCopy
>   hadoop.fs.TestFsShellList
>   hadoop.fs.TestLocalFileSystem
>   hadoop.http.TestHttpServer
>   hadoop.http.TestHttpServerLogs
>   hadoop.io.compress.TestCodec
>   hadoop.io.nativeio.TestNativeIO
>   hadoop.ipc.TestSocketFactory
>   hadoop.metrics2.impl.TestStatsDMetrics
>   hadoop.metrics2.sink.TestRollingFileSystemSinkWithLocal
>   hadoop.security.TestSecurityUtil
>   hadoop.security.TestShellBasedUnixGroupsMapping
>   hadoop.security.token.TestDtUtilShell
>   hadoop.util.TestNativeCodeLoader
>   hadoop.fs.TestWebHdfsFileContextMainOperations
>   hadoop.hdfs.client.impl.TestBlockReaderLocalLegacy
>   hadoop.hdfs.crypto.TestHdfsCryptoStreams
>   hadoop.hdfs.qjournal.client.TestQuorumJournalManager
>   hadoop.hdfs.qjournal.server.TestJournalNode
>   hadoop.hdfs.qjournal.server.TestJournalNodeSync
>   hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks
>   hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped
>   hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages
>   hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks
>   hadoop.hdfs.server.blockmanagement.TestReplicationPolicy
>   hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
>   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles
>   hadoop.hdfs.server.datanode.fsdataset.impl.
> TestLazyPersistLockedMemory
>   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy
>   hadoop.hdfs.server.datanode.fsdataset.impl.
> TestLazyPersistReplicaPlacement
>   hadoop.hdfs.server.datanode.fsdataset.impl.
> TestLazyPersistReplicaRecovery
>   hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter
>   hadoop.hdfs.server.datanode.fsdataset.impl.TestProvidedImpl
>   hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation
>   hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica
>   hadoop.hdfs.server.datanode.TestBlockPoolSliceStorage
>   hadoop.hdfs.server.datanode.TestBlockRecovery
>   hadoop.hdfs.server.datanode.TestBlockScanner
>   hadoop.hdfs.server.datanode.TestDataNodeFaultInjector
>   hadoop.hdfs.server.datanode.TestDataNodeMetrics
>   hadoop.hdfs.server.datanode.TestDataNodeUUID
>   hadoop.hdfs.server.datanode.TestDataNodeVolu

Re: [VOTE] Merge HDFS-9806 to trunk

2017-12-13 Thread Iñigo Goiri
+1
I have been reviewing some of the latest patches.
I skimmed through the patch in HDFS-9806 and it looks good.

In addition, we have ported it to 2.7.1 (minor differences to what would be
merged).
It has been running in our test cluster for a couple months.
All the issues we have been finding are already resolved and committed to
the feature branch.
After this, we have recently deployed to three production clusters and it is
working as expected so far.

Thanks for the work Virajith and Chris; I'd like to see this merged into
trunk to make the maintainability easier.


On Wed, Dec 13, 2017 at 12:01 PM, Sean Mackrory 
wrote:

> +1 from me. There are some unrelated errors building the branch right now
> due to annotations in some YARN code, etc. but I was able to generate an fs
> image from an S3 bucket and serve the content through HDFS on a
> pseudo-distributed HDFS node this morning. Seems like a good point for a
> merge.
>
> On Wed, Dec 13, 2017 at 11:55 AM, Anu Engineer 
> wrote:
>
> > Hi Virajith / Chris/ Thomas / Ewan,
> >
> > Thanks for developing this feature and getting to merge state.
> > I would like to vote +1 for this merge. Thanks for all the hard work.
> >
> > Thanks
> > Anu
> >
> >
> > On 12/8/17, 7:11 PM, "Virajith Jalaparti"  wrote:
> >
> > Hi,
> >
> > We have tested the HDFS-9806 branch in two settings:
> >
> > (i) 26 node bare-metal cluster, with PROVIDED storage configured to
> > point
> > to another instance of HDFS (containing 468 files, total of ~400GB of
> > data). Half of the Datanodes are configured with only DISK volumes
> and
> > other other half have both DISK and PROVIDED volumes.
> > (ii) 8 VMs on Azure, with PROVIDED storage configured to point to a
> > WASB
> > account (containing 26,074 files and ~1.3TB of data). All Datanodes
> are
> > configured with DISK and PROVIDED volumes.
> >
> > (i) was tested using both the text-based alias map
> > (TextFileRegionAliasMap)
> > and the in-memory leveldb-based alias map (
> > InMemoryLevelDBAliasMapClient),
> > while (ii) was tested using the text-based alias map only.
> >
> > Steps followed:
> > (0) Build from apache/HDFS-9806. (Note that for the leveldb-based
> alias
> > map, the patch posted to HDFS-12912 needs to be
> > applied; we
> > will commit this to apache/HDFS-9806 after review).
> > (1) Generate the FSImage using the image generation tool with the
> > appropriate remote location (hdfs:// in (i) and wasb:// in (ii)).
> > (2) Bring up the HDFS cluster.
> > (3) Verify that the remote namespace is reflected correctly and data
> on
> > remote store can be accessed. Commands ran: ls, copyToLocal, fsck,
> > getrep,
> > setrep, getStoragePolicy
> > (4) Run Sort and Gridmix jobs on the data in the remote location with
> > the
> > input paths pointing to the local HDFS.
> > (5) Increase replication of the PROVIDED files and verified that
> local
> > (DISK) replicas were created for the PROVIDED replicas, using fsck.
> > (6) Verify that Provided storage capacity is shown correctly on the
> NN
> > and
> > Datanode Web-UI.
> > (7) Bring down datanodes, one by one. When all are down, verify NN
> > reports
> > all PROVIDED files as missing. Bringing back up any one Datanode
> makes
> > all
> > the data available.
> > (8) Restart NN and verify data is still accesible.
> > (9) Verify that Writes to local HDFS continue to work.
> > (10) Bring down all Datanodes except one. Start decommissioning the
> > remaining Datanode. Verify that the data in the PROVIDED storage is
> > still
> > accessible.
> >
> > Apart from the above, we ported the changes in HDFS-9806 to
> branch-2.7
> > and
> > deployed it on a ~800 node cluster as one of the sub-clusters in a
> > Router-based Federated HDFS of nearly 4000 nodes (with help from
> Inigo
> > Goiri). We mounted about 1000 files, 650TB of remote data
> (~2.6million
> > blocks with 256MB block size) in this cluster using the text-based
> > alias
> > map. We verified that the basic commands (ls, copyToLocal, setrep)
> > work.
> > We also ran spark jobs against this cluster.
> >
> > -Virajith
> >
> >
> > On Fri, Dec 8, 2017 at 3:44 PM, Chris Douglas 
> > wrote:
> >
> > > Discussion thread: https://s.apache.org/kxT1
> > >
> > > We're down to the last few issues and are preparing the branch to
> > > merge to trunk. We'll post merge patches to HDFS-9806 [1]. Minor,
> > > "cleanup" tasks (checkstyle, findbugs, naming, etc.) will be
> tracked
> > > in HDFS-12712 [2].
> > >
> > > We've tried to ensure that when this feature is disabled, HDFS is
> > > unaffected. For those reviewing this, please look for places where
> > > this might add overheads and we'll address them before the merge.
> The
> > > site documentation [3] and design do

Re: [VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-12 Thread Iñigo Goiri
+1 (non-binding)

I tested it in a deployment with 24 nodes across 8 subclusters.
Tested a few jobs reading and writing data through HDFS Router-based
federation.
However, jobs failed to run when setting RBF as the default filesystem
because, after MAPREDUCE-6954, they try to invoke setErasureCodingPolicy,
which is not implemented.
I filed HDFS-12919 to track this, but I don't think it is a blocker.
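For context, pointing clients at RBF as the default filesystem is a core-site.xml change along these lines; the hostname is a placeholder and 8888 is the Router's default RPC port, so adjust both for your deployment:

```xml
<!-- Hypothetical example: make the Router the default filesystem. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://router-host:8888</value>
</property>
```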

Thanks,
Inigo

On Tue, Dec 12, 2017 at 2:43 PM, Elek, Marton  wrote:

> +1 (non-binding)
>
>  * built from the source tarball (archlinux) / verified signature
>  * Deployed to a kubernetes cluster (10/10 datanode/nodemanager pods)
>  * Enabled ec on hdfs directory (hdfs cli)
>  * Started example yarn jobs (pi/terragen)
>  * checked yarn ui/ui2
>
> Thanks for all the efforts.
>
> Marton
>
>
>
> On 12/08/2017 09:31 PM, Andrew Wang wrote:
>
>> Hi all,
>>
>> Let me start, as always, by thanking the efforts of all the contributors
>> who contributed to this release, especially those who jumped on the issues
>> found in RC0.
>>
>> I've prepared RC1 for Apache Hadoop 3.0.0. This release incorporates 302
>> fixed JIRAs since the previous 3.0.0-beta1 release.
>>
>> You can find the artifacts here:
>>
>> http://home.apache.org/~wang/3.0.0-RC1/
>>
>> I've done the traditional testing of building from the source tarball and
>> running a Pi job on a single node cluster. I also verified that the shaded
>> jars are not empty.
>>
>> Found one issue that create-release (probably due to the mvn deploy
>> change)
>> didn't sign the artifacts, but I fixed that by calling mvn one more time.
>> Available here:
>>
>> https://repository.apache.org/content/repositories/orgapachehadoop-1075/
>>
>> This release will run the standard 5 days, closing on Dec 13th at 12:31pm
>> Pacific. My +1 to start.
>>
>> Best,
>> Andrew
>>
>>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: [VOTE] Release Apache Hadoop 2.9.0 (RC3)

2017-11-15 Thread Iñigo Goiri
+1 (non-binding)

Deployed in a cluster with 48 nodes and 8 subclusters:

   - YARN federation
   - HDFS Router-based federation
   - Yarn UI 2

Executed a few Pi jobs using both HDFS and YARN federation.
Everything worked correctly.
The YARN UI 2 showed the jobs, etc.

On Wed, Nov 15, 2017 at 2:05 PM, Chris Douglas  wrote:

> +1 (binding)
>
> Verified source tarball. Checksum and signature match, built from
> source, ran some unit tests. Skimmed NOTICE/LICENSE. -C
>
> On Mon, Nov 13, 2017 at 4:10 PM, Arun Suresh  wrote:
> > Hi Folks,
> >
> > Apache Hadoop 2.9.0 is the first release of Hadoop 2.9 line and will be
> the
> > starting release for Apache Hadoop 2.9.x line - it includes 30 New
> Features
> > with 500+ subtasks, 407 Improvements, 790 Bug fixes new fixed issues
> since
> > 2.8.2.
> >
> > More information about the 2.9.0 release plan can be found here:
> > https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> >
> > New RC is available at: https://home.apache.org/~asuresh/hadoop-2.9.0-RC3/
> >
> > The RC tag in git is: release-2.9.0-RC3, and the latest commit id is:
> > 756ebc8394e473ac25feac05fa493f6d612e6c50.
> >
> > The maven artifacts are available via repository.apache.org at:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1068/
> >
> > We are carrying over the votes from the previous RC given that the delta
> is
> > the license fix.
> >
> > Given the above - we are also going to stick with the original deadline
> for
> > the vote : ending on Friday 17th November 2017 2pm PT time.
> >
> > Thanks,
> > -Arun/Subru
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: [VOTE] Merge Router-Based Federation (HDFS-10467) branch into trunk/branch-3

2017-10-06 Thread Iñigo Goiri
The 7 days have passed, and we got 3 +1s and no -1.
So the merge passes.
I'll go ahead with the merge.

It's a total of 21 patches.
Tested it locally and it works as expected with the latest trunk and
branch-3.
I'll go ahead and merge to both.

I'll close HDFS-10467 and start a new umbrella for the second phase.


Thanks for the votes!


On Tue, Oct 3, 2017 at 9:50 AM, Brahma Reddy Battula 
wrote:

> Thanks Inigo.
>
> +1 (binding)
>
> Nice feature.Involved in reviewing some jiras.
>
> On Sat, 30 Sep 2017 at 12:29 AM, Iñigo Goiri  wrote:
>
> > Hi all,
> >
> > Given that 3.0-beta1 is already cut, I’d like to call a vote for merging
> > Router-Based Federation (HDFS-10467) to trunk and branch-3.
> >
> > The vote will run for 7 days as usual.
> >
> >
> >
> > We started the discussion about merging HDFS-10467 a few weeks ago [1]
> and
> > got good feedback which we’ve incorporated already [2, 3, 4].
> >
> > There are a couple tasks left:
> >
> >- HDFS-12273 for the UI. This should be completed in the next couple
> >days.
> >- HDFS-12284 for adding security. We can move this for v2 if not
> >completed.
> >
> > We have deployed this in production for 2.7 and we did a few tests with
> > trunk a few months ago.
> >
> > This week, I’m rebasing to trunk (last one was a couple weeks ago) and
> test
> > trunk in one of our test clusters.
> >
> >
> > Finally, note that all the functionality is in the Router (a new
> component)
> > so everything is isolated.
> >
> > In addition, no new APIs have been added and we rely fully in
> > ClientProtocol.
> >
> >
> >
> > I’d like to thank the people at Microsoft (specially, Jason, Ricardo,
> > Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming and Gera),
> LinkedIn
> > (Zhe, Erik and Konstantin), and Cloudera (Andrew and Manoj) for
> > the discussion and the ideas.
> >
> > Special thanks to Chris Douglas for the thorough reviews!
> >
> >
> >
> > Regards,
> > Inigo
> >
> >
> >
> > [1]
> >
> > http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201708.mbox/%3CCAB1dGgogTu6kHtkkYeUycmNv-H3RupfPF4Cd7rpuFi6vHGdBLg%40mail.gmail.com%3E
> >
> > [2] https://issues.apache.org/jira/browse/HDFS-12381
> >
> > [3] https://issues.apache.org/jira/browse/HDFS-12430
> >
> > [4] https://issues.apache.org/jira/browse/HDFS-12450
> >
> --
>
>
>
> --Brahma Reddy Battula
>


[VOTE] Merge Router-Based Federation (HDFS-10467) branch into trunk/branch-3

2017-09-29 Thread Iñigo Goiri
Hi all,

Given that 3.0-beta1 is already cut, I’d like to call a vote for merging
Router-Based Federation (HDFS-10467) to trunk and branch-3.

The vote will run for 7 days as usual.



We started the discussion about merging HDFS-10467 a few weeks ago [1] and
got good feedback which we’ve incorporated already [2, 3, 4].

There are a couple tasks left:

   - HDFS-12273 for the UI. This should be completed in the next couple
   days.
   - HDFS-12284 for adding security. We can move this for v2 if not
   completed.

We have deployed this in production for 2.7 and we did a few tests with
trunk a few months ago.

This week, I’m rebasing to trunk (last one was a couple weeks ago) and test
trunk in one of our test clusters.


Finally, note that all the functionality is in the Router (a new component)
so everything is isolated.

In addition, no new APIs have been added and we rely fully on
ClientProtocol.



I’d like to thank the people at Microsoft (especially Jason, Ricardo,
Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming and Gera), LinkedIn
(Zhe, Erik and Konstantin), and Cloudera (Andrew and Manoj) for
the discussion and the ideas.

Special thanks to Chris Douglas for the thorough reviews!



Regards,
Inigo



[1]
http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201708.mbox/%3CCAB1dGgogTu6kHtkkYeUycmNv-H3RupfPF4Cd7rpuFi6vHGdBLg%40mail.gmail.com%3E

[2] https://issues.apache.org/jira/browse/HDFS-12381

[3] https://issues.apache.org/jira/browse/HDFS-12430

[4] https://issues.apache.org/jira/browse/HDFS-12450


Re: [DISCUSS] Looking to a 2.9.0 release

2017-09-07 Thread Iñigo Goiri
Hi Subru,
We are also discussing the merge of HDFS-10467 (Router-based federation)
and we would like to target 2.9 to do a full release together with YARN
federation.
Chris Douglas already arranged the integration into trunk for 3.0.0 GA.

Regarding the points to cover:
1. API compatibility: we just extend ClientProtocol so no changes in the
API.
2. Turning feature off: if the Router is not started, the feature is
disabled completely.
3. Stability/testing: the internal version is heavily tested. We will start
testing the OSS version soon. In any case, the feature is isolated and
minor bugs will not affect anyone other than the users of the feature.
4. Deployment: we are currently using 2.7.1 and we would like to switch to
2.9 when available.
5. Timeline: finishing the UI and the security JIRAs in HDFS-10467 should
give us a ready to use version. There will be small features added but
nothing major. There are a couple minor issues with the merge
(e.g., HDFS-12384) but should be worked out soon.

Thanks,
Inigo


On Tue, Sep 5, 2017 at 4:26 PM, Jonathan Hung  wrote:

> Hi Subru,
>
> Thanks for starting the discussion. We are targeting merging YARN-5734
> (API-based scheduler configuration) to branch-2 before the release of
> 2.9.0, since the feature is close to complete. Regarding the requirements
> for merge,
>
> 1. API compatibility - this feature adds new APIs, does not modify any
> existing ones.
> 2. Turning feature off - using the feature is configurable and is turned
> off by default.
> 3. Stability/testing - this is an RM-only change, so we plan on deploying
> this feature to a test RM and verifying configuration changes for capacity
> scheduler. (Right now fair scheduler is not supported.)
> 4. Deployment - we want to get this feature in to 2.9.0 since we want to
> use this feature and 2.9 version in our next upgrade.
> 5. Timeline - we have one main blocker which we are planning to resolve by
> end of week. The rest of the month will be testing then a merge vote on the
> last week of Sept.
>
> Please let me know if you have any concerns. Thanks!
>
>
> Jonathan Hung
>
> On Wed, Jul 26, 2017 at 11:23 AM, J. Rottinghuis 
> wrote:
>
> > Thanks Vrushali for being entirely open as to the current status of
> ATSv2.
> > I appreciate that we want to ensure things are tested at scale, and as
> you
> > said we are working on that right now on our clusters.
> > We have tested the feature to demonstrate it works at what we consider
> > moderate scale.
> >
> > I think the criteria for including this feature in the 2.9 release should
> > be if it can be safely turned off and not cause impact to anybody not
> using
> > the new feature. The confidence for this is high for timeline service v2.
> >
> > Therefore, I think timeline service v2 should definitely be part of 2.9.
> > That is the big draw for us to work on stabilizing a 2.9 release rather
> > than just going to 2.8 and back-porting things ourselves.
> >
> > Thanks,
> >
> > Joep
> >
> > On Tue, Jul 25, 2017 at 11:39 PM, Vrushali Channapattan <
> > vrushalic2...@gmail.com> wrote:
> >
> > > Thanks Subru for initiating this discussion.
> > >
> > > Wanted to share some thoughts in the context of Timeline Service v2.
> The
> > > current status of this module is that we are ramping up for a second
> > merge
> > > to trunk. We still have a few merge blocker jiras outstanding, which we
> > > think we will finish soon.
> > >
> > > While we have done some testing, we are yet to test at scale. Given all
> > > this, we were thinking of initially targeting a beta release vehicle
> > rather
> > > than a stable release.
> > >
> > > As such, timeline service v2 has branch-2 branch called as
> > > YARN-5355-branch-2 in case anyone wants to try it out. Timeline service
> > v2
> > > can be turned off and should not affect the cluster.
> > >
> > > thanks
> > > Vrushali
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Jul 24, 2017 at 1:26 PM, Subru Krishnan 
> > wrote:
> > >
> > > > Folks,
> > > >
> > > > With the release for 2.8, we would like to look ahead to 2.9 release
> as
> > > > there are many features/improvements in branch-2 (about 1062
> commits),
> > > that
> > > > are in need of a release vechile.
> > > >
> > > > Here's our first cut of the proposal from the YARN side:
> > > >
> > > >1. Scheduler improvements (decoupling allocation from node
> > heartbeat,
> > > >allocation ID, concurrency fixes, LightResource etc).
> > > >2. Timeline Service v2
> > > >3. Opportunistic containers
> > > >4. Federation
> > > >
> > > > We would like to hear a formal list from HDFS & Hadoop (& MapReduce
> if
> > > any)
> > > > and will update the Roadmap wiki accordingly.
> > > >
> > > > Considering our familiarity with the above mentioned YARN features,
> we
> > > > would like to volunteer as the co-RMs for 2.9.0.
> > > >
> > > > We want to keep the timeline at 8-12 weeks to keep the release
> > pragmatic.
> > > >
> > > > Feedback?
> > > >
> > > > -Subru/Arun
> > > >

Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-31 Thread Iñigo Goiri
Manoj, thanks for the comments.

I added the consolidated branch patch to HDFS-10467 (CC Brahma).

Regarding the comparison with existing approaches, I'd say that the real
comparison is ViewFs (already in the docs).
This is complementary to the current HDFS federation; you have multiple
namespaces and you need to aggregate them.

Regarding the best practices for the mount table, I think this is pretty
similar to what one would do in ViewFs.
Internally, what we do is just have every subcluster follow the same
naming as the federated namespace.
For example, if we mount /data/app1 in subcluster0, we mount it in
/data/app1 in the federated namespace.
Additionally, we are testing a Rebalancer that takes into consideration the
size of the mount table (based on the USENIX ATC paper).
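To make that convention concrete, here is a sketch using the Router's admin tool from the feature branch (the `dfsrouteradmin` subcommand is in the branch, but treat the exact flags shown here as assumptions that may differ in a given build):

```shell
# Mount each subcluster path at the identical path in the federated namespace
hdfs dfsrouteradmin -add /data/app1 subcluster0 /data/app1
hdfs dfsrouteradmin -add /data/app2 subcluster1 /data/app2

# Inspect the resulting mount table
hdfs dfsrouteradmin -ls /data
```

These commands require a running Router, so they are only a usage sketch.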

I can extend the documentation in HDFS-12381.


On Thu, Aug 31, 2017 at 4:52 PM, Iñigo Goiri  wrote:

> Agreed on this not being the cleanest..
> Just filed it this morning: HDFS-12384.
>
>
> On Thu, Aug 31, 2017 at 4:36 PM, Andrew Wang 
> wrote:
>
>> v) mvn install (and package) is failing with following error
>>>
>>> [INFO]   Adding ignore: *
>>> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
>>> failed with message:
>>> Duplicate classes found:
>>>
>>>   Found in:
>>> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-
>>> SNAPSHOT:compile
>>> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-SNAP
>>> SHOT:compile
>>>   Duplicate classes:
>>> org/apache/hadoop/shaded/org/apache/curator/framework/api/De
>>> leteBuilder.class
>>> org/apache/hadoop/shaded/org/apache/curator/framework/Curato
>>> rFramework.class
>>>
>>>
>>> I added "hadoop-client-minicluster" to ignore list to get success
>>>
>>> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>>>
>>>   <dependencies>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-annotations</artifactId>
>>>       <ignoreClasses>
>>>         <ignoreClass>*</ignoreClass>
>>>       </ignoreClasses>
>>>     </dependency>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-client-minicluster</artifactId>
>>>       <ignoreClasses>
>>>         <ignoreClass>*</ignoreClass>
>>>       </ignoreClasses>
>>>     </dependency>
>>>   </dependencies>
>>>
>>
>> Is there a JIRA filed for this issue? We should engage with Sean Busbey
>> on the right fix. I don't think it's right to exclude the minicluster from
>> this checking.
>>
>
>


Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-31 Thread Iñigo Goiri
Agreed on this not being the cleanest..
Just filed it this morning: HDFS-12384.


On Thu, Aug 31, 2017 at 4:36 PM, Andrew Wang 
wrote:

> v) mvn install (and package) is failing with following error
>>
>> [INFO]   Adding ignore: *
>> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
>> failed with message:
>> Duplicate classes found:
>>
>>   Found in:
>> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-beta1-
>> SNAPSHOT:compile
>> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-beta1-
>> SNAPSHOT:compile
>>   Duplicate classes:
>> org/apache/hadoop/shaded/org/apache/curator/framework/api/De
>> leteBuilder.class
>> org/apache/hadoop/shaded/org/apache/curator/framework/Curato
>> rFramework.class
>>
>>
>> I added "hadoop-client-minicluster" to ignore list to get success
>>
>> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>>
>>   <dependencies>
>>     <dependency>
>>       <groupId>org.apache.hadoop</groupId>
>>       <artifactId>hadoop-annotations</artifactId>
>>       <ignoreClasses>
>>         <ignoreClass>*</ignoreClass>
>>       </ignoreClasses>
>>     </dependency>
>>     <dependency>
>>       <groupId>org.apache.hadoop</groupId>
>>       <artifactId>hadoop-client-minicluster</artifactId>
>>       <ignoreClasses>
>>         <ignoreClass>*</ignoreClass>
>>       </ignoreClasses>
>>     </dependency>
>>   </dependencies>
>>
>
> Is there a JIRA filed for this issue? We should engage with Sean Busbey on
> the right fix. I don't think it's right to exclude the minicluster from
> this checking.
>


Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-28 Thread Iñigo Goiri
Brahma, thank you for the comments.
i) I can send a patch with the diff between branches.
ii) Working with Giovanni for the review.
iii) We had some numbers in our cluster.
iv) We could have a Router just for giving a view of all the namespaces
without giving RPC access. Another case might be allowing only WebHDFS
and not RPC. We could consolidate nevertheless.
I will open a JIRA to extend the documentation with the configuration keys.
v) I'm open to doing more tests. I think the guys from LinkedIn wanted to test
some more frameworks in their dev setup. In addition, before merging, I'd
run the version in trunk for a few days.
v) Good catches, I'll open JIRAs for those.

On Mon, Aug 28, 2017 at 6:12 AM, Brahma Reddy Battula <
brahmareddy.batt...@huawei.com> wrote:

> Nice feature, great work guys. Looking forward to getting this in, as YARN
> federation is already in.
>
> At first glance I have few questions
>
> i) Could we have a consolidated patch for better review..?
>
> ii) Hoping  "Federation Metrics" and "Federation UI" will be included.
>
> iii) Do we have RPC benchmarks..?
>
> iv) As of now "dfs.federation.router.rpc.enable" and
> "dfs.federation.router.store.enable" are set to "true"; do we need to keep
> these configs..? Since without them the router might not be useful..?
>
> iv) bq. The rest of the options are documented in [hdfs-default.xml]
>  I feel it is better to document all the configurations. I see there are so
> many; how about documenting them in a tabular format..?
>
> v) Downstream projects (Spark, HBase, Hive..) integration testing..? It looks
> like you mentioned some; is that enough..?
>
> v) mvn install (and package) is failing with following error
>
> [INFO]   Adding ignore: *
> [WARNING] Rule 1: org.apache.maven.plugins.enforcer.BanDuplicateClasses
> failed with message:
> Duplicate classes found:
>
>   Found in:
> org.apache.hadoop:hadoop-client-minicluster:jar:3.0.0-
> beta1-SNAPSHOT:compile
> org.apache.hadoop:hadoop-client-runtime:jar:3.0.0-
> beta1-SNAPSHOT:compile
>   Duplicate classes:
> org/apache/hadoop/shaded/org/apache/curator/framework/api/
> DeleteBuilder.class
> org/apache/hadoop/shaded/org/apache/curator/framework/
> CuratorFramework.class
>
>
> I added "hadoop-client-minicluster" to ignore list to get success
>
> hadoop\hadoop-client-modules\hadoop-client-integration-tests\pom.xml
>
>   <dependencies>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-annotations</artifactId>
>       <ignoreClasses>
>         <ignoreClass>*</ignoreClass>
>       </ignoreClasses>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-client-minicluster</artifactId>
>       <ignoreClasses>
>         <ignoreClass>*</ignoreClass>
>       </ignoreClasses>
>     </dependency>
>   </dependencies>
>
>
> Please correct me If I am wrong.
>
>
> --Brahma Reddy Battula
>
> -Original Message-
> From: Chris Douglas [mailto:cdoug...@apache.org]
> Sent: 25 August 2017 06:37
> To: Andrew Wang
> Cc: Iñigo Goiri; hdfs-dev@hadoop.apache.org; su...@apache.org
> Subject: Re: [DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk
>
> On Thu, Aug 24, 2017 at 2:25 PM, Andrew Wang 
> wrote:
> > Do you mind holding this until 3.1? Same reasoning as for the other
> > branch merge proposals, we're simply too late in the 3.0.0 release cycle.
>
> That wouldn't be too dire.
>
> That said, this has the same design and impact as YARN federation.
> Specifically, it sits almost entirely outside core HDFS, so it will not
> affect clusters running without R-BF.
>
> Merging would allow the two router implementations to converge on a common
> backend, which has started with HADOOP-14741 [1]. If the HDFS side only
> exists in 3.1, then that work would complicate maintenance of YARN in
> 3.0.x, which may require bug fixes as it stabilizes.
>
> Merging lowers costs for maintenance with a nominal risk to stability.
> The feature is well tested, deployed, and actively developed. The
> modifications to core HDFS [2] (~23k) are trivial.
>
> So I'd still advocate for this particular merge on those merits. -C
>
> [1] https://issues.apache.org/jira/browse/HADOOP-14741
> [2] git diff --diff-filter=M $(git merge-base apache/HDFS-10467
> apache/trunk)..apache/HDFS-10467
>
> > On Thu, Aug 24, 2017 at 1:39 PM, Chris Douglas 
> wrote:
> >>
> >> I'd definitely support merging this to trunk. The implementation is
> >> almost entirely outside of HDFS and, as Inigo detailed, has been
> >> tested at scale. The branch is in a functional state with
> 

[DISCUSS] Merge HDFS-10467 to (Router-based federation) trunk

2017-08-21 Thread Iñigo Goiri
Hi all,



We would like to open a discussion on merging the Router-based Federation
feature to trunk.

Last week, there was a thread about which branches would go into 3.0, and
given that YARN federation is going in, this might be a good time for this to
be merged too.


We have been running "Router-based federation" in production for a year.

Meanwhile, we have been releasing it in a feature branch (HDFS-10467 [1])
for a while.

We are reasonably confident that the branch is close to meeting the criteria
to be merged onto trunk.


*Feature*:

This feature aggregates multiple namespaces into a single one transparently
to the user.

It has a similar architecture to YARN federation (YARN-2915).

It consists of Routers that handle requests from the clients, forward them to
the right subcluster, and expose the same API as the Namenode.

Currently we use a mount table (similar to ViewFs), but it can be replaced by
other approaches.

The Routers share their state in a State Store.



The main advantage is that clients interact with the Routers as if they were a
Namenode, so no changes are required in the client other than pointing to the
right address.

In addition, all the management is moved to the server side so changes to
the Mount Table can be done without having to sync the clients (pull/push).
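As a sketch of what this looks like on the client side (the hostname and port below are placeholders; the Router's actual RPC address comes from its configuration):

```shell
# The client talks to the Router exactly as it would talk to a Namenode;
# only fs.defaultFS changes, and no mount-table sync is needed on the client.
hdfs dfs -D fs.defaultFS=hdfs://router-host:8888 -ls /data/app1
```

This is a usage sketch against a running Router, not something that runs standalone.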



*Status*:

The branch already contains all the features required to work end-to-end.

There are a couple of open JIRAs that would be required for the merge (e.g.,
the Web UI), but they should be finished soon.

We have been running it in production for the last year and we have a paper
with some of the details of our production deployment [2].

We have 4 production deployments with the largest one spanning more than
20k servers across 6 subclusters.

In addition, the guys at LinkedIn have started testing Router-based
federation and will be adding security to the branch.



The modifications to the rest of HDFS are minimal:

   - Changed visibility for some methods (e.g., MiniDFSCluster)
   - Added some utilities to extract addresses
   - Modified hdfs and hdfs.cmd to start the Router and manage the
   federation
   - Modified hdfs-default.xml
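As an illustration of the script changes, starting a deployment might look like the following (the subcommand names are taken from the feature branch and could change before the merge):

```shell
# Start a Router daemon on a gateway host; the Namenodes are unchanged
hdfs dfsrouter

# Federation management goes through the same modified script
hdfs dfsrouteradmin -ls /
```

These commands assume a configured cluster, so they are only a sketch.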

Everything else is self-contained in a federation package.

In addition, all the functionality is in the Router so it’s disabled by
default.

Even when enabled, there is no impact on regular HDFS; it would only require
configuring the trust between the Namenode and the Router once security is
enabled.



I have been continuously rebasing the feature branch (updated up to 1 week
ago) so the merge should be a straightforward cherry-pick.



*Problems*:

The problems I’m aware of are the following:

   - We implement ClientProtocol so anytime a new method is added there, we
   would need to add it to the Router. However, it’s straightforward to add
   unimplemented methods.
   - There is some argument about naming the feature as “Router-based
   federation” but I’m open for better names.



*Credits*:

I’d like to thank the people at Microsoft (especially Jason, Ricardo,
Chris, Subru, Jakob, Carlo and Giovanni), Twitter (Ming and Gera), and
LinkedIn (Zhe, Erik and Konstantin) for the discussion and the ideas.

Special thanks to Chris Douglas for the thorough reviews!



Please look through the branch; feedback is welcome. Thanks!


Cheers,

Inigo




[1] https://issues.apache.org/jira/browse/HDFS-10467

[2] https://www.usenix.org/conference/atc17/technical-
sessions/presentation/misra