Re: [VOTE] Moving Ozone to a separated Apache project

2020-09-25 Thread anu engineer
+1.
--Anu


On Thu, Sep 24, 2020 at 10:59 PM Elek, Marton  wrote:

> Hi all,
>
> Thank you for all the feedback and requests,
>
> As we discussed in the previous thread(s) [1], Ozone is proposed to become a
> separate Apache Top-Level Project (TLP).
>
> The proposal with all the details, motivation and history is here:
>
>
> https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Hadoop+subproject+to+Apache+TLP+proposal
>
> This vote runs for 7 days and will conclude on the 2nd of October at 6 AM
> GMT.
>
> Thanks,
> Marton Elek
>
> [1]:
>
> https://lists.apache.org/thread.html/rc6c79463330b3e993e24a564c6817aca1d290f186a1206c43ff0436a%40%3Chdfs-dev.hadoop.apache.org%3E
>


Re: [DISCUSS] making Ozone a separate Apache project

2020-05-13 Thread anu engineer
+1
—Anu

> On May 13, 2020, at 12:53 AM, Elek, Marton  wrote:
> 
> 
> 
> I would like to start a discussion to make a separate Apache project for Ozone
> 
> 
> 
> ### HISTORY [1]
> 
> * Apache Hadoop Ozone development started on a feature branch of the Hadoop 
> repository (HDFS-7240)
> 
> * In October 2017 a discussion was started about merging it to the 
> Hadoop main branch
> 
> * After a long discussion it was merged to Hadoop trunk in March 2018
> 
> * During the discussion of the merge, it was suggested multiple times to 
> create a separate project for Ozone. But at that time:
>1) Ozone was tightly integrated with Hadoop/HDFS
>2) There was an active plan to use the block layer of Ozone (HDDS, or HDSL at 
> that time) as the block layer of HDFS
>3) The community of Ozone was a subset of the HDFS community
> 
> * The first beta release of Ozone has just been published. This seems to be a 
> good time, before the first GA, to make a decision about the future.
> 
> 
> 
> ### WHAT HAS BEEN CHANGED
> 
> During the last few years Ozone has become more and more independent, on both 
> the community and the code side. The separation has been suggested again and 
> again (for example by Owen [2] and Vinod [3]).
> 
> 
> 
> From a COMMUNITY point of view:
> 
> 
>  * Fortunately, more and more new contributors are helping Ozone. Originally 
> the Ozone community was a subset of the HDFS project, but now a bigger and 
> bigger part of the community is related to Ozone only.
> 
>  * It seems to be easier to _build_ the community as a separate project.
> 
>  * A new, younger project might have different practices (communication, 
> committer criteria, development style) compared to an old, mature project.
> 
>  * It's easier to communicate (and improve) these standards in a separate 
> project with clean boundaries.
> 
>  * A separate project/brand can help to increase the adoption rate and 
> attract more individual contributors (AFAIK this has been seen in Submarine 
> after a similar move).
> 
> * The contribution process can be communicated more easily, and we can make 
> first-time contributions easier.
> 
> 
> 
> From a CODE point of view, Ozone has become more and more independent:
> 
> 
> * Ozone has a different release cycle.
> 
> * The code is already separated from the Hadoop code base 
> (apache/hadoop-ozone.git).
> 
> * It has separate CI (GitHub Actions).
> 
> * Ozone uses a different (more strict) coding style (zero tolerance for unit 
> test / checkstyle errors).
> 
> * The code itself has become more and more independent from Hadoop at the 
> Maven level. Originally it was compiled together with the in-tree latest 
> Hadoop snapshot. Now it depends on released Hadoop artifacts (RPC, 
> Configuration...).
> 
> * It has started to use multiple versions of Hadoop (on the client side).
> 
> * The volume of resolved issues is already very high on the Ozone side (Ozone 
> had slightly more resolved issues than HDFS/YARN/MAPREDUCE/COMMON all 
> together in the last 2-3 months).
> 
> 
> Summary: Before the first Ozone GA release, it seems to be a good time to 
> discuss the long-term future of Ozone. Managing it as a separate TLP 
> seems to have more benefits.
> 
> 
> Please let me know what your opinion is...
> 
> Thanks a lot,
> Marton
> 
> 
> 
> 
> 
> [1]: For more details, see: 
> https://github.com/apache/hadoop-ozone/blob/master/HISTORY.md
> 
> [2]: 
> https://lists.apache.org/thread.html/0d0253f6e5fa4f609bd9b917df8e1e4d8848e2b7fdb3099b730095e6%40%3Cprivate.hadoop.apache.org%3E
> 
> [3]: 
> https://lists.apache.org/thread.html/8be74421ea495a62e159f2b15d74627c63ea1f67a2464fa02c85d4aa%40%3Chdfs-dev.hadoop.apache.org%3E
> 




Re: [DISCUSS] Feature branch for HDFS-14978 In-place Erasure Coding Conversion

2020-01-23 Thread Anu Engineer
+1



> On Jan 23, 2020, at 2:51 PM, Jitendra Pandey  
> wrote:
> 
> +1 for the feature branch.
> 
>> On Thu, Jan 23, 2020 at 1:34 PM Wei-Chiu Chuang
>>  wrote:
>> 
>> Hi, we are working on a feature to improve Erasure Coding, and I would like
>> to seek your opinion on creating a feature branch for it (HDFS-14978).
>> 
>> Reasons for a feature branch:
>> (1) It turns out we need to update the NameNode layout version.
>> (2) It's a medium-size project and we want to get this feature merged in
>> its entirety.
>> 
>> Aravindan Vijayan and I are planning to work on this feature.
>> 
>> Thoughts?
>> 




Re: [DISCUSS] Ozone 0.4.2 release

2019-12-06 Thread Anu Engineer
+1

— Anu

> On Dec 6, 2019, at 5:26 PM, Dinesh Chitlangia  wrote:
> 
> All,
> Since the Apache Hadoop Ozone 0.4.1 release, we have had significant
> bug fixes for performance & stability.
> 
> With that in mind, a 0.4.2 release would be good to consolidate all those fixes.
> 
> Please share your thoughts.
> 
> 
> Thanks,
> Dinesh Chitlangia




Re: [DISCUSS] Remove Ozone and Submarine from Hadoop repo

2019-10-28 Thread Anu Engineer
@Vinod Kumar Vavilapalli 
  Do we need a separate vote thread for this? There are already JIRAs in
place for the Ozone code removal, and I gather it is the same for Submarine.
Would it be possible to treat this thread as consensus and act upon the
JIRAs themselves?

Thanks
Anu


On Sun, Oct 27, 2019 at 6:58 PM 俊平堵  wrote:

> +1.
>
> Thanks,
>
> Junping
>
> Akira Ajisaka wrote on Thu, Oct 24, 2019 at 3:21 PM:
>
> > Hi folks,
> >
> > Both Ozone and Apache Submarine have separate repositories.
> > Can we remove these modules from hadoop-trunk?
> >
> > Regards,
> > Akira
> >
>


Re: [DISCUSS] Remove Ozone and Submarine from Hadoop repo

2019-10-24 Thread Anu Engineer
+1 for Ozone. We are in our own repo now. It would be good to remove this
code from Hadoop, otherwise it will confuse new contributors.
I would like to add a git tag to Hadoop, so that people have the ability
to sync back and see the code evolution.

--Anu

On Thu, Oct 24, 2019 at 4:03 PM Giovanni Matteo Fumarola <
giovanni.fumar...@gmail.com> wrote:

> +1
>
> Thanks Wei-Chiu for creating HADOOP-16670.
>
> On Thu, Oct 24, 2019 at 12:56 PM Wei-Chiu Chuang 
> wrote:
>
> > +1 filed HADOOP-16670 <
> https://issues.apache.org/jira/browse/HADOOP-16670>
> > for
> > stripping the Submarine code.
> >
> > On Thu, Oct 24, 2019 at 12:14 PM Subru Krishnan 
> wrote:
> >
> > > +1.
> > >
> > > Thanks,
> > > Subru
> > >
> > > On Thu, Oct 24, 2019 at 12:51 AM 张铎(Duo Zhang) 
> > > wrote:
> > >
> > > > +1
> > > >
> > > > Akira Ajisaka wrote on Thu, Oct 24, 2019 at 3:21 PM:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > Both Ozone and Apache Submarine have separate repositories.
> > > > > Can we remove these modules from hadoop-trunk?
> > > > >
> > > > > Regards,
> > > > > Akira
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Release Apache Hadoop Ozone 0.4.1-alpha

2019-10-12 Thread Anu Engineer
+1, Binding.

Verified the KEYS
Built from sources and ran tests:
   - General Ozone command line tests
   - Applications like MR and YARN.

--Anu


On Sat, Oct 12, 2019 at 10:25 AM Xiaoyu Yao 
wrote:

> +1 binding.
> * Verified the signature.
> * Built from source.
> * Deployed docker compose in secure mode and verified ACLs, sample MR jobs.
>
> Thanks,
> Xiaoyu
>
> On Fri, Oct 11, 2019 at 5:37 PM Hanisha Koneru
> 
> wrote:
>
> > Thank you Nanda for putting up the RC.
> >
> > +1 binding.
> >
> > Verified the following:
> >   - Built from source
> >   - Deployed to 5 node cluster and ran smoke tests.
> >   - Ran sanity checks
> >
> > Thanks
> > Hanisha
> >
> > > On Oct 4, 2019, at 10:42 AM, Nanda kumar  wrote:
> > >
> > > Hi Folks,
> > >
> > > I have put together RC0 for Apache Hadoop Ozone 0.4.1-alpha.
> > >
> > > The artifacts are at:
> > > https://home.apache.org/~nanda/ozone/release/0.4.1/RC0/
> > >
> > > The maven artifacts are staged at:
> > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1238/
> > >
> > > The RC tag in git is at:
> > > https://github.com/apache/hadoop/tree/ozone-0.4.1-alpha-RC0
> > >
> > > And the public key used for signing the artifacts can be found at:
> > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > >
> > > This release contains 363 fixes/improvements [1].
> > > Thanks to everyone who put in the effort to make this happen.
> > >
> > > *The vote will run for 7 days, ending on October 11th at 11:59 pm IST.*
> > > Note: This release is alpha quality; it’s not recommended for use in
> > > production, but we believe that it’s stable enough to try out the
> > > feature set and collect feedback.
> > >
> > >
> > > [1] https://s.apache.org/yfudc
> > >
> > > Thanks,
> > > Team Ozone
> >
> >
>


Re: [DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk source tree

2019-09-17 Thread Anu Engineer
+1
—Anu

> On Sep 17, 2019, at 2:49 AM, Elek, Marton  wrote:
> 
> 
> 
> TLDR: I propose to move the Ozone-related code out of Hadoop trunk and store 
> it in a separate *Hadoop* git repository, apache/hadoop-ozone.git.
> 
> 
> 
> 
> When Ozone was adopted as a new Hadoop subproject it was proposed [1] to be 
> part of the source tree but with a separate release cadence, mainly because it 
> had hadoop-trunk/SNAPSHOT as a compile-time dependency.
> 
> During the last Ozone releases this dependency was removed to provide more 
> stable releases. Instead of using the latest trunk/SNAPSHOT build from 
> Hadoop, Ozone uses the latest stable Hadoop (3.2.0 as of now).
> 
> As we no longer have a strict dependency between Hadoop trunk SNAPSHOT and 
> Ozone trunk, I propose to separate the two code bases from each other by 
> creating a new Hadoop git repository (apache/hadoop-ozone.git):
> 
> With Ozone moved to a separate git repository:
> 
> * It would be easier to contribute and understand the build (as of now we 
> always need `-f pom.ozone.xml` as a Maven parameter)
> * It would be possible to adjust the build process without breaking 
> Hadoop/Ozone builds.
> * It would be possible to use different Readme/.asf.yaml/github templates for 
> Hadoop Ozone and core Hadoop. (For example, the current github template 
> [2] has a link to the contribution guideline [3]. Ozone has an extended 
> version [4] of this guideline with additional information.)
> * Testing would be safer, as it won't be possible to change core Hadoop 
> and Hadoop Ozone in the same patch.
> * It would be easier to cut branches for Hadoop releases (based on the 
> original consensus, Ozone should be removed from all the release branches 
> after creating release branches from trunk)
> 
> 
> What do you think?
> 
> Thanks,
> Marton
> 
> [1]: 
> https://lists.apache.org/thread.html/c85e5263dcc0ca1d13cbbe3bcfb53236784a39111b8c353f60582eb4@%3Chdfs-dev.hadoop.apache.org%3E
> [2]: 
> https://github.com/apache/hadoop/blob/trunk/.github/pull_request_template.md
> [3]: https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
> [4]: 
> https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute+to+Ozone
> 




Re: [DISCUSS] ARM/aarch64 support for Hadoop

2019-09-05 Thread Anu Engineer
> >> > for the jobs but I did a little search, according to:
> >> >
> >> >
> >>
> https://packages.ubuntu.com/search?keywords=protobuf-compiler&searchon=names
> >> > &
> >> >
> >> >
> >>
> https://packages.ubuntu.com/search?suite=default&section=all&arch=any&keywords=libprotoc-dev&searchon=names
> >> > both said that the version of libprotoc-dev and protobuf-compiler
> >> > available for Ubuntu 18.04 is 3.0.0
> >> >
> >> >
> >> > On Wed, Sep 4, 2019 at 4:39 PM Ayush Saxena 
> wrote:
> >> >
> >> >> Thanx Vinay for the initiative, Makes sense to add support for
> >> different
> >> >> architectures.
> >> >>
> >> >> +1, for the branch idea.
> >> >> Good Luck!!!
> >> >>
> >> >> -Ayush
> >> >>
> >> >> > On 03-Sep-2019, at 6:19 AM, 张铎(Duo Zhang) 
> >> >> wrote:
> >> >> >
> >> >> > For HBase, we purged all the protobuf related things from the
> public
> >> >> API,
> >> >> > and then upgraded to a shaded and relocated version of protobuf. We
> >> have
> >> >> > created a repo for this:
> >> >> >
> >> >> > https://github.com/apache/hbase-thirdparty
> >> >> >
> >> >> > But since the hadoop dependencies still pull in the protobuf 2.5
> >> jars,
> >> >> our
> >> >> > coprocessors are still on protobuf 2.5. Recently we have opened a
> >> >> > discussion on how to deal with upgrading coprocessors. Glad to see that
> >> the
> >> >> > hadoop community is also willing to solve the problem.
> >> >> >
> >> >> > Anu Engineer wrote on Tue, Sep 3, 2019 at 1:23 AM:
> >> >> >
> >> >> >> +1, for the branch idea. Just FYI, Your biggest problem is proving
> >> that
> >> >> >> Hadoop and the downstream projects work correctly after you
> upgrade
> >> >> core
> >> >> >> components like Protobuf.
> >> >> >> So while branching and working on a branch is easy, merging back
> >> after
> >> >> you
> >> >> >> upgrade some of these core components is insanely hard. You might
> >> want
> >> >> to
> >> >> >> make sure that community buys into upgrading these components in
> the
> >> >> trunk.
> >> >> >> That way we will get testing and downstream components will notice
> >> when
> >> >> >> things break.
> >> >> >>
> >> >> >> That said, I have lobbied for the upgrade of Protobuf for a really
> >> long
> >> >> >> time; I have argued that 2.5 is out of support and we cannot stay
> on
> >> >> that
> >> >> >> branch forever; or we need to take ownership of the Protobuf 2.5
> >> code
> >> >> base.
> >> >> >> It has been rightly pointed to me that while all the arguments I
> >> make
> >> >> is
> >> >> >> correct; it is a very complicated task to upgrade Protobuf, and
> the
> >> >> worst
> >> >> >> part is we will not even know what breaks until downstream
> projects
> >> >> pick up
> >> >> >> these changes and work against us.
> >> >> >>
> >> >> >> If we work off the Hadoop version 3 — and assume that we have
> >> >> "shading" in
> >> >> >> place for all deployments; it might be possible to get there;
> still
> >> a
> >> >> >> daunting task.
> >> >> >>
> >> >> >> So best of luck with the branch approach — But please remember,
> >> Merging
> >> >> >> back will be hard, Just my 2 cents.
> >> >> >>
> >> >> >> — Anu
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On Sun, Sep 1, 2019 at 7:40 PM Zhenyu Zheng <
> >> zhengzhenyul...@gmail.com
> >> >> >
> >> >> >> wrote:
> >> >> >>
> >> >> >>> Hi,
> >> >> >>>
> >> >> >>> 

Re: [DISCUSS] ARM/aarch64 support for Hadoop

2019-09-02 Thread Anu Engineer
+1 for the branch idea. Just FYI, your biggest problem is proving that
Hadoop and the downstream projects work correctly after you upgrade core
components like Protobuf.
So while branching and working on a branch is easy, merging back after you
upgrade some of these core components is insanely hard. You might want to
make sure that the community buys into upgrading these components in trunk.
That way we will get testing, and downstream components will notice when
things break.

That said, I have lobbied for the upgrade of Protobuf for a really long
time; I have argued that 2.5 is out of support and we cannot stay on that
version forever, or else we need to take ownership of the Protobuf 2.5 code base.
It has been rightly pointed out to me that while all the arguments I make are
correct, it is a very complicated task to upgrade Protobuf, and the worst
part is we will not even know what breaks until downstream projects pick up
these changes and work against us.

If we work off Hadoop version 3, and assume that we have "shading" in
place for all deployments, it might be possible to get there; still a
daunting task.
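
(To make the shading point concrete, here is a minimal illustrative Java
sketch. The relocated package name below follows the pattern the later
hadoop-thirdparty work adopted; treat the exact coordinates as an assumption
for illustration, not an API promised anywhere in this thread.)

    // Downstream code keeps resolving stock protobuf from its own classpath:
    import com.google.protobuf.ByteString;
    // Shaded Hadoop internals reference a relocated copy instead
    // (assumed package name, following the hadoop-thirdparty pattern):
    import org.apache.hadoop.thirdparty.protobuf.Message;

    public class ShadingSketch {
        // The two copies live in different packages, so upgrading the
        // relocated one never changes what downstream resolves for
        // com.google.protobuf.*; that isolation is what makes the
        // protobuf upgrade tractable.
        public static void main(String[] args) {
            System.out.println(ByteString.EMPTY.size());   // stock copy
            System.out.println(Message.class.getName());   // relocated copy
        }
    }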

So best of luck with the branch approach. But please remember, merging
back will be hard. Just my 2 cents.

— Anu




On Sun, Sep 1, 2019 at 7:40 PM Zhenyu Zheng 
wrote:

> Hi,
>
> Thanks Vinay for bringing this up, and thanks Sheng for the idea. A separate
> branch with its own ARM CI seems like a really good idea.
> By doing this we won't break any of the ongoing development in trunk, and
> a CI can be a very good way to show what the current problems are and what
> has been fixed; it will also provide a very good view for contributors who
> are interested in working on this. We can finally merge the branch back to
> trunk once the community thinks it is good enough and stable enough. We can
> donate ARM machines to the existing CI system for the job.
>
> I wonder if this approach is possible?
>
> BR,
>
> On Thu, Aug 29, 2019 at 11:29 AM Sheng Liu  wrote:
>
> > Hi,
> >
> > Thanks Vinay for bringing this up. I am a member of the "Openlab" community
> > mentioned by Vinay. I have been working on building and
> > testing Hadoop components on aarch64 servers these days; besides the
> > missing ARM-platform dependencies (issues #1 #2 #3)
> > mentioned by Vinay, other similar issues have also been found, such as the
> > "PhantomJS" dependent package also missing for aarch64.
> >
> > To promote ARM support for Hadoop, we have discussed and hope to add
> > an ARM-specific CI to the Hadoop repo. We are not
> > sure whether there is any potential effect or conflict on the trunk
> > branch, so maybe creating an ARM-specific branch for doing this work
> > is a better choice. What do you think?
> >
> > Hope to hear thoughts from you :)
> >
> > BR,
> > Liu sheng
> >
> > Vinayakumar B wrote on Tue, Aug 27, 2019 at 5:34 AM:
> >
> > > Hi Folks,
> > >
> > > ARM has lately become well known for its processing capability and has
> > > the potential to run Bigdata workloads.
> > > Many users have been moving to ARM machines due to their low cost.
> > >
> > > In the past there were attempts to compile Hadoop on ARM (Raspberry Pi)
> > > for experimental purposes. Today the ARM architecture is taking on some
> > > of the server-side processing as well. So there will be/is a real need
> > > for Hadoop to support the ARM architecture as well.
> > >
> > > There are a bunch of users who are trying out building Hadoop on ARM,
> > > trying to add an ARM CI to Hadoop, and facing issues [1].
> > >
> > > As of today, Hadoop does not compile on ARM due to the below issues,
> > > found from testing done in openlab in [2].
> > >
> > > 1. Protobuf:
> > > ---
> > >  The Hadoop project (and some downstream projects) is stuck on protobuf
> > > 2.5.0 for backward-compatibility reasons. Protobuf 2.5.0 is no longer
> > > maintained by the community, while protobuf 3.x is being widely adopted;
> > > protobuf 3.x still provides wire compatibility for proto2 messages.
> > > Because of some compilation issues in the generated Java code, which can
> > > induce problems downstream, the protobuf upgrade from 2.5.0 was
> > > not taken up.
> > > From 3.0.0 onwards, Hadoop supports shading of libraries to avoid
> > > classpath problems in downstream projects.
> > >  There are patches available to fix compilation in Hadoop. But we need
> > > to find a way to upgrade protobuf to the latest version and still
> > > maintain the downstream classpath using the shading feature of the
> > > Hadoop build.
> > >
> > >  There is a Jira for the protobuf upgrade [3], created even before shade
> > > support was added to Hadoop. We now need to revisit the Jira and continue
> > > exploring possibilities.
> > >
> > > 2. leveldbjni:
> > > ---
> > > The current leveldbjni used in YARN does not support the ARM
> > > architecture; we need to check whether any of the future versions
> > > support ARM, and whether Hadoop can upgrade to that version.
> > 

Re: [DISCUSS] A unified and open Hadoop community sync up schedule?

2019-06-11 Thread Anu Engineer
For Ozone, we have started using the Wiki itself as the agenda, and after
the meeting is over, we convert it into the meeting notes.
Here is an example; the project owner can edit and maintain it, it is about
10 minutes of work, and it allows anyone to add items to the agenda too.

https://cwiki.apache.org/confluence/display/HADOOP/2019-06-10+Meeting+notes

--Anu

On Tue, Jun 11, 2019 at 10:20 AM Yufei Gu  wrote:

> +1 for this idea. Thanks Wangda for bringing this up.
>
> Some comments to share:
>
>- The agenda needs to be posted ahead of the meeting, and any interested
>party should be welcome to contribute topics.
>- We should encourage more people to attend. That's the whole point of the
>meeting.
>- Hopefully, this can mitigate the situation where some patches are
>waiting for review forever, which turns away new contributors.
>- 30 minutes per session sounds a little short; we can try it out and see
>if an extension is needed.
>
> Best,
>
> Yufei
>
> `This is not a contribution`
>
>
> On Fri, Jun 7, 2019 at 4:39 PM Wangda Tan  wrote:
>
> > Hi Hadoop-devs,
> >
> > Previously we had a regular YARN community sync-up (1 hr, biweekly, but not
> > open to the public). Recently, because of changes in our schedules, fewer
> > folks have shown up in the sync-up over the last several months.
> >
> > I saw the K8s community did a pretty good job running their SIG meetings:
> > there are regular meetings for different topics, notes, agendas, etc., such
> > as
> >
> >
> https://docs.google.com/document/d/13mwye7nvrmV11q9_Eg77z-1w3X7Q1GTbslpml4J7F3A/edit
> >
> >
> > For the Hadoop community, there are fewer such regular meetings open to the
> > public, except for the Ozone project and offline meetups or
> > Birds-of-a-Feather sessions at Hadoop/DataWorks Summit. Recently a few
> > folks joined DataWorks Summit at Washington DC and Barcelona, and lots
> > (50+) of folks joined the Ozone/Hadoop/YARN BoFs, asking (good) questions
> > and discussing roadmaps. I think it is important to open such
> > conversations to the public and let more folks/companies join.
> >
> > A small group of community members discussed and wrote a short proposal
> > about the form, time, and topics of the community sync-ups; thanks to
> > everybody who contributed to the proposal! Please feel free to add
> > your thoughts to the proposal Google doc
> > <
> >
> https://docs.google.com/document/d/1GfNpYKhNUERAEH7m3yx6OfleoF3MqoQk3nJ7xqHD9nY/edit#
> > >
> > .
> >
> > Especially for the following parts:
> > - If you are interested in running any of the community sync-ups, please
> > put your name in the table inside the proposal. We need more volunteers to
> > help run the sync-ups in different timezones.
> > - Please add suggestions on the time, frequency, and themes, and feel free
> > to share your thoughts on whether we should do sync-ups for other topics
> > which are not covered by the proposal.
> >
> > Link to the Proposal Google doc
> > <
> >
> https://docs.google.com/document/d/1GfNpYKhNUERAEH7m3yx6OfleoF3MqoQk3nJ7xqHD9nY/edit#
> > >
> >
> > Thanks,
> > Wangda Tan
> >
>


Re: [VOTE] Unprotect HDFS-13891 (HDFS RBF Branch)

2019-05-14 Thread Anu Engineer
Is it possible to unprotect the branches but not trunk? Generally, a
force push to trunk indicates a mistake, and we have had that happen in the past.
This is just a suggestion; even if this request is not met, I am still +1.

Thanks
Anu



On Tue, May 14, 2019 at 4:58 AM Takanobu Asanuma 
wrote:

> +1.
>
> Thanks!
> - Takanobu
>
> 
> From: Akira Ajisaka 
> Sent: Tuesday, May 14, 2019 4:26:30 PM
> To: Giovanni Matteo Fumarola
> Cc: Iñigo Goiri; Brahma Reddy Battula; Hadoop Common; Hdfs-dev
> Subject: Re: [VOTE] Unprotect HDFS-13891 (HDFS RBF Branch)
>
> +1 to unprotect the branch.
>
> Thanks,
> Akira
>
> On Tue, May 14, 2019 at 3:11 PM Giovanni Matteo Fumarola
>  wrote:
> >
> > +1 to unprotect the branches for rebases.
> >
> > On Mon, May 13, 2019 at 11:01 PM Iñigo Goiri  wrote:
> >
> > > Syncing the branch to trunk should be a fairly standard task.
> > > Is there a way to do this without rebasing and forcing the push?
> > > As far as I know this has been the standard for other branches and I
> don't
> > > know of any alternative.
> > > We should clarify the process as having to get PMC consensus to rebase
> a
> > > branch seems a little overkill to me.
> > >
> > > +1 from my side to unprotect the branch to do the rebase.
> > >
> > > On Mon, May 13, 2019, 22:46 Brahma Reddy Battula 
> > > wrote:
> > >
> > > > Hi Folks,
> > > >
> > > > INFRA-18181 made all the Hadoop branches protected.
> > > > Unfortunately, the HDFS-13891 branch needs to be rebased as we
> > > > contribute core patches to trunk. So currently we are stuck with the
> > > > rebase, as force pushes are not allowed; hence INFRA-18361 was raised.
> > > >
> > > > Can we have a quick vote for INFRA sign-off to proceed, as this is
> > > > blocking all branch commits?
> > > >
> > > > --
> > > >
> > > >
> > > >
> > > > --Brahma Reddy Battula
> > > >
> > >
>
>


Re: VOTE: Hadoop Ozone 0.4.0-alpha RC2

2019-05-07 Thread Anu Engineer
+1 (Binding)

-- Built from sources.
-- Ran smoke tests and verified them.

--Anu


On Sun, May 5, 2019 at 8:05 PM Xiaoyu Yao  wrote:

> +1 Binding. Thanks all who contributed to the release.
>
> + Downloaded sources and verified the signature.
> + Built from source and ran docker-based ad-hoc security tests.
> ++ Scaled from 1 datanode to 3 datanodes; verified certificates were
> correctly issued with security enabled.
> ++ Smoke tests for both non-secure and secure mode.
> ++ Put/Get/Delete/Rename key with:
> +++ Kerberos testing
> +++ Delegation token testing with DTUtil CLI and MR jobs.
> +++ S3 token.
>
> I just have one minor question about the expanded source code, which points to
> hadoop-3.3.0-SNAPSHOT-src-with-hdds/hadoop-ozone. But in
> hadoop-ozone/pom.xml, we explicitly declare a dependency on Hadoop 3.2.0.
> I understand we just take the trunk source code (3.3.0-SNAPSHOT up to the
> ozone-0.4 RC) here; should we fix this by giving the git hash of the trunk,
> or clarify it to avoid confusion?
> This might be done by just updating the name of the binaries without resetting
> the release itself.
>
> -Xiaoyu
>
>
> On 5/3/19, 4:07 PM, "Dinesh Chitlangia" 
> wrote:
>
> +1 (non-binding)
>
> - Built from sources and ran smoke test
> - Verified all checksums
> - Toggled audit log and verified audit parser tool
>
> Thanks Ajay for organizing the release.
>
> Cheers,
> Dinesh
>
>
>
> On 5/3/19, 5:42 PM, "Eric Yang"  wrote:
>
> +1
>
> On 4/29/19, 9:05 PM, "Ajay Kumar" 
> wrote:
>
> Hi All,
>
>
>
> We have created the third release candidate (RC2) for Apache
> Hadoop Ozone 0.4.0-alpha.
>
>
>
> This release contains security payload for Ozone. Below are
> some important features in it:
>
>
>
>   *   Hadoop Delegation Tokens and Block Tokens supported for
> Ozone.
>   *   Transparent Data Encryption (TDE) Support - Allows data
> blocks to be encrypted-at-rest.
>   *   Kerberos support for Ozone.
>   *   Certificate Infrastructure for Ozone  - Tokens use PKI
> instead of shared secrets.
>   *   Datanode to Datanode communication secured via mutual
> TLS.
>   *   Ability to secure an Ozone cluster that works with YARN, Hive,
> and Spark.
>   *   Skaffold support to deploy Ozone clusters on K8s.
>   *   Support S3 Authentication Mechanisms like - S3 v4
> Authentication protocol.
>   *   S3 Gateway supports Multipart upload.
>   *   S3A file system is tested and supported.
>   *   Support for Tracing and Profiling for all Ozone
> components.
>   *   Audit Support - including Audit Parser tools.
>   *   Apache Ranger Support in Ozone.
>   *   Extensive failure testing for Ozone.
>
> The RC artifacts are available at
> https://home.apache.org/~ajay/ozone-0.4.0-alpha-rc2/
>
>
>
> The RC tag in git is ozone-0.4.0-alpha-RC2 (git hash
> 4ea602c1ee7b5e1a5560c6cbd096de4b140f776b)
>
>
>
> Please try it out (
> https://cwiki.apache.org/confluence/display/HADOOP/Running+via+Apache+Release ),
> vote, or just give us feedback.
>
>
>
> The vote will run for 5 days, ending on May 4, 2019, 04:00 UTC.
>
>
>
> Thank you very much,
>
> Ajay
>


[jira] [Resolved] (HADOOP-16026) Replace incorrect use of system property user.name

2019-04-22 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HADOOP-16026.
---
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 3.3.0
Target Version/s: 3.3.0

[~dineshchitlangia] Thanks for the contribution. [~jojochuang] Thanks for the 
review. I have committed this patch to trunk.

> Replace incorrect use of system property user.name
> --
>
> Key: HADOOP-16026
> URL: https://issues.apache.org/jira/browse/HADOOP-16026
> Project: Hadoop Common
>  Issue Type: Improvement
> Environment: Kerberized
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
> Fix For: 3.3.0
>
>
> This jira has been created to track the suggested changes for Hadoop Common, 
> as identified in HDFS-14176.
> The following occurrences need to be corrected (a sketch of the typical fix 
> follows the list):
>  Common/FileSystem L2233
>  Common/AbstractFileSystem L451
>  Common/SshFenceByTcpPort L239
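
(For context, the typical shape of this fix, sketched in Java under the
assumption that the replacement is the usual UserGroupInformation call used
in the HDFS-14176 work; the wrapper class here is hypothetical.)

    import java.io.IOException;
    import org.apache.hadoop.security.UserGroupInformation;

    public class UserNameSketch {
        // Problematic pattern: in a Kerberized environment the JVM's
        // user.name property reflects the OS login of the process, not
        // the Kerberos principal the caller is actually acting as.
        static String fromSystemProperty() {
            return System.getProperty("user.name");
        }

        // Typical replacement: ask Hadoop's security layer, which knows
        // about Kerberos logins and proxy users.
        static String fromUgi() throws IOException {
            return UserGroupInformation.getCurrentUser().getShortUserName();
        }
    }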






Re: VOTE: Hadoop Ozone 0.4.0-alpha RC1

2019-04-19 Thread Anu Engineer
+1 (Binding)

-- Verified the checksums.
-- Built from sources.
-- Sniff tested the functionality.

--Anu


On Mon, Apr 15, 2019 at 4:09 PM Ajay Kumar 
wrote:

> Hi all,
>
> We have created the second release candidate (RC1) for Apache Hadoop Ozone
> 0.4.0-alpha.
>
> This release contains security payload for Ozone. Below are some important
> features in it:
>
>   *   Hadoop Delegation Tokens and Block Tokens supported for Ozone.
>   *   Transparent Data Encryption (TDE) Support - Allows data blocks to be
> encrypted-at-rest.
>   *   Kerberos support for Ozone.
>   *   Certificate Infrastructure for Ozone  - Tokens use PKI instead of
> shared secrets.
>   *   Datanode to Datanode communication secured via mutual TLS.
>   *   Ability to secure an Ozone cluster that works with YARN, Hive, and Spark.
>   *   Skaffold support to deploy Ozone clusters on K8s.
>   *   Support S3 Authentication Mechanisms like - S3 v4 Authentication
> protocol.
>   *   S3 Gateway supports Multipart upload.
>   *   S3A file system is tested and supported.
>   *   Support for Tracing and Profiling for all Ozone components.
>   *   Audit Support - including Audit Parser tools.
>   *   Apache Ranger Support in Ozone.
>   *   Extensive failure testing for Ozone.
>
> The RC artifacts are available at
> https://home.apache.org/~ajay/ozone-0.4.0-alpha-rc1
>
> The RC tag in git is ozone-0.4.0-alpha-RC1 (git hash
> d673e16d14bb9377f27c9017e2ffc1bcb03eebfb)
>
> Please try it out (
> https://cwiki.apache.org/confluence/display/HADOOP/Running+via+Apache+Release ),
> vote, or just give us feedback.
>
> The vote will run for 5 days, ending on April 20, 2019, 19:00 UTC.
>
> Thank you very much,
>
> Ajay
>
>
>


Re: [DISCUSS] Use of AssertJ for testing

2019-04-08 Thread Anu Engineer
+1, on AssertJ usage, thanks for getting this done.
--Anu


On 3/31/19, 9:37 PM, "Akira Ajisaka"  wrote:

Hi folks,

Now I'm going to upgrade the JUnit version from 4 to 5 for Java 11 support.
I wanted to start with a small module, so I uploaded a patch to upgrade
the API in the hadoop-yarn-api module first (YARN-8943), and in this JIRA,
Szilard Nemeth suggested using AssertJ with JUnit 5. (Thanks Szilard
for the suggestion!)

I think the JUnit upgrade and the use of AssertJ are separate, but
related, tasks.
Therefore, I'd like to decide:
- whether to use AssertJ or not
- if we are going to use AssertJ, when to use it (before
upgrading JUnit or after?)

My opinion is:
- The JUnit migration is required for Java 11, so we should upgrade JUnit
as soon as possible.
- After the migration, we may use AssertJ for existing tests.
- We may use AssertJ for new tests (not a must).

Any thoughts?

Thanks,
Akira
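
(For anyone who has not used AssertJ: a minimal sketch of the same check
written with a JUnit 4 assertion and with an AssertJ fluent assertion. The
class and values are made up for illustration.)

    import static org.assertj.core.api.Assertions.assertThat;
    import static org.junit.Assert.assertEquals;

    import java.util.Arrays;
    import java.util.List;

    public class AssertStyleSketch {
        public void junit4Style() {
            List<String> hosts = Arrays.asList("nn1", "nn2");
            // JUnit style: the expected/actual argument order is easy
            // to get backwards.
            assertEquals(2, hosts.size());
        }

        public void assertjStyle() {
            List<String> hosts = Arrays.asList("nn1", "nn2");
            // AssertJ style: a fluent chain that reads left to right,
            // with failure messages that include the collection contents.
            assertThat(hosts).hasSize(2).contains("nn1");
        }
    }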






[jira] [Created] (HADOOP-16215) Genconfig does not generate LOG4j configs

2019-03-27 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-16215:
-

 Summary: Genconfig does not generate LOG4j configs
 Key: HADOOP-16215
 URL: https://issues.apache.org/jira/browse/HADOOP-16215
 Project: Hadoop Common
  Issue Type: Task
Affects Versions: 0.3.0
Reporter: Hrishikesh Gadre
Assignee: Hrishikesh Gadre


Genconfig does not generate Log4J configs. These are needed for Ozone configs to 
work correctly.






Re: [DISCUSS] Moving branch-2 to java 8

2019-02-04 Thread Anu Engineer
Konstantin, 

Just a nitpicky thought: if we move this branch to Java 8 on Jenkins, but still
hope to release code that can run on Java 7, how will we detect
Java 8-only changes? I am asking because until now, whenever I checked in Java 8
features in branch-2, Jenkins would catch the issue.

With this approach, we might not find out about the issues until release time,
when the release manager decides to compile with Java 7.
It might be more pragmatic to say that your Java 7 mileage may vary once this
goes in, since we will have no visibility into Java 7 compatibility until it is
too late.

Another approach could be to create a read-only 2.x branch; then we know
that the code will work with Java 7, since the last snapshot was known to work
with Java 7.


Thanks
Anu



On 2/1/19, 5:04 PM, "Konstantin Shvachko"  wrote:

Just to make sure we are on the same page, as the subject of this thread is
too generic and confusing.
*The proposal is to move branch-2 Jenkins builds such as precommit to run
tests on openJDK-8.*
We do not want to break Java 7 source compatibility. The sources and
releases will still depend on Java 7.
We don't see test failures discussed in HADOOP-15711 when we run them
locally with Oracle Java 7.

Thanks,
--Konst

On Fri, Feb 1, 2019 at 12:44 PM Jonathan Hung  wrote:

> Thanks Vinod and Steve, agreed about java7 compile compatibility. At least
> for now, we should be able to maintain java7 source compatibility and run
> tests on java8. There's a test run here:
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86-jhung/46/
> which calls a java8 specific API, installs both openjdk7/openjdk8 in the
> dockerfile, compiles on both versions, and tests on just java8 (via
>
> 
--multijdkdirs=/usr/lib/jvm/java-7-openjdk-amd64,/usr/lib/jvm/java-8-openjdk-amd64
> and --multijdktests=compile). If we eventually decide it's too much of a
> pain to maintain java7 source compatibility we can do that at a later
> point.
>
> Also based on discussion with others in the community at the contributors
> meetup this past Wednesday, seems we are generally in favor of testing
> against java8. I'll start a vote soon.
>
> Jonathan Hung
>
>
> On Tue, Jan 29, 2019 at 4:11 AM Steve Loughran 
> wrote:
>
> > branch-2 is the JDK 7 branch, but for a long time I (and presumably
> > others) have relied on jenkins to keep us honest by doing that build and
> > test
> >
> > right now, we can't do that any more, due to jdk7 bugs which will never
> be
> > fixed by oracle, or at least, not in a public release.
> >
> > If we can still do the compile in the java 7 language and link to the
> > java 7 JDK, then that bit of the release is good; then java 8 can be used
> > for that test
> >
> > Ultimately, we're going to be forced onto java 8 just because all our
> > dependencies have moved onto it, and some CVE will force us to move.
> >
> > At which point, I think its time to declare branch-2 dead. It's had a
> > great life, but trying to keep java 7 support alive isn't sustainable.
> Not
> > just in this testing, but
> > cherrypicking patches back gets more and more difficult -branch-3 has
> > moved on in both use of java 8 language, and in the codebase in general.
> >
> > > On 28 Jan 2019, at 20:18, Vinod Kumar Vavilapalli 
> > wrote:
> > >
> > > The community made a decision long time ago that we'd like to keep the
> > compatibility & so tie branch-2 to Java 7, but do Java 8+ only work on
> 3.x.
> > >
> > > I always assumed that most (all?) downstream users build branch-2 on
> JDK
> > 7 only, can anyone confirm? If so, there may be an easier way to address
> > these test issues.
> > >
> > > +Vinod
> > >
> > >> On Jan 28, 2019, at 11:24 AM, Jonathan Hung 
> > wrote:
> > >>
> > >> Hi folks,
> > >>
> > >> Forking a discussion based on HADOOP-15711. To summarize, there are
> > issues
> > >> with branch-2 tests running on java 7 (openjdk) which don't exist on
> > java
> > >> 8. From our testing, the build can pass with openjdk 8.
> > >>
> > >> For branch-3, the work to move the build to use java 8 was done in
> > >> HADOOP-14816 as part of the Dockerfile OS version change. 
HADOOP-16053
> > was
> > >> filed to backport this OS version change to branch-2 (but without the
> > java
> > >> 7 -> java 8 change). So my proposal is to also make the java 7 ->
> java 8
> > >> version change in branch-2.
> > >>
> > >> As mentioned in HADOOP-15711, the main issue is around source and
> binary
> > >> compatibility. I don't currently have a great answer, but one initial
> > >> thought is to build source/binary against java 7 to ensure
> compatibility
> > >> 

Re: [VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-01 Thread Anu Engineer
+1
--Anu


On 2/1/19, 3:02 PM, "Jonathan Hung"  wrote:

+1. Thanks Wangda.

Jonathan Hung


On Fri, Feb 1, 2019 at 2:25 PM Dinesh Chitlangia <
dchitlan...@hortonworks.com> wrote:

> +1 (non binding), thanks Wangda for organizing this.
>
> Regards,
> Dinesh
>
>
>
> On 2/1/19, 5:24 PM, "Wangda Tan"  wrote:
>
> Hi all,
>
> According to positive feedbacks from the thread [1]
>
> This is a vote thread to start a new subproject named "hadoop-submarine",
> which follows the release process already established for Ozone.
>
> The vote runs for the usual 7 days, ending at 5 PM PDT on Feb 8th.
>
> Thanks,
> Wangda Tan
>
> [1]
>
> 
https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
>
>
>




Re: [DISCUSS] Making submarine to different release model like Ozone

2019-01-31 Thread Anu Engineer
>> I propose to adopt the Ozone model: the same master branch, a different
>> release cycle, and a different release branch. It is a great example of the
>> agile releases we can do (2 Ozone releases since Oct 2018) with less
>> overhead to set up CI, projects, etc.


I second this; in particular, it allows Submarine to be used by Hadoop users
without having to upgrade to new versions. New changes in Submarine can be
used and tested by end-users much faster with this model.

A resounding +1 from me, based on experiences from ozone.

Thanks
Anu




On 1/31/19, 11:52 AM, "Jonathan Hung"  wrote:

+1. This is important for improving the deep learning on hadoop story.
There's recently a lot of momentum for this, and decoupling
submarine/hadoop will help it continue.

Jonathan Hung


On Thu, Jan 31, 2019 at 11:04 AM Wangda Tan  wrote:

> Hi devs,
>
> Since we started the Submarine-related effort last year, we have received a
> lot of feedback; several companies (such as Netease, China Mobile, etc.) are
> trying to deploy Submarine to their Hadoop clusters along with big data
> workloads. LinkedIn also has a big interest in contributing a Submarine TonY (
> https://github.com/linkedin/TonY) runtime to allow users to use the same
> interface.
>
> From what I can see, there are several issues with putting Submarine under
> the yarn-applications directory and having the same release cycle as Hadoop:
>
> 1) We started the 3.2.0 release in Sep 2018, but the release was done in Jan
> 2019. Because of unpredictable blockers and security issues, it got
> delayed a lot. We need to iterate Submarine fast at this point.
>
> 2) We also see a lot of requirements to use Submarine on older Hadoop
> releases such as 2.x. Many companies may not upgrade Hadoop to 3.x in a
> short time, but the requirement to run deep learning is urgent for them. We
> should decouple Submarine from the Hadoop version.
>
> And why do we want to keep it within Hadoop? First, Submarine includes some
> innovative parts, such as enhancements of the user experience for YARN
> services/containerization support, which we can add back to Hadoop later
> to address common requirements. In addition to that, we have a big overlap
> in the community developing and using it.
>
> There are several proposals we went through during the Ozone merge-to-trunk
> discussion:
>
> 
https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
>
> I propose to adopt the Ozone model: the same master branch, a different
> release cycle, and a different release branch. It is a great example of the
> agile releases we can do (2 Ozone releases since Oct 2018) with less
> overhead to set up CI, projects, etc.
>
> *Links:*
> - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> - Design doc
> <
> 
https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit
> >
> - User doc
> <
> 
https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html
> >
> (3.2.0
> release)
> - Blogposts, {Submarine} : Running deep learning workloads on Apache 
Hadoop
> <
> 
https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/
> >,
> (Chinese Translation: Link )
> - Talks: Strata Data Conf NY
> <
> 
https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289
> >
>
> Thoughts?
>
> Thanks,
> Wangda Tan
>




Re: proposed new repository for hadoop/ozone docker images (+update on docker works)

2019-01-29 Thread Anu Engineer
Marton, please correct me if I am wrong, but I believe that without this branch
it is hard for us to push to the Apache DockerHub; this branch allows for
integration between the Apache account and DockerHub.
Does YARN publish to Docker Hub via the Apache account?


Thanks
Anu


On 1/29/19, 4:54 PM, "Eric Yang"  wrote:

Separating the Hadoop docker-related build into a separate git repository 
has some slippery-slope risk.  It is harder to synchronize changes between two 
separate source trees.  There is a multi-step process to build the jar, tarball, 
and docker images.  This might be problematic to reproduce.

It would be best to arrange the code such that the docker image build process 
can be invoked as part of the Maven build process.  The profile would be 
activated only if docker is installed and running in the environment.  This 
allows producing the jar, tarball, and docker images all at once without 
hindering the existing build procedure.

YARN-7129 is one example of making a subproject in YARN build 
a docker image that can run in YARN.  It automatically detects the presence of 
docker and builds the docker image when docker is available.  If docker is not 
running, the subproject is skipped and the build proceeds to the next 
sub-project.  Please try out the YARN-7129 style of build process, and see 
whether it is a possible solution for the docker image generation issue.  Thanks

Regards,
Eric

On 1/29/19, 3:44 PM, "Arpit Agarwal"  wrote:

I’ve requested a new repo hadoop-docker-ozone.git in gitbox.


> On Jan 22, 2019, at 4:59 AM, Elek, Marton  wrote:
> 
> 
> 
> TLDR:
> 
> I proposed to create a separate git repository for the ozone docker images
> in HDDS-851 (hadoop-docker-ozone.git).
> 
> If there are no objections in the next 3 days I will ask an Apache Member
> to create the repository.
> 
> 
> 
> 
> LONG VERSION:
> 
> In HADOOP-14898 multiple docker containers and helper scripts are
> created for Hadoop.
> 
> The main goal was to:
> 
> 1.) help the development with easy-to-use docker images
> 2.) provide official hadoop images to make it easy to test new 
features
> 
> As of now we have:
> 
> - apache/hadoop-runner image (which contains the required dependency
> but no hadoop)
> - apache/hadoop:2 and apache/hadoop:3 images (to try out latest hadoop
> from 2/3 lines)
> 
> The base image to run hadoop (apache/hadoop-runner) is also heavily 
used
> for Ozone distribution/development.
> 
> The Ozone distribution contains docker-compose based cluster definitions
> to start various types of clusters, and scripts to do smoke testing. (See
> HADOOP-16063 for more details).
> 
> Note: I personally believe that these definitions help a lot to start
> different types of clusters. For example, it could be tricky to try out
> router-based federation as it requires multiple HA clusters. But with a
> simple docker-compose definition [1] it can be started in under 3
> minutes. (HADOOP-16063 is about creating these definitions for various
> hdfs/yarn use cases)
> 
> As of now we have dedicated branches in the hadoop git repository for
> the docker images (docker-hadoop-runner, docker-hadoop-2,
> docker-hadoop-3). It turns out that a separate repository would be more
> effective, as DockerHub can use only full branch names as tags.
> 
> We would like to provide ozone docker images to make the evaluation as
> easy as 'docker run -d apache/hadoop-ozone:0.3.0'; therefore in HDDS-851
> we agreed to create a separate repository for the hadoop-ozone docker
> images.
> 
> If this approach works well we can also move out the existing
> docker-hadoop-2/docker-hadoop-3/docker-hadoop-runner branches from
> hadoop.git to another separate hadoop-docker.git repository.)
> 
> Please let me know if you have any comments,
> 
> Thanks,
> Marton
> 
> 1: see
> https://github.com/flokkr/runtime-compose/tree/master/hdfs/routerfeder
> as an example
> 







[RESULT] [VOTE] - HDDS-4 Branch merge

2019-01-18 Thread Anu Engineer
With twelve +1 votes (9 Binding and 3 Non-Binding) and no -1 or 0, this
vote passes. Thank you all for voting. We will merge HDDS-4 branch to Ozone
soon.

Thanks
Anu


On Fri, Jan 11, 2019 at 7:40 AM Anu Engineer  wrote:

> Since I have not heard any concerns, I will start a VOTE thread now.
> This vote will run for 7 days and will end on Jan/18/2019 @ 8:00 AM PST.
>
> I will start with my vote, +1 (Binding)
>
> Thanks
> Anu
>
>
> -- Forwarded message -
> From: Anu Engineer 
> Date: Mon, Jan 7, 2019 at 5:10 PM
> Subject: [Discuss] - HDDS-4 Branch merge
> To: , 
>
>
> Hi All,
>
> I would like to propose a merge of HDDS-4 branch to the Hadoop trunk.
> HDDS-4 branch implements the security work for HDDS and Ozone.
>
> HDDS-4 branch contains the following features:
> - Hadoop Kerberos and Tokens support
> - A Certificate infrastructure used by Ozone and HDDS.
> - Audit Logging and parsing support (Spread across trunk and HDDS-4)
> - S3 Security Support - AWS Signature Support.
> - Apache Ranger Support for Ozone
>
> I will follow up with a formal vote later this week if I hear no
> objections. AFAIK, the changes are isolated to HDDS/Ozone and should not
> impact any other Hadoop project.
>
> Thanks
> Anu
>
>


Re: [VOTE] Release Apache Hadoop 3.2.0 - RC1

2019-01-14 Thread Anu Engineer
+1, (Binding)

Deployed a pseudo-distributed cluster.
Tried out HDFS commands and verified everything works.
--Anu


On 1/14/19, 11:26 AM, "Virajith Jalaparti"  wrote:

Thanks Sunil and others who have worked on making this release happen!

+1 (non-binding)

- Built from source
- Deployed a pseudo-distributed one node cluster
- Ran basic wordcount, sort, pi jobs
- Basic HDFS/WebHDFS commands
- Ran all the ABFS driver tests against an ADLS Gen 2 account in EAST US

Non-blockers (AFAICT): The following tests in ABFS (HADOOP-15407) fail:
- For ACLs ({{ITestAzureBlobFilesystemAcl}}) -- However, I believe these
have been fixed in trunk.
- 
{{ITestAzureBlobFileSystemE2EScale#testWriteHeavyBytesToFileAcrossThreads}}
fails with an OutOfMemoryError exception. I see the same failure on trunk
as well.


On Mon, Jan 14, 2019 at 6:21 AM Elek, Marton  wrote:

> Thanks Sunil to manage this release.
>
> +1 (non-binding)
>
> 1. built from the source (with clean local maven repo)
> 2. verified signatures + checksum
> 3. deployed 3 node cluster to Google Kubernetes Engine with generated
> k8s resources [1]
> 4. Executed basic HDFS commands
> 5. Executed basic yarn example jobs
>
> Marton
>
> [1]: FTR: resources:
> https://github.com/flokkr/k8s/tree/master/examples/hadoop , generator:
> https://github.com/elek/flekszible
>
>
> On 1/8/19 12:42 PM, Sunil G wrote:
> > Hi folks,
> >
> >
> > Thanks to all of you who helped in this release [1] and for helping to
> vote
> > for RC0. I have created the second release candidate (RC1) for Apache Hadoop
> > 3.2.0.
> >
> >
> > Artifacts for this RC are available here:
> >
> > http://home.apache.org/~sunilg/hadoop-3.2.0-RC1/
> >
> >
> > RC tag in git is release-3.2.0-RC1.
> >
> >
> >
> > The maven artifacts are available via repository.apache.org at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1178/
> >
> >
> > This vote will run 7 days (5 weekdays), ending on 14th Jan at 11:59 pm
> PST.
> >
> >
> >
> > 3.2.0 contains 1092 [2] fixed JIRA issues since 3.1.0. The feature additions
> > below are the highlights of this release.
> >
> > 1. Node Attributes Support in YARN
> >
> > 2. Hadoop Submarine project for running Deep Learning workloads on YARN
> >
> > 3. Support service upgrade via YARN Service API and CLI
> >
> > 4. HDFS Storage Policy Satisfier
> >
> > 5. Support Windows Azure Storage - Blob file system in Hadoop
> >
> > 6. Phase 3 improvements for S3Guard and Phase 5 improvements S3a
> >
> > 7. Improvements in Router-based HDFS federation
> >
> >
> >
> > Thanks to Wangda, Vinod, Marton for helping me in preparing the release.
> >
> > I have done some testing with my pseudo cluster. My +1 to start.
> >
> >
> >
> > Regards,
> >
> > Sunil
> >
> >
> >
> > [1]
> >
> >
> 
https://lists.apache.org/thread.html/68c1745dcb65602aecce6f7e6b7f0af3d974b1bf0048e7823e58b06f@%3Cyarn-dev.hadoop.apache.org%3E
> >
> > [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.2.0)
> > AND fixVersion not in (3.1.0, 3.0.0, 3.0.0-beta1) AND status = Resolved
> > ORDER BY fixVersion ASC
> >
>
>
>





[VOTE] - HDDS-4 Branch merge

2019-01-11 Thread Anu Engineer
Since I have not heard any concerns, I will start a VOTE thread now.
This vote will run for 7 days and will end on Jan/18/2019 @ 8:00 AM PST.

I will start with my vote, +1 (Binding)

Thanks
Anu


-- Forwarded message -
From: Anu Engineer 
Date: Mon, Jan 7, 2019 at 5:10 PM
Subject: [Discuss] - HDDS-4 Branch merge
To: , 


Hi All,

I would like to propose a merge of HDDS-4 branch to the Hadoop trunk.
HDDS-4 branch implements the security work for HDDS and Ozone.

HDDS-4 branch contains the following features:
- Hadoop Kerberos and Tokens support
- A Certificate infrastructure used by Ozone and HDDS.
- Audit Logging and parsing support (Spread across trunk and HDDS-4)
- S3 Security Support - AWS Signature Support.
- Apache Ranger Support for Ozone

I will follow up with a formal vote later this week if I hear no
objections. AFAIK, the changes are isolated to HDDS/Ozone and should not
impact any other Hadoop project.

Thanks
Anu


[Discuss] - HDDS-4 Branch merge

2019-01-07 Thread Anu Engineer
Hi All,

I would like to propose a merge of HDDS-4 branch to the Hadoop trunk.
HDDS-4 branch implements the security work for HDDS and Ozone.

HDDS-4 branch contains the following features:
- Hadoop Kerberos and Tokens support
- A Certificate infrastructure used by Ozone and HDDS.
- Audit Logging and parsing support (Spread across trunk and HDDS-4)
- S3 Security Support - AWS Signature Support.
- Apache Ranger Support for Ozone

I will follow up with a formal vote later this week if I hear no
objections. AFAIK, the changes are isolated to HDDS/Ozone and should not
impact any other Hadoop project.

Thanks
Anu


[jira] [Resolved] (HADOOP-16012) Typo in daemonlog docs

2018-12-15 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HADOOP-16012.
---
Resolution: Fixed

> Typo in daemonlog docs
> --
>
> Key: HADOOP-16012
> URL: https://issues.apache.org/jira/browse/HADOOP-16012
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>    Reporter: Anu Engineer
>Priority: Major
>  Labels: newbie
>
> From the user mailing thread: -- From tzq 
> I might find a spelling error in 
> "http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CommandsManual.html;
> hadoop daemonlog -getlevel   [-protocol (http|https)]
> *hdoop* daemonlog -setlevel[-protocol 
> (http|https)]
> The second hadoop has a typo.






[jira] [Created] (HADOOP-16012) Typo in daemonlog docs

2018-12-15 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-16012:
-

 Summary: Typo in daemonlog docs
 Key: HADOOP-16012
 URL: https://issues.apache.org/jira/browse/HADOOP-16012
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: Anu Engineer


From the user mailing thread: -- From tzq 
I might find a spelling error in 
"http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CommandsManual.html"


hadoop daemonlog -getlevel <host:port> <classname> [-protocol (http|https)]
*hdoop* daemonlog -setlevel <host:port> <classname> <level> [-protocol 
(http|https)]

The second hadoop has a typo.






Re: [DISCUSS] Move to gitbox

2018-12-10 Thread Anu Engineer
+1
--Anu


On 12/10/18, 6:38 PM, "Vinayakumar B"  wrote:

+1

-Vinay

On Mon, 10 Dec 2018, 1:22 pm Elek, Marton wrote:
> Thanks Akira,
>
> +1 (non-binding)
>
> I think it's better to do it now at a planned date.
>
> If I understood well the only bigger task here is to update all the
> jenkins jobs. (I am happy to help/contribute what I can do)
>
>
> Marton
>
> On 12/8/18 6:25 AM, Akira Ajisaka wrote:
> > Hi all,
> >
> > The Apache Hadoop git repository is on the git-wip-us server, which will be
> > decommissioned.
> > If there are no objections, I'll file a JIRA ticket with INFRA to
> > migrate to https://gitbox.apache.org/ and update the documentation.
> >
> > According to ASF infra team, the timeframe is as follows:
> >
> >> - December 9th 2018 -> January 9th 2019: Voluntary (coordinated)
> relocation
> >> - January 9th -> February 6th: Mandated (coordinated) relocation
> >> - February 7th: All remaining repositories are mass migrated.
> >> This timeline may change to accommodate various scenarios.
> >
> > If we get consensus by January 9th, I can file a ticket with INFRA and
> > migrate it.
> > Even if we cannot get consensus, the repository will be migrated by
> > February 7th.
> >
> > Regards,
> > Akira
> >
> > -
> > To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> >
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: [VOTE] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

2018-12-06 Thread Anu Engineer
Hi Daryn,

I have just started reading the patch. Hence my apologies if my question has a 
response somewhere hidden in the patch.

Are you concerned that the FSEditLock is taken in GlobalStateIdContext on the server 
side, and worried that a malicious or stupid client would 
cause this lock to be held up for a long time?

How do retriable exceptions help? Wouldn’t the system eventually hold the lock 
similarly?
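
My current reading of the suggestion is roughly the following (a conceptual
sketch with hypothetical names, not code from the branch):

    import org.apache.hadoop.ipc.RetriableException;

    public class StateIdCheck {
      // Fail fast with a retriable error instead of re-queueing the call, so a
      // client that is "ahead" of this node backs off and retries on its own.
      static void checkClientStateId(long clientStateId, long serverStateId)
          throws RetriableException {
        if (clientStateId > serverStateId) {
          throw new RetriableException("Server is at state id " + serverStateId
              + ", client requires " + clientStateId + "; retry later");
        }
      }
    }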

I am asking to understand this better so that I get a better sense when I am 
reading the code.

Thanks
Anu


On 12/6/18, 10:38 AM, "Daryn Sharp"  wrote:

-1 pending additional info.  After a cursory scan, I have serious concerns
regarding the design.  This seems like a feature that should have been
purely implemented in hdfs w/o touching the common IPC layer.

The biggest issue is in the alignment context.  Its purpose appears to be
allowing handlers to reinsert calls back into the call queue.  That's
completely unacceptable.  A buggy or malicious client can easily cause
livelock in the IPC layer with handlers only looping on calls that never
satisfy the condition.  Why is this not implemented via RetriableExceptions?

On Thu, Dec 6, 2018 at 1:24 AM Yongjun Zhang 
wrote:

> Great work guys.
>
> Wonder if we can elaborate what's the impact of not having #2 fixed, and why 
> #2 is not needed for the feature to be complete?
> 2. Need to fix automatic failover with ZKFC. Currently it doesn't
> know about ObserverNodes, trying to convert them to SBNs.
>
> Thanks.
> --Yongjun
>
>
> On Wed, Dec 5, 2018 at 5:27 PM Konstantin Shvachko 
> wrote:
>
> > Hi Hadoop developers,
> >
> > I would like to propose to merge to trunk the feature branch HDFS-12943
> for
> > Consistent Reads from Standby Node. The feature is intended to scale 
read
> > RPC workloads. On large clusters reads comprise 95% of all RPCs to the
> > NameNode. We should be able to accommodate higher overall RPC workloads
> (up
> > to 4x by some estimates) by adding multiple ObserverNodes.
> >
> > The main functionality has been implemented see sub-tasks of HDFS-12943.
> > We followed up with the test plan. Testing was done on two independent
> > clusters (see HDFS-14058 and HDFS-14059) with security enabled.
> > We ran standard HDFS commands, MR jobs, admin commands including manual
> > failover.
> > We know of one cluster running this feature in production.
> >
> > There are a few outstanding issues:
> > 1. Need to provide proper documentation - a user guide for the new
> feature
> > 2. Need to fix automatic failover with ZKFC. Currently it doesn't
> > know about ObserverNodes, trying to convert them to SBNs.
> > 3. Scale testing and performance fine-tuning
> > 4. As testing progresses, we continue fixing non-critical bugs like
> > HDFS-14116.
> >
> > I attached a unified patch to the umbrella jira for the review and
> Jenkins
> > build.
> > Please vote on this thread. The vote will run for 7 days until Wed Dec
> 12.
> >
> > Thanks,
> > --Konstantin
> >
>


-- 

Daryn




Re: [VOTE] Release Apache Hadoop Ozone 0.3.0-alpha (RC1)

2018-11-19 Thread Anu Engineer
+1. (Binding)

Thanks for getting this release done. Verified the signatures and S3 Gateway.
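
(For anyone repeating the verification: it is roughly a gpg --verify 
<artifact>.asc <artifact> for the signatures, plus a sha256sum/sha512sum -c 
run against the published checksum files; the exact artifact names are 
whatever is in the RC directory below.)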

--Anu


On 11/16/18, 5:15 AM, "Shashikant Banerjee"  wrote:

+1 (non-binding).

  - Verified signatures
  - Verified checksums
  - Checked LICENSE/NOTICE files
  - Built from source
  - Ran smoke tests.

Thanks Marton for putting up the release together.

Thanks
Shashi

On 11/14/18, 10:44 PM, "Elek, Marton"  wrote:

Hi all,

I've created the second release candidate (RC1) for Apache Hadoop Ozone
0.3.0-alpha including one more fix on top of the previous RC0 (HDDS-854)

This is the second release of Apache Hadoop Ozone. Notable changes since
the first release:

* A new S3 compatible rest server is added. Ozone can be used from any
S3 compatible tools (HDDS-434)
* Ozone Hadoop file system URL prefix is renamed from o3:// to o3fs://
(HDDS-651)
* Extensive testing and stability improvements of OzoneFs.
* Spark, YARN and Hive support and stability improvements.
* Improved Pipeline handling and recovery.
* Separated/dedicated classpath definitions for all the Ozone
components. (HDDS-447)

The RC artifacts are available from:
https://home.apache.org/~elek/ozone-0.3.0-alpha-rc1/

The RC tag in git is: ozone-0.3.0-alpha-RC1 (ebbf459e6a6)

Please try it out, vote, or just give us feedback.

The vote will run for 5 days, ending on November 19, 2018 18:00 UTC.


Thank you very much,
Marton


PS:

The easiest way to try it out is:

1. Download the binary artifact
2. Read the docs from ./docs/index.html
3. TLDR; cd compose/ozone && docker-compose up -d
4. open localhost:9874 or localhost:9876



The easiest way to try it out from the source:

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha && docker-compose up -d



The easiest way to test basic functionality (with acceptance tests):

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha/smoketest
3. ./test.sh

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org




Re: [VOTE] Release Apache Hadoop Ozone 0.2.1-alpha (RC0)

2018-09-25 Thread Anu Engineer
Hi Marton,

+1 (binding)

1. Verified the Signature
2. Verified the Checksums - MD5 and Sha*
3. Build from Sources.
4. Ran all RPC and REST commands against the cluster via Robot.
5. Tested the OzoneFS functionality

Thank you very much for creating the first release of Ozone.

--Anu


On 9/19/18, 2:49 PM, "Elek, Marton"  wrote:

Hi all,

After the recent discussion about the first Ozone release I've created 
the first release candidate (RC0) for Apache Hadoop Ozone 0.2.1-alpha.

This release is alpha quality: it’s not recommended for use in production, 
but we believe that it’s stable enough to try out the feature set and 
collect feedback.

The RC artifacts are available from: 
https://home.apache.org/~elek/ozone-0.2.1-alpha-rc0/

The RC tag in git is: ozone-0.2.1-alpha-RC0 (968082ffa5d)

Please try the release and vote; the vote will run for the usual 5 
working days, ending on September 26, 2018 10pm UTC time.

The easiest way to try it out is:

1. Download the binary artifact
2. Read the docs at ./docs/index.html
3. TLDR; cd compose/ozone && docker-compose up -d


Please try it out, vote, or just give us feedback.

Thank you very much,
Marton

ps: Next week, we will have a BoF session at ApacheCon North America in 
Montreal on Monday evening. Please join if you are interested, need 
support to try out the package, or just have any feedback.


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org





Re: [VOTE] Release Apache Hadoop 2.8.5 (RC0)

2018-09-22 Thread Anu Engineer
I believe that you need to regenerate the site using the ‘hugo’ command (hugo is a 
site builder). Then commit and push the generated files.
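
In a hadoop-site checkout that is roughly: run hugo to regenerate the content, 
then git add/commit the result and git push origin asf-site (the site lives on 
the asf-site branch; the wiki page Marton links below has the exact steps).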

Thanks
Anu


On 9/22/18, 9:56 AM, "俊平堵"  wrote:

Marton, thanks for your reply. It works now, but after the git changes I
haven’t seen the Apache Hadoop website get refreshed. It seems to need
some manual steps to refresh the website - if so, can you also update
the wiki?

Thanks,

Junping

Elek, Marton wrote on Thursday, September 20, 2018 at 1:40 PM:

> Please try
>
> git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site
>
> (It seems git tries to check out master instead of the branch).
>
> I updated the wiki, sorry for the inconvenience.
>
> Marton
>
> On 9/18/18 8:05 PM, 俊平堵 wrote:
> > Hey Marton,
> >   The new release web-site actually doesn't work for me.  When I
> > follow your steps in the wiki, I hit an issue during git clone of the
> > (writable) hadoop-site repository, as below:
> >
> > git clone https://gitbox.apache.org/repos/asf/hadoop-site.git
> > Cloning into 'hadoop-site'...
> > remote: Counting objects: 252414, done.
> > remote: Compressing objects: 100% (29625/29625), done.
> > remote: Total 252414 (delta 219617), reused 252211 (delta 219422)
> > Receiving objects: 100% (252414/252414), 98.78 MiB | 3.32 MiB/s, done.
> > Resolving deltas: 100% (219617/219617), done.
> > warning: remote HEAD refers to nonexistent ref, unable to checkout.
> >
> > Can you check above repository is correct for clone?
> > I can clone readable repository (https://github.com/apache/hadoop-site)
> > successfully though but cannot push back changes which is expected.
> >
> > Thanks,
> >
> > Junping
> >
> > Elek, Marton mailto:e...@apache.org>> wrote on Monday, September 17, 2018
> > at 6:15 AM:
> >
> > Hi Junping,
> >
> > Thank you to work on this release.
> >
> > This release is the first release after the hadoop site change, and 
I
> > would like to be sure that everything works fine.
> >
> > Unfortunately I didn't get permission to edit the old wiki, but this is
> > the definition of the site update on the new wiki:
> >
> >
> 
https://cwiki.apache.org/confluence/display/HADOOP/How+to+generate+and+push+ASF+web+site+after+HADOOP-14163
> >
> > Please let me know if something is not working for you...
> >
> > Thanks,
> > Marton
> >
> >
> > On 09/10/2018 02:00 PM, 俊平堵 wrote:
> >  > Hi all,
> >  >
> >  >   I've created the first release candidate (RC0) for Apache
> >  > Hadoop 2.8.5. This is our next point release to follow up 2.8.4.
> It
> >  > includes 33 important fixes and improvements.
> >  >
> >  >
> >  >  The RC artifacts are available at:
> >  > http://home.apache.org/~junping_du/hadoop-2.8.5-RC0
> >  >
> >  >
> >  >  The RC tag in git is: release-2.8.5-RC0
> >  >
> >  >
> >  >
> >  >  The maven artifacts are available via repository.apache.org
> > <
> >  > http://repository.apache.org> at:
> >  >
> >  >
> >
> https://repository.apache.org/content/repositories/orgapachehadoop-1140
> >  >
> >  >
> >  >  Please try the release and vote; the vote will run for the
> > usual 5 working
> >  > days, ending on 9/15/2018 PST time.
> >  >
> >  >
> >  > Thanks,
> >  >
> >  >
> >  > Junping
> >  >
> >
>




Re: Checkstyle shows false positive report

2018-08-15 Thread Anu Engineer
Just reverted, Thanks for root causing this.

Thanks
Anu


On 8/15/18, 9:37 AM, "Allen Wittenauer"  
wrote:


> On Aug 15, 2018, at 4:49 AM, Kitti Nánási  
wrote:
> 
> Hi All,
> 
> We noticed that the checkstyle run by the pre commit job started to show
> false positive reports, so I created HADOOP-15665
> .
> 
> Until that is fixed, keep in mind to run the checkstyle by your IDE
> manually for the patches you upload or review.


I’ve tracked it down to HDDS-119.  I have no idea why that JIRA is 
changing the checkstyle suppressions file, since the ASF license check is its 
own thing and checkstyle wouldn’t be looking at those files anyway.

That said, there is a bug in Yetus in that it should have reported that 
checkstyle failed to run. I’ve filed YETUS-660 for that.
-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org





Re: [DISCUSS] Alpha Release of Ozone

2018-08-08 Thread Anu Engineer
Thanks for reporting this issue. I have filed a JIRA to address this issue.

https://issues.apache.org/jira/browse/HDDS-341

>So, consider this as a report. IMHO, cutting an Ozone release prior to a 
>Hadoop release is ill-advised given the distribution impact and the requirements 
>of the merge vote.  

The Ozone release is being planned to address issues like these; in my mind, if 
we go through a release exercise, we will be able to identify all Ozone- and 
Hadoop-related build and release issues. 
Ozone will tremendously benefit from a release exercise and the community 
review that comes from that.
 
Thanks
Anu


On 8/8/18, 1:19 PM, "Allen Wittenauer"  wrote:



> On Aug 8, 2018, at 12:56 PM, Anu Engineer  
wrote:
> 
>> Has anyone verified that a Hadoop release doesn't have _any_ of the 
extra ozone bits that are sprinkled outside the maven modules?
> As far as I know that is the state; we have had multiple Hadoop releases 
after Ozone was merged. So far no one has reported Ozone bits leaking into 
Hadoop. If we find something like that, it would be a bug.

There hasn't been a release from a branch where Ozone has been merged 
yet. The first one will be 3.2.0.  Running create-release off of trunk 
presently shows bits of Ozone in dev-support, hadoop-dist, and elsewhere in the 
Hadoop source tar ball. 

So, consider this as a report. IMHO, cutting an Ozone release prior to 
a Hadoop release is ill-advised given the distribution impact and the requirements 
of the merge vote.  





Re: [DISCUSS] Alpha Release of Ozone

2018-08-08 Thread Anu Engineer
> Given that there are some Ozone components spread out past the core maven 
> modules, is the plan to release a Hadoop Trunk + Ozone tar ball or is more 
> work going to go into segregating the Ozone components prior to release?
The official release will be a source tarball, but we intend to release Ozone-only 
binaries to make it easy for people to deploy. We are still formulating 
the plans, and you are welcome to leave your comments on HDDS-214.

> Has anyone verified that a Hadoop release doesn't have _any_ of the extra 
> ozone bits that are sprinkled outside the maven modules?
As far as I know that is the state; we have had multiple Hadoop releases after 
Ozone was merged. So far no one has reported Ozone bits leaking into 
Hadoop. If we find something like that, it would be a bug.

Thanks
Anu




On 8/8/18, 12:04 PM, "Allen Wittenauer"  
wrote:


Given that there are some Ozone components spread out past the core maven 
modules, is the plan to release a Hadoop Trunk + Ozone tar ball or is more work 
going to go into segregating the Ozone components prior to release? Has anyone 
verified that a Hadoop release doesn't have _any_ of the extra ozone bits that 
are sprinkled outside the maven modules?
-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: [DISCUSS] Alpha Release of Ozone

2018-08-06 Thread Anu Engineer
+1,  It will allow many users to get a first look at Ozone/HDDS. 

Thanks
Anu


On 8/6/18, 10:34 AM, "Elek, Marton"  wrote:

Hi All,

I would like to discuss creating an Alpha release for Ozone. The core 
functionality of Ozone is complete, but there are two missing features: 
Security and HA. Work on these features is progressing in branches 
HDDS-4 and HDDS-151. Right now, Ozone can handle millions of keys and 
has a Hadoop compatible file system, which allows applications like 
Hive, Spark, and YARN to use Ozone.

Having an Alpha release of Ozone will help in getting some early 
feedback (this release will be marked as an Alpha -- and not production 
ready).

Going through a complete release cycle will help us flesh out the Ozone 
release process, update user documentation, and nail down deployment models.

Please share your thoughts on the Alpha release (over mail or in 
HDDS-214). As voted on by the community earlier, the Ozone release will be 
independent of Hadoop releases.

Thanks a lot,
Marton Elek




-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org





Re: [VOTE] reset/force push to clean up inadvertent merge commit pushed to trunk

2018-07-06 Thread Anu Engineer
@Sunil G +1, Thanks for fixing this issue.

--Anu


On 7/6/18, 11:12 AM, "Sunil G"  wrote:

I just checked.  YARN-7556 and YARN-7451 can be cherry-picked.
I cherry-picked in my local and compiled. Things are good.

I can push this now  which will restore trunk to its original.
I can do this if there are no objection.

- Sunil

On Fri, Jul 6, 2018 at 11:10 AM Arpit Agarwal 
wrote:

> afaict YARN-8435 is still in trunk. YARN-7556 and YARN-7451 are not.
>
>
> From: Giovanni Matteo Fumarola 
> Date: Friday, July 6, 2018 at 10:59 AM
> To: Vinod Kumar Vavilapalli 
> Cc: Anu Engineer , Arpit Agarwal <
> aagar...@hortonworks.com>, "su...@apache.org" , "
> yarn-...@hadoop.apache.org" , "
> hdfs-...@hadoop.apache.org" , "
> common-dev@hadoop.apache.org" , "
> mapreduce-...@hadoop.apache.org" 
> Subject: Re: [VOTE] reset/force push to clean up inadvertent merge commit
> pushed to trunk
>
> Everything seems ok except the 3 commits: YARN-8435, YARN-7556, and YARN-7451,
> which are no longer in trunk due to the revert.
>
> Haibo/Robert if you can recommit your patches I will commit mine
> subsequently to preserve the original order.
>
> (My apology for the mess I did with the merge commit)
>
> On Fri, Jul 6, 2018 at 10:42 AM, Vinod Kumar Vavilapalli <
> vino...@apache.org<mailto:vino...@apache.org>> wrote:
> I will add that the branch also successfully compiles.
>
> Let's just move forward as is, unblock commits and just fix things if
> anything is broken.
>
> +Vinod
>
> > On Jul 6, 2018, at 10:30 AM, Anu Engineer  <mailto:aengin...@hortonworks.com>> wrote:
> >
> > Hi All,
> >
> > [ Thanks to Arpit for working offline and verifying that branch is
> indeed good.]
> >
> > I want to summarize what I know of this issue and also solicit other
> points of view.
> >
> > We reverted the commit(c163d1797) from the branch, as soon as we noticed
> it. That is, we have made no other commits after the merge commit.
> >
> > We used the following command to revert
> > git revert -c c163d1797ade0f47d35b4a44381b8ef1dfec5b60 -m 1
> >
> > Giovanni's branch had three commits + merge, The JIRAs he had were
> YARN-7451, YARN-7556, YARN-8435.
> >
> > The issue seems to be that the revert of the merge has some diffs. I am not a
> YARN developer, so the only problem is to look at the revert and see if
> there were any spurious edits in Giovanni's original commit + merge.
> > If there are none, we don't need a reset/force push.  But if we find an
> issue I am more than willing to go the force commit route.
> >
> > The revert takes the trunk back to the point of the first commit from
> Giovanni which is YARN-8435. His branch was also rewriting the order of
> commits which we have lost due to the revert.
> >
> > Based on what I know so far, I am -1 on the force push.
> >
> > In other words, I am trying to understand why we need the force push. I
> have left a similar comment in JIRA (
> https://issues.apache.org/jira/browse/INFRA-16727) too.
> >
> >
> > Thanks
> > Anu
> >
> >
> > On 7/6/18, 10:24 AM, "Arpit Agarwal"  aagar...@hortonworks.com>> wrote:
> >
> >-1 for the force push. Nothing is broken in trunk. The history looks
> ugly for two commits and we can live with it.
> >
> >The revert restored the branch to Giovanni's intent. i.e. only
> YARN-8435 is applied. Verified there is no delta between hashes 0d9804d 
and
> 39ad989 (HEAD).
> >
> >39ad989 2018-07-05 aengineer@ o {apache/trunk} Revert "Merge branch
> 't...
> >c163d17 2018-07-05 gifuma@apa M─┐ Merge branch 'trunk' of
> https://git-...
> >99febe7 2018-07-05 rkanter@ap │ o YARN-7451. Add missing tests to
> veri...
> >1726247 2018-07-05 haibochen@ │ o YARN-7556. Fair scheduler
> configurat...
> >0d9804d 2018-07-05 gifuma@apa o │ YARN-8435. Fix NPE when the same
> cli...
> >71df8c2 2018-07-05 nanda@apac o─┘ HDDS-212. Introduce
> NodeStateManager...
> >
> >Regards,
> >Arpit
> >
> >
> >On 7/5/18, 2:37 PM, "Subru Krishnan"  su...@apache.org>> wrote:
> >

Re: [VOTE] reset/force push to clean up inadvertent merge commit pushed to trunk

2018-07-06 Thread Anu Engineer
Hi All,

[ Thanks to Arpit for working offline and verifying that branch is indeed good.]

I want to summarize what I know of this issue and also solicit other points of 
view.

We reverted the commit(c163d1797) from the branch, as soon as we noticed it. 
That is, we have made no other commits after the merge commit.

We used the following command to revert 
git revert -c c163d1797ade0f47d35b4a44381b8ef1dfec5b60 -m 1
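
(For reference, the general form for reverting a merge commit is 
git revert -m 1 <merge-commit-hash>; the -m 1 picks the first parent as the 
mainline to keep.)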

Giovanni's branch had three commits + merge, The JIRAs he had were YARN-7451, 
YARN-7556, YARN-8435.

The issue seems to be that the revert of the merge has some diffs. I am not a YARN 
developer, so the only task is to look at the revert and see if there were 
any spurious edits in Giovanni's original commit + merge. 
If there are none, we don't need a reset/force push.  But if we find an issue I 
am more than willing to go the force commit route.

The revert takes the trunk back to the point of the first commit from Giovanni, 
which is YARN-8435. His branch was also rewriting the order of commits, which we 
have lost due to the revert.
 
Based on what I know so far, I am -1 on the force push.

In other words, I am trying to understand why we need the force push. I have 
left a similar comment in JIRA 
(https://issues.apache.org/jira/browse/INFRA-16727) too.


Thanks
Anu


On 7/6/18, 10:24 AM, "Arpit Agarwal"  wrote:

-1 for the force push. Nothing is broken in trunk. The history looks ugly 
for two commits and we can live with it.

The revert restored the branch to Giovanni's intent. i.e. only YARN-8435 is 
applied. Verified there is no delta between hashes 0d9804d and 39ad989 (HEAD).

39ad989 2018-07-05 aengineer@ o {apache/trunk} Revert "Merge branch 't...
c163d17 2018-07-05 gifuma@apa M─┐ Merge branch 'trunk' of https://git-...
99febe7 2018-07-05 rkanter@ap │ o YARN-7451. Add missing tests to veri...
1726247 2018-07-05 haibochen@ │ o YARN-7556. Fair scheduler configurat...
0d9804d 2018-07-05 gifuma@apa o │ YARN-8435. Fix NPE when the same cli...
71df8c2 2018-07-05 nanda@apac o─┘ HDDS-212. Introduce NodeStateManager...

Regards,
Arpit


On 7/5/18, 2:37 PM, "Subru Krishnan"  wrote:

Folks,

There was a merge commit accidentally pushed to trunk; you can find the
details in the mail thread [1].

I have raised an INFRA ticket [2] to reset/force push to clean up trunk.

Can we have a quick vote for INFRA sign-off to proceed as this is 
blocking
all commits?

Thanks,
Subru

[1]

http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201807.mbox/%3CCAHqguubKBqwfUMwhtJuSD7X1Bgfro_P6FV%2BhhFhMMYRaxFsF9Q%40mail.gmail.com%3E
[2] https://issues.apache.org/jira/browse/INFRA-16727



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org




Re: Merge branch commit in trunk by mistake

2018-07-05 Thread Anu Engineer
I ran  “git revert -c c163d1797ade0f47d35b4a44381b8ef1dfec5b60 -m 1”

that will remove all changes from Giovanni’s branch (There are 3 YARN commits). 
I am presuming that he can recommit the dropped changes directly into trunk.

I do not know off a better way than to lose changes from his branch. I am open 
to force pushing if that is needed.

--Anu


On 7/5/18, 2:20 PM, "Wangda Tan"  wrote:

Adding back hdfs/common/mr-dev again to cc list.

Here's the last merge revert commit:

https://github.com/apache/hadoop/commit/39ad98903a5f042573b97a2e5438bc57af7cc7a1


On Thu, Jul 5, 2018 at 2:17 PM Wangda Tan  wrote:

> It looks like the latest revert is not correct; many commits got
> reverted.
>
> Dealing with merge commit revert is different from reverting a normal
> commit: https://www.christianengvall.se/undo-pushed-merge-git/
>
> We have to do a force reset; now it is a complete mess in trunk.
>
>
>
> On Thu, Jul 5, 2018 at 2:10 PM Vinod Kumar Vavilapalli 

> wrote:
>
>> What is broken due to this merge commit?
>>
>> +Vinod
>>
>> > On Jul 5, 2018, at 2:03 PM, Arun Suresh  wrote:
>> >
>> > I agree with Sean, to be honest.. it is disruptive.
>> > Also, we have to kind of lock down the repo till it is completed..
>> >
>> > I recommend we be careful and try not to get into this situation 
again..
>> >
>> > -1 on force pushing..
>> >
>> > Cheers
>> > -Arun
>> >
>> > On Thu, Jul 5, 2018, 1:55 PM Sean Busbey  wrote:
>> >
>> >> If we need a vote, please have a thread with either DISCUSS or
>> >> preferably VOTE in the subject so folks are more likely to see it.
>> >>
>> >> that said, I'm -1 (non-binding). force pushes are extremely
>> >> disruptive. there's no way to know who's updated their local git repo
>> >> to include these changes in the last few hours. if a merge commit is
>> >> so disruptive that we need to subject folks to the inconvenience of a
>> >> force push then we should have more tooling in place to avoid them
>> >> (like client side git hooks for all committers).
>> >>
>> >> On Thu, Jul 5, 2018 at 3:36 PM, Wangda Tan 
>> wrote:
>> >>> +1 for force reset the branch.
>> >>>
>> >>> On Thu, Jul 5, 2018 at 12:14 PM Subru Krishnan 
>> wrote:
>> >>>
>>  Looking at the merge commit, I feel it's better to reset/force push
>>  especially since this is still the latest commit on trunk.
>> 
>>  I have raised an INFRA ticket requesting the same:
>>  https://issues.apache.org/jira/browse/INFRA-16727
>> 
>>  -S
>> 
>>  On Thu, Jul 5, 2018 at 11:45 AM, Sean Busbey
>> >> 
>>  wrote:
>> 
>> > FYI, no images make it through ASF mailing lists. I presume the
>> image
>> >> was
>> > of the git history? If that's correct, here's what that looks like
>> in
>> >> a
>> > paste:
>> >
>> > https://paste.apache.org/eRix
>> >
>> > There are no force pushes on trunk, so backing the change out would
>>  require
>> > the PMC asking INFRA to unblock force pushes for a period of time.
>> >
>> > Probably the merge commit isn't a big enough deal to do that. There
>> >> was a
>> > merge commit ~5 months ago for when YARN-6592 merged into trunk.
>> >
>> > So I'd say just try to avoid doing it in the future?
>> >
>> > -busbey
>> >
>> > On Thu, Jul 5, 2018 at 1:31 PM, Giovanni Matteo Fumarola <
>> > giovanni.fumar...@gmail.com> wrote:
>> >
>> >> Hi folks,
>> >>
>> >> After I pushed something on trunk a merge commit showed up in the
>> > history. *My
>> >> bad*.
>> >>
>> >>
>> >>
>> >> Since it was one of my first patches, I ran a few tests on my
>> >> machine
>> >> before checking in.
>> >> While I was running all the tests, someone else checked in. I
>> >> correctly
>> >> pulled all the new changes.
>> >>
>> >> Even before I did the "git push" there was no merge commit in my
>>  history.
>> >>
>> >> Can someone help me reverting this change?
>> >>
>> >> Thanks
>> >> Giovanni
>> >>
>> >>
>> >>
>> >
>> >
>> > --
>> > busbey
>> >
>> 
>> >>
>> >>
>> >>
>> >> --
>> >> busbey
>> >>
>>
>>



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Anu Engineer
+1 on the Non-Routable idea. We like it so much that we added it to the Ozone 
roadmap.
https://issues.apache.org/jira/browse/HDDS-231

If there is consensus on bringing this to Hadoop in general, we can build this 
feature in common.
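
If we do, a minimal sketch of the check could look like the following. This is 
an illustration only; the class and method names are hypothetical, and it 
assumes commons-net (SubnetUtils) on the classpath:

    import org.apache.commons.net.util.SubnetUtils;

    public class NonRoutableWhitelist {
      // IANA non-routable ranges, matching the defaults discussed below.
      private static final String[] DEFAULT_WHITELIST = {
          "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12",
          "192.168.0.0/16", "169.254.0.0/16" };

      public static boolean isAllowed(String clientIp) {
        for (String cidr : DEFAULT_WHITELIST) {
          if (new SubnetUtils(cidr).getInfo().isInRange(clientIp)) {
            return true;   // inside a whitelisted (non-routable) subnet
          }
        }
        return false;      // routable address: reject unless explicitly allowed
      }
    }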

--Anu


On 7/5/18, 1:09 PM, "Sean Busbey"  wrote:

I really, really like the approach of defaulting to only non-routable
IPs allowed. It seems like a good tradeoff for complexity of
implementation, pain to reconfigure, and level of protection.

On Thu, Jul 5, 2018 at 2:25 PM, Todd Lipcon  
wrote:
> The approach we took in Apache Kudu is that, if Kerberos hasn't been
> enabled, we default to a whitelist of subnets. The default whitelist is
> 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16 which
> matches the IANA "non-routable IP" subnet list.
>
> In other words, out-of-the-box, you get a deployment that works fine 
within
> a typical LAN environment, but won't allow some remote hacker to locate
> your cluster and access your data. We thought this was a nice balance
> between "works out of the box without lots of configuration" and "decent
> security". In my opinion a "localhost-only by default" would be be overly
> restrictive since I'd usually be deploying on some datacenter or EC2
> machine and then trying to access it from a client on my laptop.
>
> We released this first a bit over a year ago if my memory serves me, and
> we've had relatively few complaints or questions about it. We also made
> sure that the error message that comes back to clients is pretty
> reasonable, indicating the specific configuration that is disallowing
> access, so if people hit the issue on upgrade they had a clear idea what 
is
> going on.
>
> Of course it's not foolproof, since as Eric says, you're still likely open
> to the entirety of your corporation, and you may not want that, but as he
> also pointed out, that might be true even if you enable Kerberos
> authentication.
>
> -Todd
>
> On Thu, Jul 5, 2018 at 11:38 AM, Eric Yang  wrote:
>
>> Hadoop's default configuration aims for user friendliness to increase
>> adoption, with security features enabled one by one.  This approach is most
>> problematic for security because the system can be compromised before all
>> security features are turned on.
>> Larry's proposal will add some safety by reminding the system admin if security
>> is disabled.  However, reducing the number of knobs on security configs is
>> likely required to make the system secure enough for the banner idea to work
>> without writing too much guessing logic to determine if the UI is secured.
>> Penetration testing can provide better insight into what hasn't been secured, to
>> Penetration test can provide better insights of what hasn't been secured 
to
>> improve the next release.  Thankfully most Hadoop vendors have done this
>> work periodically to help the community secure Hadoop.
>>
>> There are plenty of company advertised if you want security, use
>> Kerberos.  This statement is not entirely true.  Kerberos makes security
>> more difficult to crack for external parties, but it shouldn't be the 
only
>> method to secure Hadoop.  When the Kerberos environment is larger than
>> Hadoop cluster, anyone within Kerberos environment can access Hadoop
>> cluster freely without restriction.  In large scale enterprises or some
>> cloud vendors that sublet their resources, this might not be acceptable.
>>
>> From my point of view, a secure Hadoop release must default all settings
>> to localhost only and allow users to add more hosts through authorized
>> white list of servers.  This will keep security perimeter in check.  All
>> wild card ACLs will need to be removed or default to current user/current
>> host only.  Proxy user/host ACL list must be enforced on http channels.
>> This is basically realigning the default configuration to single node
>> cluster or firewalled configuration.
>>
>> Regards,
>> Eric
>>
>> On 7/5/18, 8:24 AM, "larry mccay"  wrote:
>>
>> Hi Steve -
>>
>> This is a long overdue DISCUSS thread!
>>
>> Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
>> ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the
>> warning to get to the page like SSL exceptions in the browser do?
>> Similar tactic for UI access without SSL?
>> A new AuthenticationFilter can be added to the filter chains that blocks
>> API calls unless explicitly configured to be open, and obviously log a
>> similar message?
>>
>> thanks,
>>
>> --larry
>>
>>
>>
>>
>> On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <
>> ste...@hortonworks.com>
>> wrote:
>>
>> > Bitcoins are profitable enough to justify writing 

Re: Merge branch commit in trunk by mistake

2018-07-05 Thread Anu Engineer
Based on conversations with Giovanni and Subru, I have pushed a revert for this 
merge.

Thanks
Anu


On 7/5/18, 12:55 PM, "Giovanni Matteo Fumarola"  
wrote:

+ common-dev and hdfs-dev as fyi.

Thanks Subru and Sean for the answer.

On Thu, Jul 5, 2018 at 12:14 PM, Subru Krishnan  wrote:

> Looking at the merge commit, I feel it's better to reset/force push
> especially since this is still the latest commit on trunk.
>
> I have raised an INFRA ticket requesting the same:
> https://issues.apache.org/jira/browse/INFRA-16727
>
> -S
>
> On Thu, Jul 5, 2018 at 11:45 AM, Sean Busbey 
> wrote:
>
>> FYI, no images make it through ASF mailing lists. I presume the image was
>> of the git history? If that's correct, here's what that looks like in a
>> paste:
>>
>> https://paste.apache.org/eRix
>>
>> There are no force pushes on trunk, so backing the change out would
>> require
>> the PMC asking INFRA to unblock force pushes for a period of time.
>>
>> Probably the merge commit isn't a big enough deal to do that. There was a
>> merge commit ~5 months ago for when YARN-6592 merged into trunk.
>>
>> So I'd say just try to avoid doing it in the future?
>>
>> -busbey
>>
>> On Thu, Jul 5, 2018 at 1:31 PM, Giovanni Matteo Fumarola <
>> giovanni.fumar...@gmail.com> wrote:
>>
>> > Hi folks,
>> >
>> > After I pushed something on trunk a merge commit showed up in the
>> history. *My
>> > bad*.
>> >
>> >
>> >
>> > Since it was one of my first patches, I ran a few tests on my machine
>> > before checking in.
>> > While I was running all the tests, someone else checked in. I correctly
>> > pulled all the new changes.
>> >
>> > Even before I did the "git push" there was no merge commit in my
>> history.
>> >
>> > Can someone help me reverting this change?
>> >
>> > Thanks
>> > Giovanni
>> >
>> >
>> >
>>
>>
>> --
>> busbey
>>
>
>




Re: [VOTE] Merge ContainerIO branch (HDDS-48) in to trunk

2018-06-29 Thread Anu Engineer
+1, I have code reviewed many of these changes and it is an essential set of 
changes for HDDS/Ozone.
Thank you for getting this done.

Thanks
Anu


On 6/29/18, 3:14 PM, "Bharat Viswanadham"  wrote:

Fixing subject line of the mail.


Thanks,
Bharat



On 6/29/18, 3:10 PM, "Bharat Viswanadham"  
wrote:

Hi All,

Given the positive response to the discussion thread [1], here is the 
formal vote thread to merge HDDS-48 into trunk.

Summary of code changes:
1. Code changes for this branch are done in the hadoop-hdds subproject 
and hadoop-ozone subproject, there is no impact to hadoop-hdfs.
2. Added support for multiple container types in the datanode code path.
3. Added disk layout logic for the containers to support future 
upgrades.
4. Added support for a volume choosing policy to distribute containers 
across disks on the datanode.
5. Changed the format of the .container file to a human-readable format 
(yaml)


 The vote will run for 7 days, ending Fri July 6th. I will start this 
vote with my +1.

Thanks,
Bharat

[1] 
https://lists.apache.org/thread.html/79998ebd2c3837913a22097102efd8f41c3b08cb1799c3d3dea4876b@%3Chdfs-dev.hadoop.apache.org%3E




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org




Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-19 Thread Anu Engineer
Hi Owen,

   Thanks for the proposal. I was hoping for the same releases, but I am okay with 
different releases as well. 
   @Konstantin, I am completely open to the name changes, let us discuss that 
in HDFS-10419 
  and we can make the corresponding change.

--Anu


On 3/19/18, 10:52 AM, "Owen O'Malley"  wrote:

Andrew and Daryn,
   Do you have any feedback on the proposal? Otherwise, we can start a vote
for "adoption of new codebase" tomorrow.

.. Owen

On Wed, Mar 14, 2018 at 1:50 PM, Owen O'Malley 
wrote:

> This discussion seems to have died down coming closer consensus without a
> resolution.
>
> I'd like to propose the following compromise:
>
> * HDSL become a subproject of Hadoop.
> * HDSL will release separately from Hadoop. Hadoop releases will not
> contain HDSL and vice versa.
> * HDSL will get its own jira instance so that the release tags stay
> separate.
> * On trunk (as opposed to release branches) HDSL will be a separate module
> in Hadoop's source tree. This will enable the HDSL to work on their trunk
> and the Hadoop trunk without making releases for every change.
> * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
> * When Hadoop creates a release branch, the RM will delete the HDSL module
> from the branch.
> * HDSL will have their own Yetus checks and won't cause failures in the
> Hadoop patch check.
>
> I think this accomplishes most of the goals of encouraging HDSL
> development while minimizing the potential for disruption of HDFS
> development.
>
> Thoughts? Andrew, Jitendra, & Sanjay?
>
> Thanks,
>Owen
>




Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-02 Thread Anu Engineer
Hi Owen,

  >> 1. It is hard to tell what has changed. git rebase -i tells me the
  >> branch has 722 commits. The rebase failed with a conflict. It would really
   >> help if you rebased to current trunk.

Thanks for the comments. I have merged trunk to HDFS-7240 branch. 
Hopefully, this makes it easy to look at the changes; I have committed the 
change required to fix the conflict as a separate commit to make it easy for 
you to see.

Thanks
Anu


On 3/2/18, 4:42 PM, "Wangda Tan"  wrote:

I like the idea of same source / same release and putting Ozone's source under
a different directory.

Like Owen mentioned, it is going to be important for all parties to keep a
regular and shorter release cycle for Hadoop, e.g. 3-4 months between minor
releases. Users can try features and give feedback to stabilize features
earlier; developers can be happier since their efforts will be consumed by users
soon after features get merged. In addition to this, if features are merged to
trunk after reasonable tests/review, Andrew's concern may not be a problem
anymore:

bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements.

Thanks,
Wangda


On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley 
wrote:

> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang 
> wrote:
>
> Owen mentioned making a Hadoop subproject; we'd have to
> > hash out what exactly this means (I assume a separate repo still managed
> by
> > the Hadoop project), but I think we could make this work if it's more
> > attractive than incubation or a new TLP.
>
>
> Ok, there are multiple levels of sub-projects that all make sense:
>
>- Same source tree, same releases - examples like HDFS & YARN
>- Same master branch, separate releases and release branches - Hive's
>Storage API vs Hive. It is in the source tree for the master branch, 
but
>has distinct releases and release branches.
>- Separate source, separate release - Apache Commons.
>
> There are advantages and disadvantages to each. I'd propose that we use 
the
> same source, same release pattern for Ozone. Note that we tried and later
> reverted doing Common, HDFS, and YARN as separate source, separate release
> because it was too much trouble. I like Daryn's idea of putting it as a 
top
> level directory in Hadoop and making sure that nothing in Common, HDFS, or
> YARN depend on it. That way if a Release Manager doesn't think it is ready
> for release, it can be trivially removed before the release.
>
> One thing about using the same releases, Sanjay and Jitendra are signing 
up
> to make much more regular bugfix and minor releases in the near future. 
For
> example, they'll need to make 3.2 relatively soon to get it released and
> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
> project. Hadoop needs more regular releases and fewer big bang releases.
>
> .. Owen
>




Namenode RPC default port reverted to 8020

2018-02-06 Thread Anu Engineer
Hi All,

I wanted to bring to your attention that HDFS-12990 has been committed to trunk 
and branch 3.0.1.

This change reverts the Namenode RPC port to the familiar 8020, making it the same 
as in the Apache Hadoop 2.x series.
In the Hadoop 3.0.0 release, the default port is 9820. If you have deployed Hadoop 
3.0.0, then please reconfigure the Namenode port to 8020.
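
(In a typical deployment this means updating the port in fs.defaultFS in 
core-site.xml, e.g. hdfs://<namenode-host>:8020, and, where it is set 
explicitly, dfs.namenode.rpc-address in hdfs-site.xml.)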

If you have not deployed Hadoop 3.0.0, we recommend waiting for Hadoop 3.0.1, 
which is planned to be released in the next two weeks.

Thanks
Anu


Re: Apache Hadoop 3.0.1 Release plan

2018-02-01 Thread Anu Engineer
Hi Eddy,

Thanks for driving this release. Just a quick question, do we have time to 
close this issue? 
https://issues.apache.org/jira/browse/HDFS-12990

or are we abandoning it? I believe that this is the last window for us to fix 
this issue.

Should we have a call and get this resolved one way or another?

Thanks
Anu

On 2/1/18, 10:51 AM, "Lei Xu"  wrote:

Hi, All

I just cut branch-3.0.1 from branch-3.0.  Please make sure all patches
targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.

Thanks!
Eddy

On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:
> Hi, All
>
> We have released Apache Hadoop 3.0.0 in December [1]. To further
> improve the quality of release, we plan to cut branch-3.0.1 branch
> tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
> of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
> [2].  No new features and improvement should be included.
>
> We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
> 1st, targeting for Feb 9th release.
>
> Please feel free to share your insights.
>
> [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> [2] https://issues.apache.org/jira/issues/?filter=12342842
>
> Best,
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera



-- 
Lei (Eddy) Xu
Software Engineer, Cloudera

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


[jira] [Created] (HADOOP-15204) Add Configuration API for parsing storage sizes

2018-01-31 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-15204:
-

 Summary: Add Configuration API for parsing storage sizes
 Key: HADOOP-15204
 URL: https://issues.apache.org/jira/browse/HADOOP-15204
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 3.1.0
Reporter: Anu Engineer
Assignee: Anu Engineer
 Fix For: 3.1.0


Hadoop has a lot of configurations that specify memory and disk size. This JIRA 
proposes to add an API like {{Configuration.getStorageSize}} which will allow users
to specify units like KB, MB, GB etc. This JIRA is inspired by HDFS-8608 and 
Ozone. Adding {{getTimeDuration}} support was a great improvement for the Ozone 
code base; this JIRA hopes to do the same thing for configs that deal with disk 
and memory usage.
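
The intended usage would be along these lines. This is a sketch of the proposed 
API only; the method and enum names may change as the patch evolves, and the 
config key is hypothetical:

    import org.apache.hadoop.conf.Configuration;
    // StorageUnit is part of this proposal and does not exist yet.

    public class StorageSizeExample {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("dfs.example.buffer.size", "128MB");   // unit suffix in the value
        // Read the value back normalized to a requested unit, with a default:
        double mb = conf.getStorageSize("dfs.example.buffer.size", "64MB",
            StorageUnit.MB);
        System.out.println(mb);   // prints 128.0
      }
    }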



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15128) TestViewFileSystem tests are broken in trunk

2017-12-18 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-15128:
-

 Summary: TestViewFileSystem tests are broken in trunk
 Key: HADOOP-15128
 URL: https://issues.apache.org/jira/browse/HADOOP-15128
 Project: Hadoop Common
  Issue Type: Bug
  Components: viewfs
Affects Versions: 3.1.0
Reporter: Anu Engineer
Assignee: Hanisha Koneru


The fix in HADOOP-10054 seems to have caused a test failure. Please take a 
look. Thanks [~eyang] for reporting this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.9.0 (RC2)

2017-11-13 Thread Anu Engineer
-1 (binding)

Thank you for all the hard work on 2.9 series. Unfortunately, this is one of 
the times I have to -1 this release.

Looks like HADOOP-14840 added a dependency on “oj! Algorithms - version 43.0”, 
but we have only added the name “oj! Algorithms - version 43.0” to the 
“LICENSE.txt”. The right addition to the LICENSE.txt should contain the 
original MIT License, especially “Copyright (c) 2003-2017 Optimatika”.

Please take a look at https://github.com/optimatika/ojAlgo/blob/master/LICENSE

I am a +1 after this is fixed.

Thanks 
Anu




On 11/13/17, 9:50 AM, "Sunil G"  wrote:

+1 (binding)

Deployed cluster built from source.



   - Tested a few cases in an HA cluster and tried to do failover by using
   rmadmin commands etc. This seems to work fine, including submitting apps.
   - I also tested many MR apps and all are running fine w/o any issues.
   - Majorly tested below feature sanity too (works fine)
  - Application priority
  - Application timeout
   - Tested basic NodeLabel scenarios.
  - Added some labels to couple of nodes
  - Verified old UI for labels
  - Submitted apps to labelled cluster and it works fine.
  - Also performed few cli commands related to nodelabel
   - Verified new YARN UI and accessed various pages when cluster was in
   use. It seems fine to me.


Thanks all folks who participated in this release, appreciate the same!

- Sunil


On Mon, Nov 13, 2017 at 3:01 AM Subru Krishnan  wrote:

> Hi Folks,
>
> Apache Hadoop 2.9.0 is the first release of the Hadoop 2.9 line and will be the
> starting release for the Apache Hadoop 2.9.x line - it includes 30 new features
> with 500+ subtasks, 407 improvements, and 790 bug fixes, all newly fixed since
> 2.8.2.
>
> More information about the 2.9.0 release plan can be found here:
> *
> 
https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> <
> 
https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> >*
>
> New RC is available at: http://home.apache.org/~asuresh/hadoop-2.9.0-RC2/
> <
> 
http://www.google.com/url?q=http%3A%2F%2Fhome.apache.org%2F~asuresh%2Fhadoop-2.9.0-RC1%2F=D=1=AFQjCNE7BF35IDIMZID3hPqiNglWEVsTpg
> >
>
> The RC tag in git is: release-2.9.0-RC2, and the latest commit id is:
> 1eb05c1dd48fbc9e4b375a76f2046a59103bbeb1.
>
> The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1067/
> <
> 
https://www.google.com/url?q=https%3A%2F%2Frepository.apache.org%2Fcontent%2Frepositories%2Forgapachehadoop-1066=D=1=AFQjCNFcern4uingMV_sEreko_zeLlgdlg
> >
>
> Please try the release and vote; the vote will run for the usual 5 days,
> ending on Friday 17th November 2017 2pm PT time.
>
> We want to give a big shout out to Sunil, Varun, Rohith, Wangda, Vrushali
> and Inigo for the extensive testing/validation which helped prepare for
> RC2. Do report your results in this vote as it'll be very useful to the
> entire community.
>
> Thanks,
> -Subru/Arun
>




Re: 答复: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-20 Thread Anu Engineer
Hi Steve,

In addition to everything Weiwei mentioned (chapter 3 of the user guide), if you 
really want to drill down to the REST protocol you might want to apply this patch 
and build Ozone.

https://issues.apache.org/jira/browse/HDFS-12690

This will generate an Open API (https://www.openapis.org , http://swagger.io) 
based specification which can be accessed from the KSM UI or just as a JSON file.
Unfortunately, this patch is still in the code review stage, so you will have to 
apply the patch and build it yourself. 

Thanks
Anu


On 10/20/17, 6:09 AM, "Yang Weiwei"  wrote:

Hi Steve


The code is available in the HDFS-7240 feature branch, public git repo here.

I am not sure if there is a "public" API for object stores, but the design doc
uses the most common syntax so I believe it should be compliant. You can find the
REST API doc here (with some example usages), and the command-line API here.


Look forward for your feedback!


--Weiwei



From: Steve Loughran 
Sent: October 20, 2017 11:49
To: Yang Weiwei
Cc: hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; common-dev@hadoop.apache.org
Subject: Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk


Wow, big piece of work

1. Where is a PR/branch on github with rendered docs for us to look at?
2. Have you made any public APi changes related to object stores? That's 
probably something I'll have opinions on more than implementation details.

thanks

> On 19 Oct 2017, at 02:54, Yang Weiwei  wrote:
>
> Hello everyone,
>
>
> I would like to start this thread to discuss merging Ozone (HDFS-7240) to 
trunk. This feature implements an object store which can co-exist with HDFS. 
Ozone is disabled by default. We have tested Ozone with cluster sizes varying 
from 1 to 100 data nodes.
>
>
>
> The merge payload includes the following:
>
>  1.  All services, management scripts
>  2.  Object store APIs, exposed via both REST and RPC
>  3.  Master service UIs, command line interfaces
>  4.  Pluggable pipeline Integration
>  5.  Ozone File System (Hadoop compatible file system implementation, 
passes all FileSystem contract tests)
>  6.  Corona - a load generator for Ozone.
>  7.  Essential documentation added to Hadoop site.
>  8.  Version specific Ozone Documentation, accessible via service UI.
>  9.  Docker support for ozone, which enables faster development cycles.
>
>
> To build Ozone and run Ozone using Docker, please follow the instructions in 
this wiki page. 
https://cwiki.apache.org/confluence/display/HADOOP/Dev+cluster+with+docker.



>
>
> We have built a passionate and diverse community to drive this feature 
development. As a team, we have achieved significant progress in the past 3 years 
since the first JIRA for HDFS-7240 was opened in Oct 2014. So far, we have resolved 
almost 400 JIRAs by 20+ contributors/committers from different countries and 
affiliations. We also want to thank the large number of community members who 
were supportive of our efforts and contributed ideas and participated in the 
design of Ozone.
>
>
> Please share your thoughts, thanks!
>
>
> -- Weiwei Yang




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: Access to confluence wiki

2017-09-11 Thread Anu Engineer
Can I please get access too? Confluence user name: anu

Thanks
Anu





On 9/11/17, 4:08 PM, "Arun Suresh"  wrote:

>I've added Subru and Chris.
>
>@Vrushali, I could not find your username. I think you need to sign up to
>confluence first with your apache id.
>
>Cheers
>-Arun
>
>
>
>On Mon, Sep 11, 2017 at 3:36 PM, Vrushali C  wrote:
>
>> I would like to get access to the Hadoop Wiki as well. My apache id is
>> "vrushali".
>>
>> thanks
>> Vrushali
>>
>>
>> On Mon, Sep 11, 2017 at 3:33 PM, Subru Krishnan  wrote:
>>
>> > Hi,
>> >
>> > Can I get access to the Hadoop wiki. My confluence is "subru".
>> >
>> > TIA.
>> >
>> > -Subru
>> >
>>


Re: [DISCUSS] Looking to Apache Hadoop 3.1 release

2017-09-06 Thread Anu Engineer
Hi Wangda,

We are planning to start the Ozone merge discussion by the end of this month. I 
am hopeful that it will be merged pretty soon after that. 
Please add Ozone to the list of features that are being tracked for Apache 
Hadoop 3.1. 

We would love to release Ozone as an alpha feature in Hadoop 3.1.

Thanks
Anu


On 9/6/17, 2:26 PM, "Arun Suresh"  wrote:

>Thanks for starting this Wangda.
>
>I would also like to add:
>- YARN-5972: Support Pausing/Freezing of opportunistic containers
>
>Cheers
>-Arun
>
>On Wed, Sep 6, 2017 at 1:49 PM, Steve Loughran 
>wrote:
>
>>
>> > On 6 Sep 2017, at 19:13, Wangda Tan  wrote:
>> >
>> > Hi all,
>> >
>> > As we discussed on [1], there were proposals from Steve / Vinod etc to
>> have
>> > a faster cadence of releases and to start thinking of a Hadoop 3.1
>> release
>> > earlier than March 2018 as is currently proposed.
>> >
>> > I think this is a good idea. I'd like to start the process sooner, and
>> > establish timeline etc so that we can be ready when 3.0.0 GA is out. With
>> > this we can also establish faster cadence for future Hadoop 3.x releases.
>> >
>> > To this end, I propose to target Hadoop 3.1.0 for a release by mid Jan
>> > 2018. (About 4.5 months from now and 2.5 months after 3.0-GA, instead of
>> > 6.5 months from now).
>> >
>> > I'd also want to take this opportunity to come up with a more elaborate
>> > release plan to avoid some of the confusion we had with 3.0 beta. General
>> > proposal for the timeline (per this other proposal [2])
>> > - Feature freeze date - all features should be merged by Dec 15, 2017.
>> > - Code freeze date - blockers/critical only, no more improvements and non
>> > blocker/critical bug-fixes: Jan 1, 2018.
>> > - Release date: Jan 15, 2018
>> >
>> > Following is a list of features on my radar which could be candidates
>> for a
>> > 3.1 release:
>> > - YARN-5734, Dynamic scheduler queue configuration. (Owner: Jonathan
>> Hung)
>> > - YARN-5881, Add absolute resource configuration to CapacityScheduler.
>> > (Owner: Sunil)
>> > - YARN-5673, Container-executor rewrite for better security,
>> extensibility
>> > and portability. (Owner Varun Vasudev)
>> > - YARN-6223, GPU isolation. (Owner: Wangda)
>> >
>> > And from email [3] mentioned by Andrew, there’re several other HDFS
>> > features want to be released with 3.1 as well, assuming they fit the
>> > timelines:
>> > - Storage Policy Satisfier
>> > - HDFS tiered storage
>> >
>> > Please let me know if I missed any features targeted to 3.1 per this
>> > timeline.
>>
>>
>> HADOOP-13786 : S3Guard committer, which also adds resilience to failures
>> talking to S3 (we barely have any today),
>>
>> >
>> > And I want to volunteer myself as release manager of 3.1.0 release.
>> Please
>> > let me know if you have any suggestions/concerns.
>>
>> well volunteered :)
>>
>> >
>> > Thanks,
>> > Wangda Tan
>> >
>> > [1] http://markmail.org/message/hwar5f5ap654ck5o?q=
>> > Branch+merges+and+3%2E0%2E0-beta1+scope
>> > [2] http://markmail.org/message/hwar5f5ap654ck5o?q=Branch+
>> > merges+and+3%2E0%2E0-beta1+scope#query:Branch%20merges%
>> > 20and%203.0.0-beta1%20scope+page:1+mid:2hqqkhl2dymcikf5+state:results
>> > [3] http://markmail.org/message/h35obzqrh3ag6dgn?q=Branch+merge
>> > s+and+3%2E0%2E0-beta1+scope


Re: DISCUSS: Hadoop Compatability Guidelines

2017-09-05 Thread Anu Engineer
Could you please attach the PDFs to the JIRA? I think the mailer is stripping 
them from the mail.

Thanks
Anu





On 9/5/17, 9:44 AM, "Daniel Templeton"  wrote:

>Resending with a broader audience, and reattaching the PDFs.
>
>Daniel
>
>On 9/4/17 9:01 AM, Daniel Templeton wrote:
>> All, in prep for Hadoop 3 beta 1 I've been working on updating the 
>> compatibility guidelines on HADOOP-13714.  I think the initial doc is 
>> more or less complete, so I'd like to open the discussion up to the 
>> broader Hadoop community.
>>
>> In the new guidelines, I have drawn some lines in the sand regarding 
>> compatibility between releases.  In some cases these lines are more 
>> restrictive than the current practices.  The intent with the new 
>> guidelines is not to limit progress by restricting what goes into a 
>> release, but rather to drive release numbering to keep in line with 
>> the reality of the code.
>>
>> Please have a read and provide feedback on the JIRA.  I'm sure there 
>> are more than a couple of areas that could be improved.  If you'd 
>> rather not read markdown from a diff patch, I've attached PDFs of the 
>> two modified docs.
>>
>> Thanks!
>> Daniel
>
>

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: LinkedIn Dynamometer Tool (was About 2.7.4 Release)

2017-07-20 Thread Anu Engineer
Hi Erik,

Looking forward to the release of this tool. Thank you very much for the 
contribution.

Had a couple of questions about how the tool works.

1. Would you be able to provide the traces along with this tool? In other 
words, would I be able to use this out of the box, or do I have to build up 
traces myself? 

2. Could you explain how the “fake out DNs into thinking they are storing data” 
logic works? Or I can be patient and read your blog post.

Thanks
Anu






On 7/20/17, 10:42 AM, "Erik Krogen"  wrote:

>forking off of the 2.7.4 release thread to answer this question about
>Dynamometer
>
>Dynamometer is a tool developed at LinkedIn for scale testing HDFS,
>specifically the NameNode. We have been using it for some time now and have
>recently been making some enhancements to ease of use and reproducibility.
>We hope to post a blog post sometime in the not-too-distant future, and
>also to open source it. I can provide some details here given that we have
>been leveraging it as part of our 2.7.4 release / upgrade process (in
>addition to previous upgrades).
>
>The basic idea is to get full-scale black-box testing of the HDFS NN while
>using significantly less (~10%) hardware than a real cluster of that size
>would require. We use real NN images from our at-scale clusters paired with
>some logic to fake out DNs into thinking they are storing data when they
>are not, allowing us to stuff more DNs onto each machine. Since we use a
>real image, we can replay real traces (collected from audit logs) to
>compare actual production performance vs. performance on this simulated
>cluster (with additional tuning, different version, etc.). We leverage YARN
>to manage setting up this cluster and to replay the traces.
>
>Happy to answer questions.
>
>Erik
>
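A minimal sketch of the audit-log parsing that this kind of trace replay implies, assuming the standard HDFS audit log key=value line format (illustrative only; this is not Dynamometer's actual code):

{code:title=AuditLineParser.java (sketch)}
import java.util.HashMap;
import java.util.Map;

// Parses one audit log line such as:
//   "ugi=hrt_qa ip=/10.0.0.1 cmd=open src=/data/f1 dst=null perm=null"
// into key/value fields; "cmd" and "src" would drive the replayed operation.
public class AuditLineParser {
  public static Map<String, String> parse(String line) {
    Map<String, String> fields = new HashMap<>();
    for (String token : line.split("\\s+")) {
      int eq = token.indexOf('=');
      if (eq > 0) {
        fields.put(token.substring(0, eq), token.substring(eq + 1));
      }
    }
    return fields;
  }

  public static void main(String[] args) {
    Map<String, String> f =
        parse("ugi=hrt_qa ip=/10.0.0.1 cmd=open src=/data/f1 dst=null perm=null");
    System.out.println(f.get("cmd") + " " + f.get("src"));  // prints: open /data/f1
  }
}
{code}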
>On Wed, Jul 19, 2017 at 5:05 PM, Konstantin Shvachko 
>wrote:
>
>> Hi Tianyi,
>>
>> Glad you are interested in Dynamometer. Erik (CC-ed) is actively working
>> on this project right now, I'll let him elaborate.
>> Erik, you should probably respond on Apache dev list, as I think it could
>> be interesting for other people as well, asince we planned to open source
>> it. You can fork the "About 2.7.4 Release" thread with a new subject and
>> give some details about Dynamometer there.
>>
>> Thanks,
>> --Konstantin
>>
>> On Wed, Jul 19, 2017 at 1:40 AM, 何天一  wrote:
>>
>>> Hi, Shvachko.
>>>
>>> You mentioned an internal tool called Dynamometer to test NameNode
>>> performance earlier in the 2.7.4 release thread.
>>> I wonder if you could share some ideas behind the tool. Or is there a
>>> plan to bring Dynamometer to open source community?
>>>
>>> Thanks.
>>>
>>> BR,
>>> Tianyi
>>>
>>> On Fri, Jul 14, 2017 at 8:45 AM Konstantin Shvachko 
>>> wrote:
>>>
 Hi everybody.

 We have been doing some internal testing of Hadoop 2.7.4. The testing is
 going well.
 Did not find any major issues on our workloads.
 Used an internal tool called Dynamometer to check NameNode performance on
 real cluster traces. Good.
 Overall test cluster performance looks good.
 Some more testing is still going on.

 I plan to build an RC next week, if there are no objections.

 Thanks,
 --Konst

 On Thu, Jun 15, 2017 at 4:42 PM, Konstantin Shvachko <
 shv.had...@gmail.com>
 wrote:

 > Hey guys.
 >
 > An update on 2.7.4 progress.
 > We are down to 4 blockers. There is some work remaining on those.
 > https://issues.apache.org/jira/browse/HDFS-11896?filter=12340814
 > Would be good if people could follow up on review comments.
 >
 > I looked through nightly Jenkins build results for 2.7.4 both on Apache
 > Jenkins and internal.
 > Some tests fail intermittently, but there are no consistent failures. I
 filed
 > HDFS-11985 to track some of them.
 > https://issues.apache.org/jira/browse/HDFS-11985
 > I do not currently consider these failures as blockers. LMK if some of
 > them are.
 >
 > We started internal testing of branch-2.7 on one of our smallish (100+
 > nodes) test clusters.
 > Will update on the results.
 >
 > There is a plan to enable BigTop for 2.7.4 testing.
 >
 > Akira, Brahma thank you for setting up a wiki page for 2.7.4 release.
 > Thank you everybody for contributing to this effort.
 >
 > Regards,
 > --Konstantin
 >
 >
 > On Tue, May 30, 2017 at 12:08 AM, Akira Ajisaka 
 > wrote:
 >
 >> Sure.
 >> If you want to edit the wiki, please tell me your ASF confluence
 account.
 >>
 >> -Akira
 >>
 >> On 2017/05/30 15:31, Rohith Sharma K S wrote:
 >>
 >>> Couple of more JIRAs need to be back ported for 2.7.4 release. These
 will
 >>> solve RM HA unstability issues.
 >>> https://issues.apache.org/jira/browse/YARN-5333
 >>> 

Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-04-17 Thread Anu Engineer
Hi Allen, 

https://issues.apache.org/jira/browse/INFRA-13902

That happened with the ozone branch too. It was an inadvertent force push. Infra 
has advised us to force push the latest branch if you have it.

Thanks
Anu


On 4/17/17, 7:10 AM, "Allen Wittenauer"  wrote:

>Looks like someone reset HEAD back to Mar 31. 
>
>Sent from my iPad
>
>> On Apr 16, 2017, at 12:08 AM, Apache Jenkins Server 
>>  wrote:
>> 
>> For more details, see 
>> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/378/
>> 
>> 
>> 
>> 
>> 
>> -1 overall
>> 
>> 
>> The following subsystems voted -1:
>>docker
>> 
>> 
>> Powered by Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org
>> 
>> 
>> 
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>
>-
>To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>



Re: H9 build slave is bad

2017-03-08 Thread Anu Engineer
Agreed, but I was under the impression that we would kill the container under 
OOM conditions and not the whole base machine.

Thanks
Anu


On 3/8/17, 2:41 PM, "Allen Wittenauer" <a...@effectivemachines.com> wrote:

>
>> On Mar 8, 2017, at 2:21 PM, Anu Engineer <aengin...@hortonworks.com> wrote:
>> 
>> Hi Allen,
>>> Likely something in the HDFS-7240 branch or with this patch that's 
>>> doing Bad Things (tm).
>> 
>> Thanks for bringing this to my attention, but I am surprised that an mvn 
>> command is able to kill a test machine.
>
>   FWIW, it’s pretty trivial to tip Linux over under low memory conditions…



Re: H9 build slave is bad

2017-03-08 Thread Anu Engineer
Hi Allen,
>   Likely something in the HDFS-7240 branch or with this patch that's 
> doing Bad Things (tm).

Thanks for bringing this to my attention, but I am surprised that an mvn command 
is able to kill a test machine.

I have pasted the call stack from the issue that you pointed out as the root 
cause; can you please help me understand what you think the root cause is?  
If anyone can give me pointers on how to access the H9 machine, I would love to 
take a look.

From the console logs, I am not able to see why this run could kill the H9 machine 
(even if we assume the test is able to kill the container, I doubt that rendering 
the H9 machine inoperable is related to the patch).
Let us for a second assume that what you are saying is true and HDFS-7240 is 
somehow killing these machines: why is this happening only on H9? Are HDFS runs 
happening only on H9? 

Thanks
Anu

P.S. We are able to run this on local machines without any issues; I will try to 
run this inside a Docker container just to make sure that these tests are not 
doing something weird.

Stack traces from the console log: 

mvn -Dmaven.repo.local=/home/jenkins/yetus-m2/hadoop-HDFS-7240-patch-1 
-Ptest-patch -Pparallel-tests -P!shelltest -Pnative -Drequire.libwebhdfs 
-Drequire.snappy -Drequire.openssl -Drequire.fuse -Drequire.test.libhadoop 
-Pyarn-ui clean test -fae > 
/testptch/hadoop/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt 
2>&1
FATAL: command execution failed
java.io.IOException: Backing channel 'H9' is disconnected.
at 
hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:191)
at 
hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:256)
at com.sun.proxy.$Proxy104.isAlive(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:154)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:108)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:65)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
at hudson.model.Build$BuildExecution.build(Build.java:205)
at hudson.model.Build$BuildExecution.doRun(Build.java:162)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
at hudson.model.Run.execute(Run.java:1728)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:404)
Caused by: hudson.remoting.Channel$OrderlyShutdown
at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1121)
at hudson.remoting.Channel$1.handle(Channel.java:526)
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:83)
Caused by: Command close created at
at hudson.remoting.Command.<init>(Command.java:59)
at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1115)
at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1113)
at hudson.remoting.Channel.close(Channel.java:1273)
at hudson.remoting.Channel.close(Channel.java:1255)
at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1120)
... 2 more
Build step 'Execute shell' marked build as failure
ERROR: Step 'Archive the artifacts' failed: no workspace for 
PreCommit-HDFS-Build #18591
No JDK named 'jdk-1.8.0' found
[description-setter] Description set: HDFS-11451
ERROR: Step 'Publish JUnit test result report' failed: no workspace for 
PreCommit-HDFS-Build #18591
No JDK named 'jdk-1.8.0' found
Finished: FAILURE
These tests are indeed passing on the local boxes, so 


On 3/8/17, 12:04 PM, "Allen Wittenauer"  wrote:

>
>> On Mar 8, 2017, at 9:34 AM, Sean Busbey  wrote:
>> 
>> Is this HADOOP-13951?
>
>   Almost certainly.  Here's the run that broke it again:
>
>https://builds.apache.org/job/PreCommit-HDFS-Build/18591
>
>   Likely something in the HDFS-7240 branch or with this patch that's 
> doing Bad Things (tm).
>
>
>
>-
>To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>



Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

2017-01-20 Thread Anu Engineer
Hi Andrew, 

Thank you for all the hard work. I am really excited to see us making progress 
towards a 3.0 release.

+1 (Non-Binding)

1. Deployed the downloaded bits on a 4-node cluster with 1 Namenode and 3 
datanodes.
2. Verified all normal HDFS operations like create directory, create file, 
delete file, etc.
3. Ran MapReduce jobs - Pi and Wordcount
4. Verified that the Hadoop version command output is correct.

Thanks
Anu


On 1/20/17, 2:36 PM, "Andrew Wang"  wrote:

>Hi all,
>
>With heartfelt thanks to many contributors, the RC0 for 3.0.0-alpha2 is
>ready.
>
>3.0.0-alpha2 is the second alpha in the planned 3.0.0 release line leading
>up to a 3.0.0 GA. It comprises 857 fixes, improvements, and new features
>since alpha1 was released on September 3rd, 2016.
>
>More information about the 3.0.0 release plan can be found here:
>
>https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release
>
>The artifacts can be found here:
>
>http://home.apache.org/~wang/3.0.0-alpha2-RC0/
>
>This vote will run 5 days, ending on 01/25/2017 at 2PM pacific.
>
>I ran basic validation with a local pseudo cluster and a Pi job. RAT output
>was clean.
>
>My +1 to start.
>
>Thanks,
>Andrew



[jira] [Resolved] (HADOOP-13935) A command to print JAVA_VERSION used by Hadoop/HDFS

2016-12-23 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HADOOP-13935.
---
Resolution: Implemented

This feature is already implemented via the envvars command, which is part of 
HADOOP-12366.

> A command to print JAVA_VERSION used by Hadoop/HDFS
> ---
>
> Key: HADOOP-13935
> URL: https://issues.apache.org/jira/browse/HADOOP-13935
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.0.0-alpha2
>Reporter: Anu Engineer
>Assignee: Yiqun Lin
>  Labels: newbie
>
> The java version used by hadoop is controlled by the JAVA_HOME variable defined 
> in hadoop_env.sh. We log this information when HDFS starts in the log file. 
> {noformat}
> STARTUP_MSG:   java = 1.8.0_112
> {noformat}
> However, it is quite possible that a user might have many versions of java 
> installed on his/her machine. Generally, users tend to check for the java 
> version via 
> {noformat}
> java -version
> {noformat}
> This just means we are printing out the java version in the current shell 
> path.
> This jira proposes adding a simple new command, or an extension to the existing 
> hadoop version command, that also prints out the java version currently used by 
> hadoop.
> This avoids customer confusion when they are checking whether the java stack is 
> properly configured. For example, checking if JCE is installed correctly.
> This is a very minor change that can be done by modifying hdfs.cmd or the hdfs 
> shell script in the /bin directory. Thanks to [~sujit] for bringing this to 
> my attention.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13935) A command to print JAVA_VERSION used by Hadoop/HDFS

2016-12-22 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-13935:
-

 Summary: A command to print JAVA_VERSION used by Hadoop/HDFS
 Key: HADOOP-13935
 URL: https://issues.apache.org/jira/browse/HADOOP-13935
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.0.0-alpha2
Reporter: Anu Engineer


The java version used by hadoop is controlled by the JAVA_HOME variable defined in 
hadoop_env.sh. We log this information when HDFS starts in the log file. 
{noformat}
STARTUP_MSG:   java = 1.8.0_112
{noformat}

However, it is quite possible that a user might have many versions of java 
installed on his/her machine. Generally, users tend to check for the java 
version via 
{noformat}
java -version
{noformat}
This just means we are printing out the java version in the current shell path.

This jira proposes adding a simple new command, or an extension to the existing 
hadoop version command, that also prints out the java version currently used by 
hadoop.

This avoids customer confusion when they are checking whether the java stack is 
properly configured. For example, checking if JCE is installed correctly.

This is a very minor change that can be done by modifying hdfs.cmd or the hdfs 
shell script in the /bin directory. Thanks to [~sujit] for bringing this to my 
attention.
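
A minimal sketch of what such a command could print, using standard JVM system properties (illustrative only; the class name is hypothetical, and the JIRA itself suggests doing this in the hdfs shell script instead):

{code:title=JavaVersion.java (sketch)}
// Reports the JVM Hadoop is actually running on, independent of whatever
// "java" happens to be first on the shell PATH.
public class JavaVersion {
  public static void main(String[] args) {
    System.out.println("java.version = " + System.getProperty("java.version"));
    System.out.println("java.home    = " + System.getProperty("java.home"));
    System.out.println("java.vendor  = " + System.getProperty("java.vendor"));
  }
}
{code}

Running this under the JAVA_HOME that hadoop_env.sh sets would print the version hadoop actually uses, regardless of the shell path.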




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Merge HADOOP-13341

2016-09-09 Thread Anu Engineer
>  SUBCOMMAND is one of:
>
>
>Clients:
>   cacheadmin   configure the HDFS cache
>   classpathprints the class path needed to get the hadoop jar 
> and the required libraries
>   crypto   configure HDFS encryption zones
>   ...
>
>Daemons:
>   balancer run a cluster balancing utility
>   datanode run a DFS datanode
>   namenode run the DFS name node
>...
>---snip---


Absolutely, that is a great output, very clear and provides a very good user 
experience.

Thanks
Anu


On 9/9/16, 3:06 PM, "Allen Wittenauer" <a...@effectivemachines.com> wrote:

>
>> On Sep 9, 2016, at 2:15 PM, Anu Engineer <aengin...@hortonworks.com> wrote:
>> 
>> +1, thanks for the effort. It brings a world of consistency to the hadoop 
>> vars; and as usual, reading your bash code was very educational.
>
>   Thanks!
>
>   There's still a handful of HDFS and MAPRED vars that begin with HADOOP, 
> but those should be trivial to knock out after a pattern has been established.
>
>> I had a minor suggestion though. Since we have classified the _OPTS into 
>> client and daemon opts, for new people it is hard to know which of these 
>> subcommands is a daemon vs. a client command.  Maybe we can add a special 
>> char in the help message to indicate which are daemons, or just document it? 
>> The only way I know right now is to look at the appropriate script and see if 
>> HADOOP_SUBCMD_SUPPORTDAEMONIZATION is set to true.
>
>
>   That's a great suggestion.  Would it be better if the usage output was 
> more like:
>
>---snip---
>Usage: hdfs [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]
>
>  OPTIONS is none or any of:
>
>--buildpaths   attempt to add class files from build tree
>--config dir   Hadoop config directory
>--daemon (start|status|stop)   operate on a daemon
>--debugturn on shell script debug mode
>--help usage information
>--hostnames list[,of,host,names]   hosts to use in worker mode
>--hosts filename   list of hosts to use in worker mode
>--loglevel level   set the log4j level for this command
>--workers  turn on worker mode
>
>  SUBCOMMAND is one of:
>
>
>Clients:
>   cacheadmin   configure the HDFS cache
>   classpathprints the class path needed to get the hadoop jar 
> and the required libraries
>   crypto   configure HDFS encryption zones
>   ...
>
>Daemons:
>   balancer run a cluster balancing utility
>   datanode run a DFS datanode
>   namenode run the DFS name node
>...
>---snip---
>
>   We do something similar in Apache Yetus and it shouldn't be too hard to do 
> in Apache Hadoop. We couldn't read SUPPORTDAEMONIZATION to place things, but 
> as long as people put their new commands in the correct section in 
> hadoop_usage, it should work.
>
>


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: [VOTE] Merge HADOOP-13341

2016-09-09 Thread Anu Engineer
+1, thanks for the effort. It brings a world of consistency to the hadoop 
vars; and as usual, reading your bash code was very educational.

I had a minor suggestion though. Since we have classified the _OPTS into client 
and daemon opts, for new people it is hard to know which of these subcommands 
is a daemon vs. a client command.  Maybe we can add a special char in the help 
message to indicate which are daemons, or just document it? The only way I know 
right now is to look at the appropriate script and see if 
HADOOP_SUBCMD_SUPPORTDAEMONIZATION is set to true.

On 9/7/16, 6:44 AM, "Allen Wittenauer"  wrote:

>
>   I’d like to call for a vote to run for 5 days (ending  Mon 12, 2016 at 
> 7AM PT) to merge the HADOOP-13341 feature branch into trunk. This branch was 
> developed exclusively by me.  As usual with large shell script changes, it's 
> been broken up into several smaller commits to make it easier to read.  The 
> core of the functionality is almost entirely in hadoop-functions.sh with the 
> majority of the rest of the new additions either being documentation or test 
> code. In addition, large swaths of code is removed from the hadoop, hdfs, 
> mapred, and yarn executables.
>
>   Here's a quick summary:
>
>* makes the rules around _OPTS consistent across all the projects
>* makes it possible to provide custom _OPTS for every hadoop, hdfs, mapred, 
>and yarn subcommand
>* with the exception of deprecations, removes all of the custom daemon _OPTS 
>handling sprinkled around the hadoop, hdfs, mapred, and yarn subcommands
>* removes the custom handling of HADOOP_CLIENT_OPTS and makes it 
>consistent for non-daemon subcommands
>* makes the _USER blocker consistent with _OPTS as well as providing better 
>documentation around this feature's existence.  Note that this is an 
>incompatible change against -alpha1.
>* by consolidating all of this code, makes it possible to finally fix a good 
>chunk of the "directory name containing spaces blows up the bash code" 
>problems that have been around since the beginning of the project
>
>   Thanks!
>
>
>-
>To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>



[jira] [Created] (HADOOP-13352) Make X-FRAME-OPTIONS configurable in HttpServer2

2016-07-07 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-13352:
-

 Summary: Make X-FRAME-OPTIONS configurable in HttpServer2
 Key: HADOOP-13352
 URL: https://issues.apache.org/jira/browse/HADOOP-13352
 Project: Hadoop Common
  Issue Type: Bug
  Components: net, security
Reporter: Anu Engineer
Assignee: Anu Engineer
 Fix For: 2.9.0


In HADOOP-12964 we introduced support for X-FRAME-OPTIONS in HttpServer2. This 
JIRA makes it configurable.
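
A minimal sketch of the idea, assuming a standard servlet Filter wired into HttpServer2 (the class name and init parameter here are hypothetical, not the committed patch):

{code:title=XFrameOptionsFilter.java (sketch)}
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

public class XFrameOptionsFilter implements Filter {
  private String value;

  @Override
  public void init(FilterConfig conf) {
    // Illustrative init parameter; the real patch would read a Hadoop config key.
    String configured = conf.getInitParameter("xframe-options-value");
    value = (configured != null) ? configured : "SAMEORIGIN";
  }

  @Override
  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    // Set the header on every response so browsers refuse to frame the web UI.
    ((HttpServletResponse) res).setHeader("X-FRAME-OPTIONS", value);
    chain.doFilter(req, res);
  }

  @Override
  public void destroy() {
  }
}
{code}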



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Increased use of feature branches

2016-06-10 Thread Anu Engineer
I actively work on two branches (Diskbalancer and ozone) and I agree with most 
of what Sangjin said. 
There is an overhead in working with branches; there are both technical costs 
and administrative issues 
which discourage developers from using branches.

I think the biggest issue with branch based development is the fact that other 
developers do not use a branch.
If a small feature appears as a series of commits to “datanode.java”, the 
branch based developer ends up rebasing, 
and paying the price of rebasing many times. If everyone followed a model of 
branch + pull request, other branches 
would not have to deal with continuous rebasing onto trunk commits. If we are 
moving to branch based 
development, we should probably move to that model for most development to 
avoid this tax on the people who 
actually end up working in the branches.

I do have a question in my mind though: what is being proposed is that we move 
active development to branches 
if the feature is small or incomplete, while keeping trunk open for 
check-ins. One of the biggest reasons why we 
check in to trunk and not to branch-2 is that the change will 
break backward compatibility. So do we 
have an expectation of backward compatibility through the 3.0-alpha series? (I 
personally vote no, since 3.0 is experimental 
at this stage.) If we decide to support some sort of backward compatibility, then 
willy-nilly committing to trunk 
while still maintaining the expectation that we can release alphas from 3.0 does not 
look possible.

And then comes the question: once 3.0 becomes official, where do we check in a 
change if it would break something? 
This will lead us back to trunk being the unstable branch and 3.0 being the new 
“branch-2”.

One more point: if we are always moving to use a branch, then we are looking 
at something similar to a git + pull 
request model. If that is so, would it make sense to modify the rules to make 
these branches easier to merge?
Say, for example, if all commits in a branch have followed review and check-in 
policy just like trunk, and commits 
have been made only after a sign-off from a committer, would it be possible to 
merge with a 3-day voting period 
instead of 7, or treat it just like today’s commit to trunk but with 2 people 
signing off? 

What I am suggesting is reducing the administrative overhead of using a branch 
to encourage the use of branching.  
Right now it feels like Apache’s process encourages committing directly to 
trunk rather than to a branch.

Thanks
Anu


On 6/10/16, 10:50 AM, "sjl...@gmail.com on behalf of Sangjin Lee" 
 wrote:

>Having worked on a major feature in a feature branch, I have some thoughts
>and observations on feature branch development.
>
>IMO feature branch development v. direct commits to trunk in piecemeal is
>really a choice of *granularity*. Do we want a series of fine-grained state
>changes on trunk or fewer coarse-grained chunks of commits on trunk?
>
>This makes me favor a branch-based development model for any "decent-sized"
>features (we'll need to define "decent-sized" of course). Once you have
>coarse-grained changes, it's easier to reason about what made what release
>and in what state. As importantly, it makes it easier to back out a
>complete feature fairly easily if that becomes necessary. My totally
>unscientific suggestion may be that if a feature takes more than a dozen commits
>and longer than a month, we should probably have a bias towards a feature
>branch.
>
>Branch-based development also makes you go faster if your feature is
>larger. I wouldn't do it the other way for timeline service v.2 for example.
>
>That said, feature branches don't come for free. Now the onus is on the
>feature developer to constantly rebase with the trunk to keep it reasonably
>integrated with the trunk. More logistics is involved for the feature
>developer. Another big question is, when a feature branch gets big and it's
>time to merge, would it get as scrutinized as a series of individual
>commits? Since the size of merge can be big, you kind of have to rely on
>those feature committers and those who help them.
>
>In terms of integrating/stabilizing, I don't think branch development
>necessarily makes it harder. It is again granularity. In case of direct
>commits on trunk, you do a lot more fine-grained integrations. In case of
>branch development, you do far fewer coarse-grained integrations via
>rebasing. If more people are doing branch-based development, it makes
>rebasing easier to manage too.
>
>Going back to the related topic of where to release (trunk v. branch-X), I
>think that is more of a proxy of the real question of "how do we maintain
>quality and stability of the trunk?". Even if we release from the trunk, if
>our bar for merging to trunk is low, the quality will not improve
>automatically. So I think we ought to tackle the quality question first.
>
>My 2 cents.
>
>
>On Fri, Jun 10, 2016 at 8:57 AM, Zhe Zhang 

Re: Pre-commit Docker image build failures on H2?

2016-06-08 Thread Anu Engineer
Sorry, I did not know about the network. Is it possible to have Java pre-installed 
in the docker image?
 
Are we downloading only Java from the net? Because all of the cases are Oracle 
Java install failures with error code 100.

If it were a case of network failure, the coincidence would be a bit too much.

Thanks
Anu


On 6/8/16, 9:58 AM, "Allen Wittenauer" <allenwittena...@yahoo.com.INVALID> 
wrote:

>
>   You guys know that the build machines in the Yahoo data center 
> temporarily lose network all the time, right?  It's been happening for months 
> now...
>
>
>
>> On Jun 8, 2016, at 9:53 AM, Anu Engineer <aengin...@hortonworks.com> wrote:
>> 
>> Another instance of the same failure.
>> 
>> https://builds.apache.org/job/PreCommit-HADOOP-Build/9690/console
>> 
>> I am going to open a JIRA so that we can track this issue there. This is on 
>> H1 so I don’t think it is machine specific.
>> 
>> Thanks
>> Anu
>> 
>> 
>> On 6/7/16, 5:30 PM, "Anu Engineer" <aengin...@hortonworks.com> wrote:
>> 
>>> Hi Chris,
>>> Thanks for bringing this up. I just ran into the same issue.
>>> 
>>> https://builds.apache.org/job/PreCommit-HDFS-Build/15700/console
>>> 
>>> But in my case it seems like a different host.  “Building remotely on H0”.
>>> 
>>> Thanks
>>> Anu
>>> 
>>> 
>>> On 6/6/16, 3:44 PM, "Chris Nauroth" <cnaur...@hortonworks.com> wrote:
>>> 
>>>> I'm curious if anyone has noticed issues with pre-commit failing during 
>>>> the Docker image build on Jenkins node H2.  Here are a few examples.
>>>> 
>>>> https://builds.apache.org/job/PreCommit-HADOOP-Build/9661/console
>>>> 
>>>> https://builds.apache.org/job/PreCommit-HADOOP-Build/9662/console
>>>> 
>>>> https://builds.apache.org/job/PreCommit-HADOOP-Build/9670/console
>>>> 
>>>> These are all test runs for the same patch, but the patch just removes 5 
>>>> lines of Java code, so I don't expect the particular patch could cause a 
>>>> failure like this.  I noticed that they all ran on H2.  It seems to be a 
>>>> problem installing oracle-java8-installer:
>>>> 
>>>> 
>>>> WARNING: The following packages cannot be authenticated!
>>>> oracle-java8-installer
>>>> E: There are problems and -y was used without --force-yes
>>>> The command '/bin/sh -c apt-get -q install --no-install-recommends -y 
>>>> oracle-java8-installer' returned a non-zero code: 100
>>>> 
>>>> --Chris Nauroth
>>> 
>> 
>
>



Re: Pre-commit Docker image build failures on H2?

2016-06-08 Thread Anu Engineer
Another instance of the same failure.

https://builds.apache.org/job/PreCommit-HADOOP-Build/9690/console

I am going to open a JIRA so that we can track this issue there. This is on H1 
so I don’t think it is machine specific.

Thanks
Anu


On 6/7/16, 5:30 PM, "Anu Engineer" <aengin...@hortonworks.com> wrote:

>Hi Chris,
>Thanks for bringing this up. I just ran into the same issue.
>
>https://builds.apache.org/job/PreCommit-HDFS-Build/15700/console
>
>But in my case it seems like a different host.  “Building remotely on H0”.
>
>Thanks
>Anu
>
>
>On 6/6/16, 3:44 PM, "Chris Nauroth" <cnaur...@hortonworks.com> wrote:
>
>>I'm curious if anyone has noticed issues with pre-commit failing during the 
>>Docker image build on Jenkins node H2.  Here are a few examples.
>>
>>https://builds.apache.org/job/PreCommit-HADOOP-Build/9661/console
>>
>>https://builds.apache.org/job/PreCommit-HADOOP-Build/9662/console
>>
>>https://builds.apache.org/job/PreCommit-HADOOP-Build/9670/console
>>
>>These are all test runs for the same patch, but the patch just removes 5 
>>lines of Java code, so I don't expect the particular patch could cause a 
>>failure like this.  I noticed that they all ran on H2.  It seems to be a 
>>problem installing oracle-java8-installer:
>>
>>
>>WARNING: The following packages cannot be authenticated!
>>  oracle-java8-installer
>>E: There are problems and -y was used without --force-yes
>>The command '/bin/sh -c apt-get -q install --no-install-recommends -y 
>>oracle-java8-installer' returned a non-zero code: 100
>>
>>--Chris Nauroth
>



Re: New Contributor

2016-05-05 Thread Anu Engineer
Hi Vikram,

Welcome. The following query looks for JIRAs with the newbie tag; these are 
usually easier JIRAs to look at and understand.
The query restricts the search to HDFS, YARN, HADOOP COMMON and 
HBASE. Feel free to look at other sub-projects too if you like.

https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20HADOOP%2C%20YARN%2C%20HBASE)%20AND%20text%20~%20newbie


Thanks
Anu





On 5/5/16, 3:47 AM, "Vikram Singh"  wrote:

>Thanks Chris :-)
>
>I am done with the code setup. The code base is huge and it's difficult to
>directly jump into and understand the flow.
>I am trying to find some bugs in JIRA which I can work on to understand the
>code and development process.
>Could someone please help me to find some good starter issues on which I
>can work and see the code working.
>
>regards
>Vikram
>
>On Tue, May 3, 2016 at 10:16 PM, Chris Nauroth 
>wrote:
>
>> Hello Vikram,
>>
>> Thank you for deciding to contribute, and welcome aboard!  This wiki page
>> has the information you need on getting started.
>>
>> https://wiki.apache.org/hadoop/HowToContribute
>>
>>
>> --Chris Nauroth
>>
>>
>>
>>
>> On 5/3/16, 3:42 AM, "Vikram Singh"  wrote:
>>
>> >Hi,
>> >
>> >I would like to contribute in Hadoop development.
>> >
>> >Looking for help to get started.
>> >
>> >regards
>> >Vikram
>>
>>


[jira] [Created] (HADOOP-13102) Update GroupsMapping documentation to reflect the new changes

2016-05-05 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-13102:
-

 Summary: Update GroupsMapping documentation to reflect the new 
changes
 Key: HADOOP-13102
 URL: https://issues.apache.org/jira/browse/HADOOP-13102
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.8.0
Reporter: Anu Engineer
 Fix For: 2.8.0


We need to update this section in GroupsMapping.md:
{noformat}
Line 84:

The implementation does not attempt to resolve group hierarchies. Therefore, a 
user must be an explicit member of a group object
in order to be considered a member.
{noformat} 

With the changes in HADOOP-12291 this is no longer true, since we will have the 
ability to walk the group hierarchies.

We should also modify this line: 
{noformat}
Line :  81
It is possible to set a maximum time limit when searching and awaiting a result.
Set `hadoop.security.group.mapping.ldap.directory.search.timeout` to 0 if 
infinite wait period is desired. Default is 10,000 milliseconds (10 seconds).
{noformat}

we might want to document how the new setting affects the timeout,

and also add the new setting to this doc:
{noformat}
 hadoop.security.group.mapping.ldap.search.group.hierarchy.levels
{noformat}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Design docs

2016-04-14 Thread Anu Engineer
An overview of HDFS is in this paper: 
http://pages.cs.wisc.edu/~akella/CS838/F15/838-CloudPapers/hdfs.pdf . It is a bit 
dated but still the best reference for HDFS.

If you are looking for a book that covers the breadth of the Hadoop ecosystem, I 
would recommend Hadoop: The Definitive Guide by White.

If you are looking for something deeper, many JIRAs have design documents 
attached to them, and those usually offer a very deep dive into that specific area. 

And the caveat: code changes too frequently for any of the above-mentioned 
resources to be absolutely correct, so you might want to read the source too.

Thanks
Anu




On 4/14/16, 9:12 AM, "gor joseph"  wrote:

>Good Morning , 
>
>
>I have been reading the project docs and so far I have read mostly user docs, 
>nothing to help a developer understand the architecture of the project.
>
>Any advice? 
>
>thank you 
>
>
>
>
>
>
>Sincerely ,
>Joseph.
>LinkedIn : https://fr.linkedin.com/in/josephgor
>Mobile : +33 630733572
>Skype :gor.jos...@outlook.com
>E-mail :gor.jos...@outlook.com


Re: Introduction

2015-12-01 Thread Anu Engineer
Hi Luis, 

Welcome to Hadoop. We have a very nascent process for forking and for pull 
requests using GitHub. It is still evolving, and the process is slightly 
different from the standard GitHub flow.

The easiest way to contribute your work is to follow the process outlined in 
this document.

https://wiki.apache.org/hadoop/HowToContribute


If you would really like to contribute using the GitHub flow, here is the process.

https://wiki.apache.org/hadoop/GithubIntegration


Thanks
Anu




On 12/1/15, 2:50 PM, "Lluis Martinez"  wrote:

>Hi devs
>
>My name is Lluis, I recently started the Coursera Hadoop course and thought
>it would be a good idea to collaborate and see the inner workings of this
>platform.
>I've been doing Java development the last 10 years, I consider myself a
>backend more than frontend guy. I'm familiar with JIRA, Maven, Git and
>Eclipse.
>Doubt: when reading the documentation I've seen no mention of GitHub and
>pull requests. Is it ok to fork the repository and issue pull requests?
>
>Best regards


Re: Need for force-push on feature branches

2015-11-10 Thread Anu Engineer
I ran into the same issue and filed an INFRA jira too. 

https://issues.apache.org/jira/browse/INFRA-10720


So +1 for having the git control back

Thanks
Anu





On 11/10/15, 2:45 PM, "Steve Loughran"  wrote:

>
>> On 10 Nov 2015, at 22:07, Karthik Kambatla  wrote:
>> 
>> Hi folks,
>> 
>> Recently, Infra disabled force-pushes (and non-fast-forward pushes) to all
>> branches to avoid accidental overwrites or deletions.
>> 
>> I propose we reach out to Infra and ask for an exemption since our workflow
>> for feature branches involves deletions and force-pushes.
>> 
>
>I asked them for this exemption for all branches called feature/* earlier 
>today; its consistent with the git flow branch naming. 
>
>if all feature branches go in under there, people working on them are free to 
>rebase as they choose.
>
>> We should likely wait for a day or so to hear any concerns against this
>> request. Also, can someone volunteer following up on this? I am going away
>> on vacation shortly, and will have limited access to internet/email.
>> 
>
>
>


Re: [VOTE] Release Apache Hadoop 2.6.2

2015-10-26 Thread Anu Engineer
+1 ( Non-binding)

- Downloaded 2.6.1 and created a cluster with a namenode and a bunch of data nodes.
- Verified that the rolling upgrade and rollback options work correctly in moving 
from 2.6.1 to 2.6.2

—Anu




On 10/22/15, 2:14 PM, "sjl...@gmail.com on behalf of Sangjin Lee" 
 wrote:

>Hi all,
>
>I have created a release candidate (RC0) for Hadoop 2.6.2.
>
>The RC is available at: http://people.apache.org/~sjlee/hadoop-2.6.2-RC0/
>
>The RC tag in git is: release-2.6.2-RC0
>
>The list of JIRAs committed for 2.6.2:
>https://issues.apache.org/jira/browse/YARN-4101?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20fixVersion%20%3D%202.6.2
>
>The maven artifacts are staged at
>https://repository.apache.org/content/repositories/orgapachehadoop-1022/
>
>Please try out the release candidate and vote. The vote will run for 5 days.
>
>Thanks,
>Sangjin


[jira] [Created] (HADOOP-12325) RPC Metrics : Add the ability to track and log slow RPCs

2015-08-14 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-12325:
-

 Summary: RPC Metrics : Add the ability to track and log slow RPCs
 Key: HADOOP-12325
 URL: https://issues.apache.org/jira/browse/HADOOP-12325
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc, metrics
Affects Versions: 2.7.1
Reporter: Anu Engineer
Assignee: Anu Engineer


This JIRA proposes to add a counter called RpcSlowCalls and also a configuration 
setting that allows users to log really slow RPCs. Slow RPCs are RPCs that fall 
at the 99th percentile. This is useful for troubleshooting why certain services 
like the name node freeze under heavy load.
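
A minimal sketch of the thresholded half of the idea (all names here are hypothetical; the real change would hang off the existing RPC metrics and would derive "slow" from the measured 99th percentile rather than a fixed number):

{code:title=SlowRpcLogger.java (sketch)}
// Illustrative only: counts and reports RPCs whose processing time
// exceeds a configured threshold.
public class SlowRpcLogger {
  private final long thresholdMs;
  private long slowCallCount;  // stands in for the proposed RpcSlowCalls counter

  public SlowRpcLogger(long thresholdMs) {
    this.thresholdMs = thresholdMs;
  }

  public void record(String method, long processingTimeMs) {
    if (processingTimeMs > thresholdMs) {
      slowCallCount++;
      System.err.println("Slow RPC: " + method + " took " + processingTimeMs + " ms");
    }
  }

  public long getSlowCallCount() {
    return slowCallCount;
  }
}
{code}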




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12322) typos in rpcmetrics.java

2015-08-13 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-12322:
-

 Summary: typos in rpcmetrics.java
 Key: HADOOP-12322
 URL: https://issues.apache.org/jira/browse/HADOOP-12322
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc
Affects Versions: 2.7.1
Reporter: Anu Engineer
Assignee: Anu Engineer
Priority: Trivial


typos in RpcMetrics.java

Processsing -- Processing
sucesses -  successes
JobTrackerInstrumenation - JobTrackerInstrumentation

These are all part of the description of the metric or in comments, so they 
should have no impact on backward compatibility.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11957) if an IOException error is thrown in DomainSocket.close we go into an infinite loop.

2015-05-11 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-11957:
-

 Summary: if an IOException error is thrown in DomainSocket.close 
we go into an infinite loop.
 Key: HADOOP-11957
 URL: https://issues.apache.org/jira/browse/HADOOP-11957
 Project: Hadoop Common
  Issue Type: Bug
  Components: net
Affects Versions: 2.7.1
Reporter: Anu Engineer
Assignee: Anu Engineer


if an IOException error is thrown in DomainSocket.close we go into an infinite 
loop.

Issue: if the shutdown0(fd) call throws an IOException, we break out of the 
shutdown call but will continue to loop in the while loop infinitely, since we 
have no way of decrementing the counter. Please scroll down to the comment 
marked with BUG BUG to see where the issue is.

{code:title=DomainSocket.java}
  @Override
  public void close() throws IOException {
// Set the closed bit on this DomainSocket
int count = 0;
try {
  count = refCount.setClosed();
} catch (ClosedChannelException e) {
  // Someone else already closed the DomainSocket.
  return;
}
// Wait for all references to go away
boolean didShutdown = false;
boolean interrupted = false;
    while (count > 0) {
  if (!didShutdown) {
try {
  // Calling shutdown on the socket will interrupt blocking system
  // calls like accept, write, and read that are going on in a
  // different thread.
  shutdown0(fd);
} catch (IOException e) {
          LOG.error("shutdown error: ", e);
}
didShutdown = true; 

// *BUG BUG* -- Here the code will never exit the loop
// if the count is greater then 0. we need to break out
// of the while loop in case of IOException Error

  }
  try {
Thread.sleep(10);
  } catch (InterruptedException e) {
interrupted = true;
  }
  count = refCount.getReferenceCount();
}

// At this point, nobody has a reference to the file descriptor, 
// and nobody will be able to get one in the future either.
// We now call close(2) on the file descriptor.
// After this point, the file descriptor number will be reused by 
// something else.  Although this DomainSocket object continues to hold 
// the old file descriptor number (it's a final field), we never use it 
// again because this DomainSocket is closed.
close0(fd);
if (interrupted) {
  Thread.currentThread().interrupt();
}
  }
{code}
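
Following the BUG BUG comment above, a minimal sketch of the suggested control flow (a fragment of close(), not necessarily the committed fix) is to stop waiting once shutdown0 has failed, instead of sleeping forever on a reference count that can no longer drop:

{code:title=close() wait loop (sketch)}
while (count > 0) {
  if (!didShutdown) {
    try {
      shutdown0(fd);
    } catch (IOException e) {
      LOG.error("shutdown error: ", e);
      break;  // per the report: give up waiting rather than spin forever
    }
    didShutdown = true;
  }
  try {
    Thread.sleep(10);
  } catch (InterruptedException e) {
    interrupted = true;
  }
  count = refCount.getReferenceCount();
}
{code}

Whether it is then safe to fall through to close0(fd) while references may still exist is a separate question any real fix would have to answer.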



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11861) test-patch script always fails when option --build-native=false is specified

2015-04-21 Thread Anu Engineer (JIRA)
Anu Engineer created HADOOP-11861:
-

 Summary: test-patch script always fails when option  
--build-native=false is specified
 Key: HADOOP-11861
 URL: https://issues.apache.org/jira/browse/HADOOP-11861
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 2.8.0
Reporter: Anu Engineer


if you specify --build-native=false  like 
{code}
./dev-support/test-patch.sh  --build-native=false 
~/workspaces/patches/hdfs-8211.001.patch 
{code}

mvn fails with an invalid lifecycle error. 

Here are the steps to repro:

1) Run any patch with the --build-native=false option 

2) Open up  /tmp/hadoop-test-patch/tmp-patch/patchJavacWarnings.txt to see 
the failure reason.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HADOOP-11832) spnego authentication logs only log in debug mode so it's difficult to debug auth issues

2015-04-20 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HADOOP-11832.
---
Resolution: Won't Fix

 spnego authentication logs only log in debug mode so it's difficult to debug 
 auth issues
 --

 Key: HADOOP-11832
 URL: https://issues.apache.org/jira/browse/HADOOP-11832
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 2.5.2
Reporter: Anu Engineer
Assignee: Anu Engineer
 Attachments: hadoop-11832.001.patch


 The following logs should be at info level so that auth failures can be 
 debugged more easily.
 {code}
 2015-03-18 06:49:40,397 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call 
 filter org.apache.hadoop.hdfs.web.AuthFilter
 2015-03-18 06:49:40,397 DEBUG server.AuthenticationFilter 
 (AuthenticationFilter.java:doFilter(505)) - Request 
 [http://os-hdp-2-2-r6-1426637581-sec-falcon-1-3.novalocal:50070/webhdfs/v1/?op=GETDELEGATIONTO
 KENuser.name=hrt_qa] triggering authentication
 2015-03-18 06:49:40,397 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - 
 RESPONSE /webhdfs/v1/  401
 2015-03-18 06:49:40,549 DEBUG BlockStateChange 
 (BlockManager.java:computeReplicationWorkForBlocks(1499)) - BLOCK* 
 neededReplications = 1357 pendingReplications = 0
 2015-03-18 06:49:40,634 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - EOF
 2015-03-18 06:49:40,639 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - REQUEST 
 /webhdfs/v1/ on org.mortbay.jetty.HttpConnection@33c174b5
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - 
 sessionManager=org.mortbay.jetty.servlet.HashSessionManager@a072d8c
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - 
 session=null
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - 
 servlet=com.sun.jersey.spi.container.servlet.ServletContainer-1953517520
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - 
 chain=NoCacheFilter-NoCacheFilter-safety-org.apache.hadoop.hdfs.web.AuthFilter-com.sun.jersey.spi.container.servlet.ServletContainer-1953517520
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - servlet 
 holder=com.sun.jersey.spi.container.servlet.ServletContainer-1953517520
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call 
 filter NoCacheFilter
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call 
 filter NoCacheFilter
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call 
 filter safety
 2015-03-18 06:49:40,640 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call 
 filter org.apache.hadoop.hdfs.web.AuthFilter
 2015-03-18 06:49:40,640 DEBUG server.AuthenticationFilter 
 (AuthenticationFilter.java:doFilter(505)) - Request 
 [http://os-hdp-2-2-r6-1426637581-sec-falcon-1-3.novalocal:50070/webhdfs/v1/?op=GETDELEGATIONTOKENuser.name=hrt_qa]
  triggering authentication
 2015-03-18 06:49:40,642 DEBUG server.AuthenticationFilter 
 (AuthenticationFilter.java:doFilter(517)) - Request 
 [http://os-hdp-2-2-r6-1426637581-sec-falcon-1-3.novalocal:50070/webhdfs/v1/?op=GETDELEGATIONTOKENuser.name=hrt_qa]
  user [hrt_qa] authenticated
 2015-03-18 06:49:40,642 DEBUG mortbay.log (Slf4jLog.java:debug(40)) - call 
 servlet com.sun.jersey.spi.container.servlet.ServletContainer-1953517520
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)