Re: Official Apache Slack Channel for Hadoop projects
Thanks Wei-Chiu. Just joined both the hdfs and yarn channels. Yes, there is a yarn channel. There are only 3 members in the yarn channel.

Best,
Yufei
`This is not a contribution`

On Fri, Oct 11, 2019 at 4:35 PM Wei-Chiu Chuang wrote:
> Hi Hadoop devs,
>
> In case you don't know, there is an official ASF Slack, and there's a #hdfs
> channel in it. This is the Slack workspace managed by Apache Infra.
>
> Please see this wiki to get an invite:
> https://cwiki.apache.org/confluence/display/INFRA/Slack+Guest+Invites
> or DM me to get an invite.
>
> Once you get access to the ASF workspace, search for the #hdfs channel. There
> are also #ozone, #hadoop, #submarine-dev, and #submarine-user channels. I
> don't see a #yarn channel, but I can create one (not sure who is eligible
> for creating channels: PMC, committers, or anyone?)
>
> We will not use the Slack channel to vote on project decisions, but it
> might be an easier way to find me. Right now the channels are quite dry.
> Let's see if we can revive them.
>
> Weichiu
Re: [DISCUSS] A unified and open Hadoop community sync up schedule?
+1 for this idea. Thanks Wangda for bringing this up. Some comments to share:
- The agenda needs to be posted ahead of the meeting, and any interested party is welcome to contribute topics.
- We should encourage more people to attend. That's the whole point of the meeting.
- Hopefully, this can mitigate the situation where some patches wait for review forever, which turns away new contributors.
- 30m per session sounds a little bit short; we can try it out and see if an extension is needed.

Best,
Yufei
`This is not a contribution`

On Fri, Jun 7, 2019 at 4:39 PM Wangda Tan wrote:
> Hi Hadoop-devs,
>
> Previously we had a regular YARN community sync up (1 hr, biweekly, but not
> open to the public). Recently, because of changes in our schedules, fewer
> folks have shown up in the sync up over the last several months.
>
> I saw the K8s community did a pretty good job running their SIG meetings:
> there are regular meetings for different topics, notes, agendas, etc. Such as
> https://docs.google.com/document/d/13mwye7nvrmV11q9_Eg77z-1w3X7Q1GTbslpml4J7F3A/edit
>
> For the Hadoop community, there are few such regular meetings open to the
> public, except for the Ozone project and offline meetups or Birds-of-a-Feather
> sessions at Hadoop/DataWorks Summit. Recently a few folks joined DataWorks
> Summit at Washington DC and Barcelona, and lots (50+) of folks joined the
> Ozone/Hadoop/YARN BoFs, asked (good) questions, and discussed roadmaps. I
> think it is important to open such conversations to the public and let more
> folks/companies join.
>
> A small group of community members discussed and wrote a short proposal
> about the form, time and topics of the community sync up; thanks to
> everybody who contributed to the proposal! Please feel free to add your
> thoughts to the proposal Google doc:
> https://docs.google.com/document/d/1GfNpYKhNUERAEH7m3yx6OfleoF3MqoQk3nJ7xqHD9nY/edit#
>
> Especially for the following parts:
> - If you are interested in running any of the community sync-ups, please put
> your name in the table inside the proposal. We need more volunteers to help
> run the sync-ups in different timezones.
> - Please add suggestions on the time, frequency and themes, and feel free to
> share your thoughts on whether we should do sync ups for other topics not
> covered by the proposal.
>
> Thanks,
> Wangda Tan
Re: [VOTE] Release Apache Hadoop 3.0.2 (RC0)
Thanks Lei for working on this!

+1 (non-binding)
- Downloaded the binary tarball and verified the checksum.
- Started a pseudo cluster inside one docker container.
- Ran the Resource Manager with the Fair Scheduler.
- Verified distributed shell.
- Verified the mapreduce pi job.
- Sanity checked the RM WebUI.

Best,
Yufei

On Fri, Apr 6, 2018 at 11:16 AM, Lei Xu wrote:
> Hi, All
>
> I've created release candidate RC-0 for Apache Hadoop 3.0.2.
>
> Please note: this is an amendment to the Apache Hadoop 3.0.1 release to
> fix the shaded jars in the Apache maven repository. The codebase of the
> 3.0.2 release is the same as 3.0.1. New bug fixes will be included in
> Apache Hadoop 3.0.3 instead.
>
> The release page is:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0+Release
>
> The new RC is available at: http://home.apache.org/~lei/hadoop-3.0.2-RC0/
>
> The git tag is release-3.0.2-RC0, and the latest commit is
> 5c141f7c0f24c12cb8704a6ccc1ff8ec991f41ee
>
> The maven artifacts are available at
> https://repository.apache.org/content/repositories/orgapachehadoop-1096/
>
> Please try the release, especially, *verify the maven artifacts*, and vote.
>
> The vote will run 5 days, ending 4/11/2018.
>
> Thanks to everyone who helped spot the error and proposed fixes!
>
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
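For anyone new to verifying an RC, the "verified the checksum" step above amounts to computing a digest of the downloaded tarball locally and comparing it against the published one. A minimal sketch in Python; the file name is a stand-in, not the actual release artifact:

```python
import hashlib

def sha512_of(path: str) -> str:
    """Compute the SHA-512 digest of a file, reading in chunks."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Stand-in for the downloaded tarball (hypothetical file, not the real RC).
with open("release.tar.gz", "wb") as f:
    f.write(b"pretend this is the release tarball")

# In a real verification, `expected` comes from the published .sha512 file;
# here we compute it ourselves just to illustrate the comparison.
expected = sha512_of("release.tar.gz")
if sha512_of("release.tar.gz") == expected:
    print("checksum OK")
```

The same comparison can of course be done with `sha512sum -c` on the command line; the point is only that the digest of the bytes you downloaded must match the digest the release manager published.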
Re: [VOTE] Release Apache Hadoop 3.1.0 (RC1)
Thanks Wangda for working on this!

+1 (non-binding)
- Downloaded the binary tarball and verified the checksum.
- Started a pseudo cluster inside one docker container.
- Ran the Resource Manager with the Fair Scheduler.
- Verified distributed shell.
- Verified the mapreduce pi job.
- Sanity checked the RM WebUI.

Best,
Yufei

On Thu, Mar 29, 2018 at 9:15 PM, Wangda Tan wrote:
> Hi folks,
>
> Thanks to the many who helped with this release since Dec 2017 [1]. We've
> created RC1 for Apache Hadoop 3.1.0. The artifacts are available here:
> http://people.apache.org/~wangda/hadoop-3.1.0-RC1
>
> The RC tag in git is release-3.1.0-RC1. The last git commit SHA is
> 16b70619a24cdcf5d3b0fcf4b58ca77238ccbe6d
>
> The maven artifacts are available via repository.apache.org at
> https://repository.apache.org/content/repositories/orgapachehadoop-1090/
>
> This vote will run 5 days, ending on Apr 3 at 11:59 pm Pacific.
>
> 3.1.0 contains 766 [2] fixed JIRA issues since 3.0.0. Notable additions
> include first-class GPU/FPGA support on YARN, native services, support for
> rich placement constraints in YARN, S3-related enhancements, allowing HDFS
> block replicas to be provided by an external storage system, etc.
>
> For the 3.1.0 RC0 vote discussion, please see [3].
>
> We'd like to use this as a starting release for 3.1.x [1] and, depending on
> how it goes, get it stabilized and potentially release a 3.1.1 in several
> weeks as the stable release.
>
> We have done testing with a pseudo cluster:
> - Ran a distributed job.
> - GPU scheduling/isolation.
> - Placement constraints (intra-application anti-affinity) using
> distributed shell.
>
> My +1 to start.
>
> Best,
> Wangda/Vinod
>
> [1] https://lists.apache.org/thread.html/b3fb3b6da8b6357a68513a6dfd104bc9e19e559aedc5ebedb4ca08c8@%3Cyarn-dev.hadoop.apache.org%3E
> [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.0)
> AND fixVersion not in (3.0.0, 3.0.0-beta1) AND status = Resolved ORDER BY
> fixVersion ASC
> [3] https://lists.apache.org/thread.html/b3a7dc075b7329fd660f65b48237d72d4061f26f83547e41d0983ea6@%3Cyarn-dev.hadoop.apache.org%3E
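For context on the "mapreduce pi job" used as a smoke test in these votes: the Hadoop example job estimates pi by Monte Carlo sampling, scattering random points in the unit square and counting how many land inside the quarter circle. A single-process Python sketch of the same computation, as an illustration of what the job does rather than the actual Hadoop example code:

```python
import random

def estimate_pi(samples: int, seed: int = 42) -> float:
    """Estimate pi by sampling points uniformly in the unit square and
    counting the fraction that fall inside the quarter circle of radius 1."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of quarter circle / area of square = pi/4.
    return 4.0 * inside / samples

print(estimate_pi(100_000))  # converges toward 3.14... as samples grow
```

The MapReduce version simply shards the sampling across map tasks and sums the inside/outside counts in the reduce step, which is why it makes a convenient end-to-end cluster check.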
Re: [VOTE] Release Apache Hadoop 3.0.1 (RC1)
Thanks Eddy!

+1 (non-binding)
- Downloaded hadoop-3.0.1.tar.gz from http://home.apache.org/~lei/hadoop-3.0.1-RC1/
- Started a pseudo cluster inside one docker container.
- Verified distributed shell.
- Verified the mapreduce pi job.
- Sanity checked the RM WebUI.

Best,
Yufei

On Tue, Mar 20, 2018 at 9:32 AM, Eric Payne wrote:
> Thanks for working on this release!
> +1 (binding)
> I tested the following:
> - yarn distributed shell job
> - yarn streaming job
> - inter-queue preemption
> - compared behavior of fair and fifo ordering policies
> - both userlimit_first mode and priority_first mode of intra-queue preemption
>
> Eric Payne
>
> On Saturday, March 17, 2018, 11:11:32 PM CDT, Lei Xu wrote:
>
> Hi, all
>
> I've created release candidate RC-1 for Apache Hadoop 3.0.1.
>
> Apache Hadoop 3.0.1 will be the first bug fix release for the Apache
> Hadoop 3.0 release. It includes 49 bug fixes and security fixes, of which
> 12 are blockers and 17 are critical.
>
> Please note:
> * HDFS-12990. Change default NameNode RPC port back to 8020. This is an
> incompatible change relative to Hadoop 3.0.0. After 3.0.1 is released,
> Apache Hadoop 3.0.0 will be deprecated due to this change.
>
> The release page is:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0+Release
>
> The new RC is available at: http://home.apache.org/~lei/hadoop-3.0.1-RC1/
>
> The git tag is release-3.0.1-RC1, and the latest commit is
> 496dc57cc2e4f4da117f7a8e3840aaeac0c1d2d0
>
> The maven artifacts are available at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1081/
>
> Please try the release and vote; the vote will run for the usual 5
> days, ending on 3/22/2018 6pm PST.
>
> Thanks!
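On the HDFS-12990 note above: the NameNode RPC port moving back to 8020 is what clients see through `fs.defaultFS` in core-site.xml. A hedged illustration; the hostname is a placeholder, not part of any release:

```xml
<!-- core-site.xml fragment; namenode.example.com is a placeholder host. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
```

Clusters that had already standardized on the 3.0.0 default (9820) would need to update this value when moving to 3.0.1, which is exactly the incompatibility the release note calls out.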
Re: [DISCUSS] official docker image(s) for hadoop
It would be very helpful for testing RCs. To vote on an RC, committers and PMC members usually spend lots of time compiling and deploying the RC and running several sanity tests before +1'ing it. A docker image could save the compilation and deployment time, so people could do more tests.

Best,
Yufei

On Wed, Sep 13, 2017 at 11:19 AM, Wangda Tan wrote:
> +1 to adding a Hadoop docker image for easier testing / prototyping, it's
> gonna be super helpful!
>
> Thanks,
> Wangda
>
> On Wed, Sep 13, 2017 at 10:48 AM, Miklos Szegedi <
> miklos.szeg...@cloudera.com> wrote:
>
> > Marton, thank you for working on this. I think official Docker images for
> > Hadoop would be very useful for a lot of reasons. I think that it is
> > better to have a coordinated effort: production-ready base images with
> > dependent images for prototyping. Does anyone else have an opinion about
> > this?
> >
> > Thank you,
> > Miklos
> >
> > On Fri, Sep 8, 2017 at 5:45 AM, Marton, Elek wrote:
> >
> > > TL;DR: I propose to create official Hadoop images and upload them to
> > > dockerhub.
> > >
> > > GOAL/SCOPE: I would like to improve the existing documentation with
> > > easy-to-use docker based recipes to start hadoop clusters with various
> > > configurations.
> > >
> > > The images could also be used to test experimental features. For
> > > example, ozone could be tested easily with this compose file and
> > > configuration:
> > > https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6
> > >
> > > Or the configuration could even be included in the compose file:
> > > https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml
> > >
> > > I would like to create separate example compose files for federation,
> > > ha, metrics usage, etc. to make it easier to try out and understand the
> > > features.
> > >
> > > CONTEXT: There is an existing Jira:
> > > https://issues.apache.org/jira/browse/HADOOP-13397
> > > But it's about a tool to generate production-quality docker images
> > > (multiple types, in a flexible way). If there are no objections, I will
> > > create a separate issue to create simplified docker images for rapid
> > > prototyping and investigating new features, and register the branch
> > > with dockerhub to create the images automatically.
> > >
> > > MY BACKGROUND: I have been working with docker based hadoop/spark
> > > clusters for quite a while and have run them successfully in different
> > > environments (kubernetes, docker-swarm, nomad-based scheduling, etc.).
> > > My work is available from here: https://github.com/flokkr but those
> > > images handle more complex use cases (e.g. instrumenting java processes
> > > with btrace, or reading/reloading configuration from consul).
> > > And IMHO it's better for the official hadoop documentation to suggest
> > > official apache docker images rather than external ones (which could
> > > change).
> > >
> > > Please let me know if you have any comments.
> > >
> > > Marton
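To make the "compose file" idea above concrete, here is a minimal hypothetical docker-compose sketch for a single-node pseudo cluster. The image name, tag, service commands, and port mapping are illustrative assumptions, not the interface of any actual published image (the linked gist and branch show the real ones):

```yaml
# Hypothetical sketch only: "apache/hadoop:3" and its command interface
# are assumptions for illustration, not a real published image.
version: "3"
services:
  namenode:
    image: apache/hadoop:3
    command: ["hdfs", "namenode"]
    ports:
      - "9870:9870"   # NameNode web UI (Hadoop 3 default port)
  datanode:
    image: apache/hadoop:3
    command: ["hdfs", "datanode"]
    depends_on:
      - namenode
```

The appeal for RC testing is that `docker-compose up` replaces the compile-and-deploy cycle, so a voter can go straight to running sanity jobs.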
[jira] [Created] (HDFS-11143) start.sh doesn't return any error message even namenode is not up.
Yufei Gu created HDFS-11143:
-------------------------------
             Summary: start.sh doesn't return any error message even namenode is not up.
                 Key: HDFS-11143
                 URL: https://issues.apache.org/jira/browse/HDFS-11143
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Yufei Gu

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-8926) Update the distcp document for new improvements by using snapshot diff report
Yufei Gu created HDFS-8926:
------------------------------
             Summary: Update the distcp document for new improvements by using snapshot diff report
                 Key: HDFS-8926
                 URL: https://issues.apache.org/jira/browse/HDFS-8926
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: distcp, documentation
            Reporter: Yufei Gu
            Assignee: Yufei Gu
[jira] [Created] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
Yufei Gu created HDFS-8828:
------------------------------
             Summary: Utilize Snapshot diff report to build copy list in distcp
                 Key: HDFS-8828
                 URL: https://issues.apache.org/jira/browse/HDFS-8828
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Yufei Gu
            Assignee: Yufei Gu

Some users reported a huge time cost to build the file copy list in distcp (30 hours with 1.6M files). We can leverage the snapshot diff report to build a copy list containing only the files/dirs that changed between two snapshots (or a snapshot and a normal dir). This speeds up the process in two ways: 1. less copy-list building time; 2. fewer file copy MR jobs.

The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots, or between a snapshot and a normal directory. HDFS-7535 synchronizes deletions and renames, then falls back to the default distcp, so it still relies on the default distcp to build the copy list, which traverses all files under the source dir. This patch will build the copy list based on the snapshot diff report.
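As a rough illustration of the idea in the HDFS-8828 description: with a snapshot diff report, the copy list is derived from the changed paths alone instead of walking the whole source tree. This is a simplified model, not the actual DistCp code; the tuple-based diff format below is an assumption made for the sketch, not the real HDFS SnapshotDiffReport API:

```python
# Simplified model of building a distcp copy list from a snapshot diff
# report instead of traversing every file under the source directory.
# The (operation, path) tuples are an illustrative assumption, not the
# real HDFS SnapshotDiffReport API.

def copy_list_from_diff(diff_report):
    """Keep only paths created or modified between the two snapshots;
    deletions and renames are synchronized separately (per HDFS-7535)."""
    return sorted(path for op, path in diff_report if op in ("CREATE", "MODIFY"))

# Hypothetical diff between snapshots s1 and s2 of /src.
diff = [
    ("CREATE", "/src/new.txt"),
    ("MODIFY", "/src/data/part-0"),
    ("DELETE", "/src/old.txt"),   # deletion: applied on the target, not copied
]
print(copy_list_from_diff(diff))
```

With 1.6M files but only a handful of changes, the work becomes proportional to the size of the diff rather than the size of the tree, which is where the reported 30-hour listing time goes away.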