Re: [VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-01 Thread zac yuan
+1
Thanks a lot, Wangda

Zac Zhou

On Sat, Feb 2, 2019 at 10:51 AM Sunil G wrote:

> +1. Thanks Wangda.
>
> - Sunil
>
> On Sat, Feb 2, 2019 at 3:54 AM Wangda Tan  wrote:
>
> > Hi all,
> >
> > Based on the positive feedback from the discussion thread [1]:
> >
> > This is the vote thread to start a new subproject named "hadoop-submarine",
> > which follows the release process already established for Ozone.
> >
> > The vote runs for the usual 7 days and ends on Feb 8th at 5 PM PDT.
> >
> > Thanks,
> > Wangda Tan
> >
> > [1]
> >
> >
> https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
> >
>


[jira] [Created] (YARN-9273) Flexing a component of YARN service does not work as documented when using relative number

2019-02-01 Thread Masahiro Tanaka (JIRA)
Masahiro Tanaka created YARN-9273:
-

 Summary: Flexing a component of YARN service does not work as 
documented when using relative number
 Key: YARN-9273
 URL: https://issues.apache.org/jira/browse/YARN-9273
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Masahiro Tanaka
Assignee: Masahiro Tanaka


[The documentation|https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-site/yarn-service/QuickStart.html]
says, "Relative changes are also supported for the ${NUMBER_OF_CONTAINERS} in
the flex command, such as +2 or -2." when you want to flex a component of a
YARN service.

I expected {{yarn app -flex sleeper-service -component sleeper +1}} to
increment the number of containers by one, but it actually sets the number of
containers to exactly one.

I believe ApiServiceClient#actionFlex handles the flexing when executing
{{yarn app -flex}}, and it simply uses {{Long.parseLong}} to convert arguments
such as {{+1}}, which does not account for relative numbers.
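
For illustration, a minimal sketch of the distinction (the class and method
names below are hypothetical, not the actual ApiServiceClient code):
{{Long.parseLong("+1")}} and {{Long.parseLong("1")}} both return 1, so the
leading sign that marks a relative change has to be inspected before parsing.

{code:java}
// Hypothetical helper, not the actual YARN implementation.
public final class FlexTargetParser {

  private FlexTargetParser() {
  }

  /**
   * Resolve the desired container count from the raw flex argument.
   *
   * @param rawArg  the value passed on the command line, e.g. "3", "+2" or "-1"
   * @param current the component's current number of containers
   * @return the absolute container count to flex to
   */
  public static long resolve(String rawArg, long current) {
    String trimmed = rawArg.trim();
    long value = Long.parseLong(trimmed);
    if (trimmed.startsWith("+") || trimmed.startsWith("-")) {
      // A leading sign marks a relative change: apply it as a delta.
      return Math.max(0, current + value);
    }
    // A plain number is an absolute target.
    return value;
  }

  public static void main(String[] args) {
    // With 3 running containers, "+1" should mean 4 containers, not 1.
    System.out.println(resolve("+1", 3)); // 4
    System.out.println(resolve("-2", 3)); // 1
    System.out.println(resolve("5", 3));  // 5
  }
}
{code}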



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Re: [VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-01 Thread Sunil G
+1. Thanks Wangda.

- Sunil

On Sat, Feb 2, 2019 at 3:54 AM Wangda Tan  wrote:

> Hi all,
>
> Based on the positive feedback from the discussion thread [1]:
>
> This is the vote thread to start a new subproject named "hadoop-submarine",
> which follows the release process already established for Ozone.
>
> The vote runs for the usual 7 days and ends on Feb 8th at 5 PM PDT.
>
> Thanks,
> Wangda Tan
>
> [1]
>
> https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
>


Re: [VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-01 Thread Naganarasimha Garla
+1

On Sat, 2 Feb 2019, 09:51 Rohith Sharma K S wrote:

> +1
>
> On Sat, Feb 2, 2019, 3:54 AM Wangda Tan  wrote:
>
> > Hi all,
> >
> > Based on the positive feedback from the discussion thread [1]:
> >
> > This is the vote thread to start a new subproject named "hadoop-submarine",
> > which follows the release process already established for Ozone.
> >
> > The vote runs for the usual 7 days and ends on Feb 8th at 5 PM PDT.
> >
> > Thanks,
> > Wangda Tan
> >
> > [1]
> >
> >
> https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
> >
>


Re: [VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-01 Thread Rohith Sharma K S
+1

On Sat, Feb 2, 2019, 3:54 AM Wangda Tan  wrote:

> Hi all,
>
> Based on the positive feedback from the discussion thread [1]:
>
> This is the vote thread to start a new subproject named "hadoop-submarine",
> which follows the release process already established for Ozone.
>
> The vote runs for the usual 7 days and ends on Feb 8th at 5 PM PDT.
>
> Thanks,
> Wangda Tan
>
> [1]
>
> https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
>


Re: [DISCUSS] Moving branch-2 to java 8

2019-02-01 Thread Konstantin Shvachko
Just to make sure we are on the same page, as the subject of this thread is
too generic and confusing.
*The proposal is to move branch-2 Jenkins builds such as precommit to run
tests on openJDK-8.*
We do not want to break Java 7 source compatibility. The sources and
releases will still depend on Java 7.
We don't see the test failures discussed in HADOOP-15711 when we run them
locally with Oracle Java 7.

Thanks,
--Konst

On Fri, Feb 1, 2019 at 12:44 PM Jonathan Hung  wrote:

> Thanks Vinod and Steve, agreed about java7 compile compatibility. At least
> for now, we should be able to maintain java7 source compatibility and run
> tests on java8. There's a test run here:
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86-jhung/46/
> which calls a java8 specific API, installs both openjdk7/openjdk8 in the
> dockerfile, compiles on both versions, and tests on just java8 (via
>
> --multijdkdirs=/usr/lib/jvm/java-7-openjdk-amd64,/usr/lib/jvm/java-8-openjdk-amd64
> and --multijdktests=compile). If we eventually decide it's too much of a
> pain to maintain java7 source compatibility we can do that at a later
> point.
>
> Also based on discussion with others in the community at the contributors
> meetup this past Wednesday, seems we are generally in favor of testing
> against java8. I'll start a vote soon.
>
> Jonathan Hung
>
>
> On Tue, Jan 29, 2019 at 4:11 AM Steve Loughran 
> wrote:
>
> > branch-2 is the JDK 7 branch, but for a long time I (and presumably
> > others) have relied on jenkins to keep us honest by doing that build and
> > test
> >
> > right now, we can't do that any more, due to jdk7 bugs which will never
> be
> > fixed by oracle, or at least, not in a public release.
> >
> > If we can still do the compile in java 7 language and link to java 7 JDK,
> > then that bit of the release is good -then java 8 can be used for that
> test
> >
> > Ultimately, we're going to be forced onto java 8 just because all our
> > dependencies have moved onto it, and some CVE will force us to move.
> >
> > At which point, I think it's time to declare branch-2 dead. It's had a
> > great life, but trying to keep java 7 support alive isn't sustainable.
> Not
> > just in this testing, but
> > cherrypicking patches back gets more and more difficult -branch-3 has
> > moved on in both use of java 8 language, and in the codebase in general.
> >
> > > On 28 Jan 2019, at 20:18, Vinod Kumar Vavilapalli 
> > wrote:
> > >
> > > The community made a decision long time ago that we'd like to keep the
> > compatibility & so tie branch-2 to Java 7, but do Java 8+ only work on
> 3.x.
> > >
> > > I always assumed that most (all?) downstream users build branch-2 on
> JDK
> > 7 only, can anyone confirm? If so, there may be an easier way to address
> > these test issues.
> > >
> > > +Vinod
> > >
> > >> On Jan 28, 2019, at 11:24 AM, Jonathan Hung 
> > wrote:
> > >>
> > >> Hi folks,
> > >>
> > >> Forking a discussion based on HADOOP-15711. To summarize, there are
> > issues
> > >> with branch-2 tests running on java 7 (openjdk) which don't exist on
> > java
> > >> 8. From our testing, the build can pass with openjdk 8.
> > >>
> > >> For branch-3, the work to move the build to use java 8 was done in
> > >> HADOOP-14816 as part of the Dockerfile OS version change. HADOOP-16053
> > was
> > >> filed to backport this OS version change to branch-2 (but without the
> > java
> > >> 7 -> java 8 change). So my proposal is to also make the java 7 ->
> java 8
> > >> version change in branch-2.
> > >>
> > >> As mentioned in HADOOP-15711, the main issue is around source and
> binary
> > >> compatibility. I don't currently have a great answer, but one initial
> > >> thought is to build source/binary against java 7 to ensure
> compatibility
> > >> and run the rest of the build as java 8.
> > >>
> > >> Thoughts?
> > >>
> > >> Jonathan Hung
> > >
> > >
> > > -
> > > To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> > > For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> > >
> >
> >
>


Re: [VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-01 Thread Jonathan Hung
+1. Thanks Wangda.

Jonathan Hung


On Fri, Feb 1, 2019 at 2:25 PM Dinesh Chitlangia <
dchitlan...@hortonworks.com> wrote:

> +1 (non-binding), thanks Wangda for organizing this.
>
> Regards,
> Dinesh
>
>
>
> On 2/1/19, 5:24 PM, "Wangda Tan"  wrote:
>
> Hi all,
>
> Based on the positive feedback from the discussion thread [1]:
>
> This is the vote thread to start a new subproject named "hadoop-submarine",
> which follows the release process already established for Ozone.
>
> The vote runs for the usual 7 days and ends on Feb 8th at 5 PM PDT.
>
> Thanks,
> Wangda Tan
>
> [1]
>
> https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E
>
>
>


[jira] [Created] (YARN-9272) Backport YARN-7738 for refreshing max allocation for multiple resource types

2019-02-01 Thread Jonathan Hung (JIRA)
Jonathan Hung created YARN-9272:
---

 Summary: Backport YARN-7738 for refreshing max allocation for 
multiple resource types
 Key: YARN-9272
 URL: https://issues.apache.org/jira/browse/YARN-9272
 Project: Hadoop YARN
  Issue Type: Sub-task
 Environment: Backport to YARN-8200 feature branch (for branch-2).
Reporter: Jonathan Hung
Assignee: Jonathan Hung






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-9271) Backport YARN-6927 for resource type support in MapReduce

2019-02-01 Thread Jonathan Hung (JIRA)
Jonathan Hung created YARN-9271:
---

 Summary: Backport YARN-6927 for resource type support in MapReduce
 Key: YARN-9271
 URL: https://issues.apache.org/jira/browse/YARN-9271
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jonathan Hung






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[VOTE] Propose to start new Hadoop sub project "submarine"

2019-02-01 Thread Wangda Tan
Hi all,

Based on the positive feedback from the discussion thread [1]:

This is the vote thread to start a new subproject named "hadoop-submarine",
which follows the release process already established for Ozone.

The vote runs for the usual 7 days and ends on Feb 8th at 5 PM PDT.

Thanks,
Wangda Tan

[1]
https://lists.apache.org/thread.html/f864461eb188bd12859d51b0098ec38942c4429aae7e4d001a633d96@%3Cyarn-dev.hadoop.apache.org%3E


Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Eric Yang
If HDFS or YARN breaks compatibility with Submarine, Submarine will have to make
a release to catch up with the latest Hadoop changes.  On the hadoop.apache.org
website, the latest news may then always have Submarine on top, repairing
compatibility with the latest Hadoop.  This may overwhelm any other interesting
news happening in the Hadoop space.  I don't like to see that happen, but it is
unavoidable with an independent release cycle.  Maybe there is a good way to
avoid this with the help of the release managers, ensuring that Hadoop and
Submarine don't break compatibility frequently.

For me to lift my veto, the release managers of the independent release cycles
need to take responsibility for ensuring that version X of Hadoop is tested with
version Y of Submarine.  Release managers will have to do more work to ensure
the defined combination works.  The greater responsibility of release management
comes with its own reward: a seasoned PMC member may be nominated to become an
Apache Member, which will help Submarine enter the Apache Incubator when the
time is right.  Hence, I will withdraw my veto and let Submarine set its own
course.

Good luck Wangda.

Regards,
Eric

From: Wangda Tan 
Date: Friday, February 1, 2019 at 10:52 AM
To: Eric Yang 
Cc: Weiwei Yang , Xun Liu , Hadoop 
Common , "yarn-dev@hadoop.apache.org" 
, Hdfs-dev , 
"mapreduce-...@hadoop.apache.org" 
Subject: Re: [DISCUSS] Making submarine to different release model like Ozone

Thanks everyone for sharing thoughts!

Eric, appreciate your suggestions. But there are many examples to have separate 
releases, like Hive's storage API, OZone, etc. For loosely coupled 
sub-projects, it gonna be great (at least for most of the users) to have 
separate releases so new features can be faster consumed and iterated. From 
above feedbacks from developers and users, I think it is also what people want.

Another concern you mentioned is Submarine is aligned with Hadoop project 
goals. From feedbacks we can see, it attracts companies continue using Hadoop 
to solve their ML/DL requirements, it also created a good feedback loop, many 
issues faced, and some new functionalities added by Submarine went back to 
Hadoop. Such as localization files, directories. GPU topology related 
enhancement, etc.

We will definitely use this sub-project opportunity to fast grow both Submarine 
and Hadoop, try to get fast release cycles for both of the projects. And for 
your suggestion about Apache incubator, we can reconsider it once Submarine 
becomes a more independent project, now it is still too small and too much 
overhead to go through the process, I don't want to stop the fast-growing 
community for months to go through incubator process for now.

I really hope my comment can help you reconsider the veto. :)

Thanks,
Wangda

On Fri, Feb 1, 2019 at 9:39 AM Eric Yang <ey...@hortonworks.com> wrote:
Submarine is an application built for YARN framework, but it does not have 
strong dependency on YARN development.  For this kind of projects, it would be 
best to enter Apache Incubator cycles to create a new community.  Apache 
commons is the only project other than Incubator that has independent release 
cycles.  The collection is large, and the project goal is ambitious.  No one 
really knows which component works with each other in Apache commons.  Hadoop 
is a much more focused project on distributed computing framework and not 
incubation sandbox.  For alignment with Hadoop goals, and we want to prevent 
Hadoop project to be overloaded while allowing good ideas to be carried 
forwarded in Apache incubator.  Put on my Apache Member hat, my vote is -1 to 
allow more independent subproject release cycle in Hadoop project that does not 
align with Hadoop project goals.

Apache incubator process is highly recommended for Submarine: 
https://incubator.apache.org/policy/process.html This allows Submarine to 
develop for older version of Hadoop like Spark works with multiple versions of 
Hadoop.

Regards,
Eric

On 1/31/19, 10:51 PM, "Weiwei Yang" <abvclo...@gmail.com> wrote:

Thanks for proposing this Wangda, my +1 as well.
It is amazing to see the progress made in Submarine last year, the 
community grows fast and quiet collaborative. I can see the reasons to get it 
release faster in its own cycle. And at the same time, the Ozone way works very 
well.

—
Weiwei
On Feb 1, 2019, 10:49 AM +0800, Xun Liu <neliu...@163.com>, wrote:
> +1
>
> Hello everyone,
>
> I am Xun Liu, the head of the machine learning team at Netease Research 
Institute. I quite agree with Wangda.
>
> Our team is very grateful for getting Submarine machine learning engine 
from the community.
> We are heavy users of Submarine.
> Because Submarine fits into the direction of our big data team's hadoop 
technology stack,
> It avoids the needs to increase the manpower investment in learning other 
container scheduling systems.
> The important thing is that we can use a comm

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Wangda Tan
Eric,
Thanks for your reconsideration. We will definitely try our best not to break
compatibility, etc., just as we do for other components!

Really appreciate everybody's support, thoughts, and suggestions shared on this
thread. Given that the discussion has been very positive, I will go ahead and
send a voting thread.

Best,
Wangda

On Fri, Feb 1, 2019 at 2:06 PM Eric Yang  wrote:

> If HDFS or YARN breaks compatibility with Submarine, it will require to
> make release to catch up with the latest Hadoop changes.  On
> hadoop.apache.org website, the latest news may always have Submarine on
> top to repair compatibility with latest of Hadoop.  This may overwhelm any
> interesting news that may happen in Hadoop space.  I don’t like to see that
> happen, but unavoidable with independent release cycle.  Maybe there is a
> good way to avoid this with help of release manager to ensure that
> Hadoop/Submarine don’t break compatibility frequently.
>
>
>
> For me to lift my veto, release managers of independent release cycles
> need to take responsibility to ensure X version of Hadoop is tested with Y
> version of Submarine.  Release managers will have to do more work to ensure
> the defined combination works.  With the greater responsibility of release
> management comes with its own reward.  Seasoned PMC may be nominated to
> become Apache Member, which will help with Submarine to enter Apache
> Incubator when time is right.  Hence, I will withdraw my veto and let
> Submarine set its own course.
>
>
>
> Good luck Wangda.
>
>
>
> Regards,
>
> Eric
>
>
>
> *From: *Wangda Tan 
> *Date: *Friday, February 1, 2019 at 10:52 AM
> *To: *Eric Yang 
> *Cc: *Weiwei Yang , Xun Liu ,
> Hadoop Common , "yarn-dev@hadoop.apache.org"
> , Hdfs-dev , "
> mapreduce-...@hadoop.apache.org" 
> *Subject: *Re: [DISCUSS] Making submarine to different release model like
> Ozone
>
>
>
> Thanks everyone for sharing thoughts!
>
>
>
> Eric, appreciate your suggestions. But there are many examples to have
> separate releases, like Hive's storage API, OZone, etc. For loosely coupled
> sub-projects, it gonna be great (at least for most of the users) to have
> separate releases so new features can be faster consumed and iterated. From
> above feedbacks from developers and users, I think it is also what people
> want.
>
>
>
> Another concern you mentioned is Submarine is aligned with Hadoop project
> goals. From feedbacks we can see, it attracts companies continue using
> Hadoop to solve their ML/DL requirements, it also created a good feedback
> loop, many issues faced, and some new functionalities added by Submarine
> went back to Hadoop. Such as localization files, directories. GPU topology
> related enhancement, etc.
>
>
>
> We will definitely use this sub-project opportunity to fast grow both
> Submarine and Hadoop, try to get fast release cycles for both of the
> projects. And for your suggestion about Apache incubator, we can reconsider
> it once Submarine becomes a more independent project, now it is still too
> small and too much overhead to go through the process, I don't want to stop
> the fast-growing community for months to go through incubator process for
> now.
>
>
>
> I really hope my comment can help you reconsider the veto. :)
>
>
>
> Thanks,
>
> Wangda
>
>
>
> On Fri, Feb 1, 2019 at 9:39 AM Eric Yang  wrote:
>
> Submarine is an application built for YARN framework, but it does not have
> strong dependency on YARN development.  For this kind of projects, it would
> be best to enter Apache Incubator cycles to create a new community.  Apache
> commons is the only project other than Incubator that has independent
> release cycles.  The collection is large, and the project goal is
> ambitious.  No one really knows which component works with each other in
> Apache commons.  Hadoop is a much more focused project on distributed
> computing framework and not incubation sandbox.  For alignment with Hadoop
> goals, and we want to prevent Hadoop project to be overloaded while
> allowing good ideas to be carried forwarded in Apache incubator.  Put on my
> Apache Member hat, my vote is -1 to allow more independent subproject
> release cycle in Hadoop project that does not align with Hadoop project
> goals.
>
> Apache incubator process is highly recommended for Submarine:
> https://incubator.apache.org/policy/process.html This allows Submarine to
> develop for older version of Hadoop like Spark works with multiple versions
> of Hadoop.
>
> Regards,
> Eric
>
> On 1/31/19, 10:51 PM, "Weiwei Yang"  wrote:
>
> Thanks for proposing this Wangda, my +1 as well.
> It is amazing to see the progress made in Submarine last year, the
> community grows fast and quiet collaborative. I can see the reasons to get
> it release faster in its own cycle. And at the same time, the Ozone way
> works very well.
>
> —
> Weiwei
> On Feb 1, 2019, 10:49 AM +0800, Xun Liu , wrote:
> > +1
> >
> > Hello everyone,
> >
> > I am Xun Liu, the head of the

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Elek, Marton
+1.

I like the idea.

For me, submarine/ML-job-execution seems to be a natural extension of
the existing Hadoop/Yarn capabilities.

And I like the proposed project structure / release lifecycle, too. I
think it's better to be more modularized but keep the development in the
same project. IMHO it has worked well with the Ozone releases: we can do
more frequent releases and support multiple versions of core Hadoop, while
the tested new improvements can be moved back into hadoop-common.

Marton

On 1/31/19 7:53 PM, Wangda Tan wrote:
> Hi devs,
> 
> Since we started submarine-related effort last year, we received a lot of
> feedbacks, several companies (such as Netease, China Mobile, etc.)  are
> trying to deploy Submarine to their Hadoop cluster along with big data
> workloads. Linkedin also has big interests to contribute a Submarine TonY (
> https://github.com/linkedin/TonY) runtime to allow users to use the same
> interface.
> 
> From what I can see, there're several issues of putting Submarine under
> yarn-applications directory and have same release cycle with Hadoop:
> 
> 1) We started 3.2.0 release at Sep 2018, but the release is done at Jan
> 2019. Because of non-predictable blockers and security issues, it got
> delayed a lot. We need to iterate submarine fast at this point.
> 
> 2) We also see a lot of requirements to use Submarine on older Hadoop
> releases such as 2.x. Many companies may not upgrade Hadoop to 3.x in a
> short time, but the requirement to run deep learning is urgent to them. We
> should decouple Submarine from Hadoop version.
> 
> And why we wanna to keep it within Hadoop? First, Submarine included some
> innovation parts such as enhancements of user experiences for YARN
> services/containerization support which we can add it back to Hadoop later
> to address common requirements. In addition to that, we have a big overlap
> in the community developing and using it.
> 
> There're several proposals we have went through during Ozone merge to trunk
> discussion:
> https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
> 
> I propose to adopt Ozone model: which is the same master branch, different
> release cycle, and different release branch. It is a great example to show
> agile release we can do (2 Ozone releases after Oct 2018) with less
> overhead to setup CI, projects, etc.
> 
> *Links:*
> - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> - Design doc
> - User doc (3.2.0 release)
> - Blogposts: {Submarine}: Running deep learning workloads on Apache Hadoop
>   (Chinese Translation: Link)
> - Talks: Strata Data Conf NY
>
> Thoughts?
> 
> Thanks,
> Wangda Tan
> 

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Hanisha Koneru
This is a great proposal. +1.

Thanks,
Hanisha









On 2/1/19, 11:04 AM, "Bharat Viswanadham"  wrote:

>Thank You Wangda for driving this discussion.
>+1 for a separate release for submarine.
>Having own release cadence will help iterate the project to grow at a faster 
>pace and also get the new features in hand to the users, and get their 
>feedback quickly.
>
>
>Thanks,
>Bharat
>
>
>
>
>On 2/1/19, 10:54 AM, "Ajay Kumar"  wrote:
>
>+1, Thanks for driving this. With rise of use cases running ML along with 
> traditional applications this will be of great help.
>
>Thanks,
>Ajay   
>
>On 2/1/19, 10:49 AM, "Suma Shivaprasad"  
> wrote:
>
>+1. Thanks for bringing this up Wangda.
>
>Makes sense to have Submarine follow its own release cadence given the 
> good
>momentum/adoption so far. Also, making it run with older versions of 
> Hadoop
>would drive higher adoption.
>
>Suma
>
>On Fri, Feb 1, 2019 at 9:40 AM Eric Yang  wrote:
>
>> Submarine is an application built for YARN framework, but it does 
> not have
>> strong dependency on YARN development.  For this kind of projects, 
> it would
>> be best to enter Apache Incubator cycles to create a new community.  
> Apache
>> commons is the only project other than Incubator that has independent
>> release cycles.  The collection is large, and the project goal is
>> ambitious.  No one really knows which component works with each 
> other in
>> Apache commons.  Hadoop is a much more focused project on distributed
>> computing framework and not incubation sandbox.  For alignment with 
> Hadoop
>> goals, and we want to prevent Hadoop project to be overloaded while
>> allowing good ideas to be carried forwarded in Apache incubator.  
> Put on my
>> Apache Member hat, my vote is -1 to allow more independent subproject
>> release cycle in Hadoop project that does not align with Hadoop 
> project
>> goals.
>>
>> Apache incubator process is highly recommended for Submarine:
>> https://incubator.apache.org/policy/process.html This allows 
> Submarine to
>> develop for older version of Hadoop like Spark works with multiple 
> versions
>> of Hadoop.
>>
>> Regards,
>> Eric
>>
>> On 1/31/19, 10:51 PM, "Weiwei Yang"  wrote:
>>
>> Thanks for proposing this Wangda, my +1 as well.
>> It is amazing to see the progress made in Submarine last year, 
> the
>> community grows fast and quiet collaborative. I can see the reasons 
> to get
>> it release faster in its own cycle. And at the same time, the Ozone 
> way
>> works very well.
>>
>> —
>> Weiwei
>> On Feb 1, 2019, 10:49 AM +0800, Xun Liu , 
> wrote:
>> > +1
>> >
>> > Hello everyone,
>> >
>> > I am Xun Liu, the head of the machine learning team at Netease
>> Research Institute. I quite agree with Wangda.
>> >
>> > Our team is very grateful for getting Submarine machine 
> learning
>> engine from the community.
>> > We are heavy users of Submarine.
>> > Because Submarine fits into the direction of our big data 
> team's
>> hadoop technology stack,
>> > It avoids the needs to increase the manpower investment in 
> learning
>> other container scheduling systems.
>> > The important thing is that we can use a common YARN cluster 
> to run
>> machine learning,
>> > which makes the utilization of server resources more 
> efficient, and
>> reserves a lot of human and material resources in our previous years.
>> >
>> > Our team have finished the test and deployment of the 
> Submarine and
>> will provide the service to our e-commerce department (
>> http://www.kaola.com/) shortly.
>> >
>> > We also plan to provides the Submarine engine in our existing 
> YARN
>> cluster in the next six months.
>> > Because we have a lot of product departments need to use 
> machine
>> learning services,
>> > for example:
>> > 1) Game department (http://game.163.com/) needs AI battle 
> training,
>> > 2) News department (http://www.163.com) needs news 
> recommendation,
>> > 3) Mailbox department (http://www.163.com) requires anti-spam 
> and
>> illegal detection,
>> > 4) Music department (https://music.163.com/) requires music
>> recommendation,
>> > 5) Education department (http://www.youdao.com) requires voice
>> recognition,
>> > 6) Massive Open Online Courses (https://open.163.com/

Re: [DISCUSS] Moving branch-2 to java 8

2019-02-01 Thread Jonathan Hung
Thanks Vinod and Steve, agreed about Java 7 compile compatibility. At least
for now, we should be able to maintain Java 7 source compatibility and run
tests on Java 8. There's a test run here:
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86-jhung/46/
which calls a Java 8-specific API, installs both OpenJDK 7 and OpenJDK 8 in the
Dockerfile, compiles on both versions, and tests on just Java 8 (via
--multijdkdirs=/usr/lib/jvm/java-7-openjdk-amd64,/usr/lib/jvm/java-8-openjdk-amd64
and --multijdktests=compile). If we eventually decide it's too much of a
pain to maintain Java 7 source compatibility, we can revisit that at a later point.
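
To illustrate the kind of breakage the multi-JDK run guards against, here is a
minimal stand-alone example (not taken from the Hadoop code base) that compiles
on JDK 8 but fails on JDK 7, since streams and lambdas only exist from Java 8:

import java.util.Arrays;
import java.util.List;

// Compiles on JDK 8, fails to compile on JDK 7.
public class Java8Only {
  public static void main(String[] args) {
    List<String> branches = Arrays.asList("branch-2", "branch-3.2", "trunk");
    long count = branches.stream()              // java.util.stream: Java 8+
        .filter(b -> b.startsWith("branch"))    // lambda expressions: Java 8+
        .count();
    System.out.println(count + " release branches");
  }
}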

Also, based on discussion with others in the community at the contributors
meetup this past Wednesday, it seems we are generally in favor of testing
against Java 8. I'll start a vote soon.

Jonathan Hung


On Tue, Jan 29, 2019 at 4:11 AM Steve Loughran 
wrote:

> branch-2 is the JDK 7 branch, but for a long time I (and presumably
> others) have relied on jenkins to keep us honest by doing that build and
> test
>
> right now, we can't do that any more, due to jdk7 bugs which will never be
> fixed by oracle, or at least, not in a public release.
>
> If we can still do the compile in java 7 language and link to java 7 JDK,
> then that bit of the release is good -then java 8 can be used for that test
>
> Ultimately, we're going to be forced onto java 8 just because all our
> dependencies have moved onto it, and some CVE will force us to move.
>
> At which point, I think it's time to declare branch-2 dead. It's had a
> great life, but trying to keep java 7 support alive isn't sustainable. Not
> just in this testing, but
> cherrypicking patches back gets more and more difficult -branch-3 has
> moved on in both use of java 8 language, and in the codebase in general.
>
> > On 28 Jan 2019, at 20:18, Vinod Kumar Vavilapalli 
> wrote:
> >
> > The community made a decision long time ago that we'd like to keep the
> compatibility & so tie branch-2 to Java 7, but do Java 8+ only work on 3.x.
> >
> > I always assumed that most (all?) downstream users build branch-2 on JDK
> 7 only, can anyone confirm? If so, there may be an easier way to address
> these test issues.
> >
> > +Vinod
> >
> >> On Jan 28, 2019, at 11:24 AM, Jonathan Hung 
> wrote:
> >>
> >> Hi folks,
> >>
> >> Forking a discussion based on HADOOP-15711. To summarize, there are
> issues
> >> with branch-2 tests running on java 7 (openjdk) which don't exist on
> java
> >> 8. From our testing, the build can pass with openjdk 8.
> >>
> >> For branch-3, the work to move the build to use java 8 was done in
> >> HADOOP-14816 as part of the Dockerfile OS version change. HADOOP-16053
> was
> >> filed to backport this OS version change to branch-2 (but without the
> java
> >> 7 -> java 8 change). So my proposal is to also make the java 7 -> java 8
> >> version change in branch-2.
> >>
> >> As mentioned in HADOOP-15711, the main issue is around source and binary
> >> compatibility. I don't currently have a great answer, but one initial
> >> thought is to build source/binary against java 7 to ensure compatibility
> >> and run the rest of the build as java 8.
> >>
> >> Thoughts?
> >>
> >> Jonathan Hung
> >
> >
> > -
> > To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> >
>
>


Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Xiaoyu Yao
+1, thanks for bringing this up, Wangda. This will help expand the Hadoop
ecosystem by supporting new AI/ML workloads.

Thanks,
Xiaoyu
On 2/1/19, 10:58 AM, "Dinesh Chitlangia"  wrote:

+1 This is a fantastic recommendation given the increasing interest in ML 
across the globe.

Thanks,
Dinesh



On 2/1/19, 1:54 PM, "Ajay Kumar"  wrote:

+1, Thanks for driving this. With rise of use cases running ML along 
with traditional applications this will be of great help.

Thanks,
Ajay   

On 2/1/19, 10:49 AM, "Suma Shivaprasad"  
wrote:

+1. Thanks for bringing this up Wangda.

Makes sense to have Submarine follow its own release cadence given 
the good
momentum/adoption so far. Also, making it run with older versions 
of Hadoop
would drive higher adoption.

Suma

On Fri, Feb 1, 2019 at 9:40 AM Eric Yang  
wrote:

> Submarine is an application built for YARN framework, but it does 
not have
> strong dependency on YARN development.  For this kind of 
projects, it would
> be best to enter Apache Incubator cycles to create a new 
community.  Apache
> commons is the only project other than Incubator that has 
independent
> release cycles.  The collection is large, and the project goal is
> ambitious.  No one really knows which component works with each 
other in
> Apache commons.  Hadoop is a much more focused project on 
distributed
> computing framework and not incubation sandbox.  For alignment 
with Hadoop
> goals, and we want to prevent Hadoop project to be overloaded 
while
> allowing good ideas to be carried forwarded in Apache incubator.  
Put on my
> Apache Member hat, my vote is -1 to allow more independent 
subproject
> release cycle in Hadoop project that does not align with Hadoop 
project
> goals.
>
> Apache incubator process is highly recommended for Submarine:
> https://incubator.apache.org/policy/process.html This allows 
Submarine to
> develop for older version of Hadoop like Spark works with 
multiple versions
> of Hadoop.
>
> Regards,
> Eric
>
> On 1/31/19, 10:51 PM, "Weiwei Yang"  wrote:
>
> Thanks for proposing this Wangda, my +1 as well.
> It is amazing to see the progress made in Submarine last 
year, the
> community grows fast and quiet collaborative. I can see the 
reasons to get
> it release faster in its own cycle. And at the same time, the 
Ozone way
> works very well.
>
> —
> Weiwei
> On Feb 1, 2019, 10:49 AM +0800, Xun Liu , 
wrote:
> > +1
> >
> > Hello everyone,
> >
> > I am Xun Liu, the head of the machine learning team at 
Netease
> Research Institute. I quite agree with Wangda.
> >
> > Our team is very grateful for getting Submarine machine 
learning
> engine from the community.
> > We are heavy users of Submarine.
> > Because Submarine fits into the direction of our big data 
team's
> hadoop technology stack,
> > It avoids the needs to increase the manpower investment in 
learning
> other container scheduling systems.
> > The important thing is that we can use a common YARN 
cluster to run
> machine learning,
> > which makes the utilization of server resources more 
efficient, and
> reserves a lot of human and material resources in our previous 
years.
> >
> > Our team have finished the test and deployment of the 
Submarine and
> will provide the service to our e-commerce department (
> http://www.kaola.com/) shortly.
> >
> > We also plan to provides the Submarine engine in our 
existing YARN
> cluster in the next six months.
> > Because we have a lot of product departments need to use 
machine
> learning services,
> > for example:
> > 1) Game department (http://game.163.com/) needs AI battle 
training,
> > 2) News department (http://www.163.com) needs news 
recommendation,
> > 3) Mailbox department (http://www.163.com) requires 
anti-spam and
> illegal detection,
> > 4) Music department (https://music.163.com/) requires music
> recommendation,
> > 5) Educatio

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Ajay Kumar
+1. Thanks for driving this. With the rise of use cases running ML alongside
traditional applications, this will be of great help.

Thanks,
Ajay   

On 2/1/19, 10:49 AM, "Suma Shivaprasad"  wrote:

+1. Thanks for bringing this up Wangda.

Makes sense to have Submarine follow its own release cadence given the good
momentum/adoption so far. Also, making it run with older versions of Hadoop
would drive higher adoption.

Suma

On Fri, Feb 1, 2019 at 9:40 AM Eric Yang  wrote:

> Submarine is an application built for YARN framework, but it does not have
> strong dependency on YARN development.  For this kind of projects, it 
would
> be best to enter Apache Incubator cycles to create a new community.  
Apache
> commons is the only project other than Incubator that has independent
> release cycles.  The collection is large, and the project goal is
> ambitious.  No one really knows which component works with each other in
> Apache commons.  Hadoop is a much more focused project on distributed
> computing framework and not incubation sandbox.  For alignment with Hadoop
> goals, and we want to prevent Hadoop project to be overloaded while
> allowing good ideas to be carried forwarded in Apache incubator.  Put on 
my
> Apache Member hat, my vote is -1 to allow more independent subproject
> release cycle in Hadoop project that does not align with Hadoop project
> goals.
>
> Apache incubator process is highly recommended for Submarine:
> https://incubator.apache.org/policy/process.html This allows Submarine to
> develop for older version of Hadoop like Spark works with multiple 
versions
> of Hadoop.
>
> Regards,
> Eric
>
> On 1/31/19, 10:51 PM, "Weiwei Yang"  wrote:
>
> Thanks for proposing this Wangda, my +1 as well.
> It is amazing to see the progress made in Submarine last year, the
> community grows fast and quiet collaborative. I can see the reasons to get
> it release faster in its own cycle. And at the same time, the Ozone way
> works very well.
>
> —
> Weiwei
> On Feb 1, 2019, 10:49 AM +0800, Xun Liu , wrote:
> > +1
> >
> > Hello everyone,
> >
> > I am Xun Liu, the head of the machine learning team at Netease
> Research Institute. I quite agree with Wangda.
> >
> > Our team is very grateful for getting Submarine machine learning
> engine from the community.
> > We are heavy users of Submarine.
> > Because Submarine fits into the direction of our big data team's
> hadoop technology stack,
> > It avoids the needs to increase the manpower investment in learning
> other container scheduling systems.
> > The important thing is that we can use a common YARN cluster to run
> machine learning,
> > which makes the utilization of server resources more efficient, and
> reserves a lot of human and material resources in our previous years.
> >
> > Our team have finished the test and deployment of the Submarine and
> will provide the service to our e-commerce department (
> http://www.kaola.com/) shortly.
> >
> > We also plan to provides the Submarine engine in our existing YARN
> cluster in the next six months.
> > Because we have a lot of product departments need to use machine
> learning services,
> > for example:
> > 1) Game department (http://game.163.com/) needs AI battle training,
> > 2) News department (http://www.163.com) needs news recommendation,
> > 3) Mailbox department (http://www.163.com) requires anti-spam and
> illegal detection,
> > 4) Music department (https://music.163.com/) requires music
> recommendation,
> > 5) Education department (http://www.youdao.com) requires voice
> recognition,
> > 6) Massive Open Online Courses (https://open.163.com/) requires
> multilingual translation and so on.
> >
> > If Submarine can be released independently like Ozone, it will help
> us quickly get the latest features and improvements, and it will be great
> helpful to our team and users.
> >
> > Thanks hadoop Community!
> >
> >
> > > On Feb 1, 2019, at 2:53 AM, Wangda Tan wrote:
> > >
> > > Hi devs,
> > >
> > > Since we started submarine-related effort last year, we received a
> lot of
> > > feedbacks, several companies (such as Netease, China Mobile, etc.)
> are
> > > trying to deploy Submarine to their Hadoop cluster along with big
> data
> > > workloads. Linkedin also has big interests to contribute a
> Submarine TonY (
> > > https://github.com/linkedin/TonY) runtime to allow users to use
> the same
> > > interface.
> > >
> > > F

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Wangda Tan
Thanks everyone for sharing thoughts!

Eric, appreciate your suggestions. But there are many examples of components
with separate releases, like Hive's storage API, Ozone, etc. For loosely
coupled sub-projects, it is going to be great (at least for most users) to
have separate releases so new features can be consumed and iterated on faster.
From the feedback above from developers and users, I think it is also what
people want.

Another concern you mentioned is whether Submarine is aligned with Hadoop
project goals. From the feedback we can see it attracts companies to continue
using Hadoop to solve their ML/DL requirements. It has also created a good
feedback loop: many issues faced, and some new functionality added for
Submarine, went back to Hadoop, such as localization of files and directories,
GPU-topology-related enhancements, etc.

We will definitely use this sub-project opportunity to grow both Submarine and
Hadoop quickly and to get fast release cycles for both projects. As for your
suggestion about the Apache Incubator, we can reconsider it once Submarine
becomes a more independent project; right now it is still too small and the
process is too much overhead, and I don't want to stall the fast-growing
community for months to go through the Incubator process.

I really hope my comment can help you reconsider the veto. :)

Thanks,
Wangda

On Fri, Feb 1, 2019 at 9:39 AM Eric Yang  wrote:

> Submarine is an application built for YARN framework, but it does not have
> strong dependency on YARN development.  For this kind of projects, it would
> be best to enter Apache Incubator cycles to create a new community.  Apache
> commons is the only project other than Incubator that has independent
> release cycles.  The collection is large, and the project goal is
> ambitious.  No one really knows which component works with each other in
> Apache commons.  Hadoop is a much more focused project on distributed
> computing framework and not incubation sandbox.  For alignment with Hadoop
> goals, and we want to prevent Hadoop project to be overloaded while
> allowing good ideas to be carried forwarded in Apache incubator.  Put on my
> Apache Member hat, my vote is -1 to allow more independent subproject
> release cycle in Hadoop project that does not align with Hadoop project
> goals.
>
> Apache incubator process is highly recommended for Submarine:
> https://incubator.apache.org/policy/process.html This allows Submarine to
> develop for older version of Hadoop like Spark works with multiple versions
> of Hadoop.
>
> Regards,
> Eric
>
> On 1/31/19, 10:51 PM, "Weiwei Yang"  wrote:
>
> Thanks for proposing this Wangda, my +1 as well.
> It is amazing to see the progress made in Submarine last year, the
> community grows fast and quiet collaborative. I can see the reasons to get
> it release faster in its own cycle. And at the same time, the Ozone way
> works very well.
>
> —
> Weiwei
> On Feb 1, 2019, 10:49 AM +0800, Xun Liu , wrote:
> > +1
> >
> > Hello everyone,
> >
> > I am Xun Liu, the head of the machine learning team at Netease
> Research Institute. I quite agree with Wangda.
> >
> > Our team is very grateful for getting Submarine machine learning
> engine from the community.
> > We are heavy users of Submarine.
> > Because Submarine fits into the direction of our big data team's
> hadoop technology stack,
> > It avoids the needs to increase the manpower investment in learning
> other container scheduling systems.
> > The important thing is that we can use a common YARN cluster to run
> machine learning,
> > which makes the utilization of server resources more efficient, and
> reserves a lot of human and material resources in our previous years.
> >
> > Our team have finished the test and deployment of the Submarine and
> will provide the service to our e-commerce department (
> http://www.kaola.com/) shortly.
> >
> > We also plan to provides the Submarine engine in our existing YARN
> cluster in the next six months.
> > Because we have a lot of product departments need to use machine
> learning services,
> > for example:
> > 1) Game department (http://game.163.com/) needs AI battle training,
> > 2) News department (http://www.163.com) needs news recommendation,
> > 3) Mailbox department (http://www.163.com) requires anti-spam and
> illegal detection,
> > 4) Music department (https://music.163.com/) requires music
> recommendation,
> > 5) Education department (http://www.youdao.com) requires voice
> recognition,
> > 6) Massive Open Online Courses (https://open.163.com/) requires
> multilingual translation and so on.
> >
> > If Submarine can be released independently like Ozone, it will help
> us quickly get the latest features and improvements, and it will be great
> helpful to our team and users.
> >
> > Thanks hadoop Community!
> >
> >
> > > On Feb 1, 2019, at 2:53 AM, Wangda Tan wrote:
> > >
> > > Hi d

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Suma Shivaprasad
+1. Thanks for bringing this up Wangda.

Makes sense to have Submarine follow its own release cadence given the good
momentum/adoption so far. Also, making it run with older versions of Hadoop
would drive higher adoption.

Suma

On Fri, Feb 1, 2019 at 9:40 AM Eric Yang  wrote:

> Submarine is an application built for YARN framework, but it does not have
> strong dependency on YARN development.  For this kind of projects, it would
> be best to enter Apache Incubator cycles to create a new community.  Apache
> commons is the only project other than Incubator that has independent
> release cycles.  The collection is large, and the project goal is
> ambitious.  No one really knows which component works with each other in
> Apache commons.  Hadoop is a much more focused project on distributed
> computing framework and not incubation sandbox.  For alignment with Hadoop
> goals, and we want to prevent Hadoop project to be overloaded while
> allowing good ideas to be carried forwarded in Apache incubator.  Put on my
> Apache Member hat, my vote is -1 to allow more independent subproject
> release cycle in Hadoop project that does not align with Hadoop project
> goals.
>
> Apache incubator process is highly recommended for Submarine:
> https://incubator.apache.org/policy/process.html This allows Submarine to
> develop for older version of Hadoop like Spark works with multiple versions
> of Hadoop.
>
> Regards,
> Eric
>
> On 1/31/19, 10:51 PM, "Weiwei Yang"  wrote:
>
> Thanks for proposing this Wangda, my +1 as well.
> It is amazing to see the progress made in Submarine last year, the
> community grows fast and quiet collaborative. I can see the reasons to get
> it release faster in its own cycle. And at the same time, the Ozone way
> works very well.
>
> —
> Weiwei
> On Feb 1, 2019, 10:49 AM +0800, Xun Liu , wrote:
> > +1
> >
> > Hello everyone,
> >
> > I am Xun Liu, the head of the machine learning team at Netease
> Research Institute. I quite agree with Wangda.
> >
> > Our team is very grateful for getting Submarine machine learning
> engine from the community.
> > We are heavy users of Submarine.
> > Because Submarine fits into the direction of our big data team's
> hadoop technology stack,
> > It avoids the needs to increase the manpower investment in learning
> other container scheduling systems.
> > The important thing is that we can use a common YARN cluster to run
> machine learning,
> > which makes the utilization of server resources more efficient, and
> reserves a lot of human and material resources in our previous years.
> >
> > Our team have finished the test and deployment of the Submarine and
> will provide the service to our e-commerce department (
> http://www.kaola.com/) shortly.
> >
> > We also plan to provides the Submarine engine in our existing YARN
> cluster in the next six months.
> > Because we have a lot of product departments need to use machine
> learning services,
> > for example:
> > 1) Game department (http://game.163.com/) needs AI battle training,
> > 2) News department (http://www.163.com) needs news recommendation,
> > 3) Mailbox department (http://www.163.com) requires anti-spam and
> illegal detection,
> > 4) Music department (https://music.163.com/) requires music
> recommendation,
> > 5) Education department (http://www.youdao.com) requires voice
> recognition,
> > 6) Massive Open Online Courses (https://open.163.com/) requires
> multilingual translation and so on.
> >
> > If Submarine can be released independently like Ozone, it will help
> us quickly get the latest features and improvements, and it will be great
> helpful to our team and users.
> >
> > Thanks hadoop Community!
> >
> >
> > > On Feb 1, 2019, at 2:53 AM, Wangda Tan wrote:
> > >
> > > Hi devs,
> > >
> > > Since we started submarine-related effort last year, we received a
> lot of
> > > feedbacks, several companies (such as Netease, China Mobile, etc.)
> are
> > > trying to deploy Submarine to their Hadoop cluster along with big
> data
> > > workloads. Linkedin also has big interests to contribute a
> Submarine TonY (
> > > https://github.com/linkedin/TonY) runtime to allow users to use
> the same
> > > interface.
> > >
> > > From what I can see, there're several issues of putting Submarine
> under
> > > yarn-applications directory and have same release cycle with
> Hadoop:
> > >
> > > 1) We started 3.2.0 release at Sep 2018, but the release is done
> at Jan
> > > 2019. Because of non-predictable blockers and security issues, it
> got
> > > delayed a lot. We need to iterate submarine fast at this point.
> > >
> > > 2) We also see a lot of requirements to use Submarine on older
> Hadoop
> > > releases such as 2.x. Many companies may not upgrade Hadoop to 3.x
> in a
> > > short time, but the requirement t

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread John Zhuge
+1

Does Submarine support Jupyter?

On Fri, Feb 1, 2019 at 8:54 AM Zhe Zhang  wrote:

> +1 on the proposal and looking forward to the progress of the project!
>
> On Thu, Jan 31, 2019 at 10:51 PM Weiwei Yang  wrote:
>
> > Thanks for proposing this Wangda, my +1 as well.
> > It is amazing to see the progress made in Submarine last year, the
> > community grows fast and quiet collaborative. I can see the reasons to
> get
> > it release faster in its own cycle. And at the same time, the Ozone way
> > works very well.
> >
> > —
> > Weiwei
> > On Feb 1, 2019, 10:49 AM +0800, Xun Liu , wrote:
> > > +1
> > >
> > > Hello everyone,
> > >
> > > I am Xun Liu, the head of the machine learning team at Netease Research
> > Institute. I quite agree with Wangda.
> > >
> > > Our team is very grateful for getting Submarine machine learning engine
> > from the community.
> > > We are heavy users of Submarine.
> > > Because Submarine fits into the direction of our big data team's hadoop
> > technology stack,
> > > It avoids the needs to increase the manpower investment in learning
> > other container scheduling systems.
> > > The important thing is that we can use a common YARN cluster to run
> > machine learning,
> > > which makes the utilization of server resources more efficient, and
> > reserves a lot of human and material resources in our previous years.
> > >
> > > Our team have finished the test and deployment of the Submarine and
> will
> > provide the service to our e-commerce department (http://www.kaola.com/)
> > shortly.
> > >
> > > We also plan to provides the Submarine engine in our existing YARN
> > cluster in the next six months.
> > > Because we have a lot of product departments need to use machine
> > learning services,
> > > for example:
> > > 1) Game department (http://game.163.com/) needs AI battle training,
> > > 2) News department (http://www.163.com) needs news recommendation,
> > > 3) Mailbox department (http://www.163.com) requires anti-spam and
> > illegal detection,
> > > 4) Music department (https://music.163.com/) requires music
> > recommendation,
> > > 5) Education department (http://www.youdao.com) requires voice
> > recognition,
> > > 6) Massive Open Online Courses (https://open.163.com/) requires
> > multilingual translation and so on.
> > >
> > > If Submarine can be released independently like Ozone, it will help us
> > quickly get the latest features and improvements, and it will be great
> > helpful to our team and users.
> > >
> > > Thanks hadoop Community!
> > >
> > >
> > > > On Feb 1, 2019, at 2:53 AM, Wangda Tan wrote:
> > > >
> > > > Hi devs,
> > > >
> > > > Since we started submarine-related effort last year, we received a
> lot
> > of
> > > > feedbacks, several companies (such as Netease, China Mobile, etc.)
> are
> > > > trying to deploy Submarine to their Hadoop cluster along with big
> data
> > > > workloads. Linkedin also has big interests to contribute a Submarine
> > TonY (
> > > > https://github.com/linkedin/TonY) runtime to allow users to use the
> > same
> > > > interface.
> > > >
> > > > From what I can see, there're several issues of putting Submarine
> under
> > > > yarn-applications directory and have same release cycle with Hadoop:
> > > >
> > > > 1) We started 3.2.0 release at Sep 2018, but the release is done at
> Jan
> > > > 2019. Because of non-predictable blockers and security issues, it got
> > > > delayed a lot. We need to iterate submarine fast at this point.
> > > >
> > > > 2) We also see a lot of requirements to use Submarine on older Hadoop
> > > > releases such as 2.x. Many companies may not upgrade Hadoop to 3.x
> in a
> > > > short time, but the requirement to run deep learning is urgent to
> > them. We
> > > > should decouple Submarine from Hadoop version.
> > > >
> > > > And why we wanna to keep it within Hadoop? First, Submarine included
> > some
> > > > innovation parts such as enhancements of user experiences for YARN
> > > > services/containerization support which we can add it back to Hadoop
> > later
> > > > to address common requirements. In addition to that, we have a big
> > overlap
> > > > in the community developing and using it.
> > > >
> > > > There're several proposals we have went through during Ozone merge to
> > trunk
> > > > discussion:
> > > >
> >
> https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
> > > >
> > > > I propose to adopt Ozone model: which is the same master branch,
> > different
> > > > release cycle, and different release branch. It is a great example to
> > show
> > > > agile release we can do (2 Ozone releases after Oct 2018) with less
> > > > overhead to setup CI, projects, etc.
> > > >
> > > > *Links:*
> > > > - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> > > > - Design doc
> > > > <https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit>
> > > > - User doc
> > > > <https:/

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Eric Yang
Submarine is an application built for the YARN framework, but it does not have
a strong dependency on YARN development.  For this kind of project, it would be
best to enter the Apache Incubator cycle to create a new community.  Apache
Commons is the only project other than the Incubator that has independent
release cycles.  The collection is large, and the project goal is ambitious.
No one really knows which components work with each other in Apache Commons.
Hadoop is a much more focused project on a distributed computing framework, not
an incubation sandbox.  For alignment with Hadoop's goals, we want to prevent
the Hadoop project from being overloaded while allowing good ideas to be
carried forward in the Apache Incubator.  Putting on my Apache Member hat, my
vote is -1 on allowing more independent subproject release cycles in the Hadoop
project that do not align with Hadoop project goals.

The Apache Incubator process is highly recommended for Submarine:
https://incubator.apache.org/policy/process.html This would allow Submarine to
develop against older versions of Hadoop, just as Spark works with multiple
versions of Hadoop.

Regards,
Eric

On 1/31/19, 10:51 PM, "Weiwei Yang"  wrote:

Thanks for proposing this Wangda, my +1 as well.
It is amazing to see the progress made in Submarine last year, the 
community grows fast and quiet collaborative. I can see the reasons to get it 
release faster in its own cycle. And at the same time, the Ozone way works very 
well.

—
Weiwei
On Feb 1, 2019, 10:49 AM +0800, Xun Liu , wrote:
> +1
>
> Hello everyone,
>
> I am Xun Liu, the head of the machine learning team at Netease Research 
Institute. I quite agree with Wangda.
>
> Our team is very grateful for getting Submarine machine learning engine 
from the community.
> We are heavy users of Submarine.
> Because Submarine fits into the direction of our big data team's hadoop 
technology stack,
> It avoids the needs to increase the manpower investment in learning other 
container scheduling systems.
> The important thing is that we can use a common YARN cluster to run 
machine learning,
> which makes the utilization of server resources more efficient, and 
has saved a lot of human and material resources over the past years.
>
> Our team has finished the test and deployment of Submarine and will 
provide the service to our e-commerce department (http://www.kaola.com/) 
shortly.
>
> We also plan to provide the Submarine engine in our existing YARN 
cluster in the next six months.
> Because we have a lot of product departments that need to use machine learning 
services,
> for example:
> 1) Game department (http://game.163.com/) needs AI battle training,
> 2) News department (http://www.163.com) needs news recommendation,
> 3) Mailbox department (http://www.163.com) requires anti-spam and illegal 
detection,
> 4) Music department (https://music.163.com/) requires music 
recommendation,
> 5) Education department (http://www.youdao.com) requires voice 
recognition,
> 6) Massive Open Online Courses (https://open.163.com/) requires 
multilingual translation and so on.
>
> If Submarine can be released independently like Ozone, it will help us 
quickly get the latest features and improvements, and it will be great helpful 
to our team and users.
>
> Thanks hadoop Community!
>
>
> > > On Feb 1, 2019, at 2:53 AM, Wangda Tan  wrote:
> >
> > Hi devs,
> >
> > Since we started submarine-related effort last year, we received a lot 
of
> > > feedback; several companies (such as Netease, China Mobile, etc.) are
> > trying to deploy Submarine to their Hadoop cluster along with big data
> > > workloads. LinkedIn also has a big interest in contributing a Submarine 
TonY (
> > https://github.com/linkedin/TonY) runtime to allow users to use the same
> > interface.
> >
> > From what I can see, there're several issues of putting Submarine under
> > the yarn-applications directory and having the same release cycle as Hadoop:
> >
> > 1) We started 3.2.0 release at Sep 2018, but the release is done at Jan
> > 2019. Because of non-predictable blockers and security issues, it got
> > delayed a lot. We need to iterate submarine fast at this point.
> >
> > 2) We also see a lot of requirements to use Submarine on older Hadoop
> > releases such as 2.x. Many companies may not upgrade Hadoop to 3.x in a
> > short time, but the requirement to run deep learning is urgent to them. 
We
> > should decouple Submarine from Hadoop version.
> >
> > And why do we want to keep it within Hadoop? First, Submarine included 
some
> > innovation parts such as enhancements of user experiences for YARN
> > services/containerization support which we can add it back to Hadoop 
later
> > to address common requirements. In addition to that, we have a big 
overlap
> > in the community

Re: [DISCUSS] Making submarine to different release model like Ozone

2019-02-01 Thread Zhe Zhang
+1 on the proposal and looking forward to the progress of the project!

On Thu, Jan 31, 2019 at 10:51 PM Weiwei Yang  wrote:

> Thanks for proposing this Wangda, my +1 as well.
> It is amazing to see the progress made in Submarine last year; the
> community grows fast and is quite collaborative. I can see the reasons to get
> it released faster in its own cycle. And at the same time, the Ozone way
> works very well.
>
> —
> Weiwei
> On Feb 1, 2019, 10:49 AM +0800, Xun Liu , wrote:
> > +1
> >
> > Hello everyone,
> >
> > I am Xun Liu, the head of the machine learning team at Netease Research
> Institute. I quite agree with Wangda.
> >
> > Our team is very grateful for getting Submarine machine learning engine
> from the community.
> > We are heavy users of Submarine.
> > Because Submarine fits into the direction of our big data team's hadoop
> technology stack,
> > It avoids the need to increase the manpower investment in learning
> other container scheduling systems.
> > The important thing is that we can use a common YARN cluster to run
> machine learning,
> > which makes the utilization of server resources more efficient, and
> has saved a lot of human and material resources over the past years.
> >
> > Our team has finished the test and deployment of Submarine and will
> provide the service to our e-commerce department (http://www.kaola.com/)
> shortly.
> >
> > We also plan to provide the Submarine engine in our existing YARN
> cluster in the next six months.
> > Because we have a lot of product departments that need to use machine
> learning services,
> > for example:
> > 1) Game department (http://game.163.com/) needs AI battle training,
> > 2) News department (http://www.163.com) needs news recommendation,
> > 3) Mailbox department (http://www.163.com) requires anti-spam and
> illegal detection,
> > 4) Music department (https://music.163.com/) requires music
> recommendation,
> > 5) Education department (http://www.youdao.com) requires voice
> recognition,
> > 6) Massive Open Online Courses (https://open.163.com/) requires
> multilingual translation and so on.
> >
> > If Submarine can be released independently like Ozone, it will help us
> quickly get the latest features and improvements, and it will be great
> helpful to our team and users.
> >
> > Thanks hadoop Community!
> >
> >
> > > On Feb 1, 2019, at 2:53 AM, Wangda Tan  wrote:
> > >
> > > Hi devs,
> > >
> > > Since we started submarine-related effort last year, we received a lot
> of
> > > feedback; several companies (such as Netease, China Mobile, etc.) are
> > > trying to deploy Submarine to their Hadoop cluster along with big data
> > > workloads. LinkedIn also has a big interest in contributing a Submarine
> TonY (
> > > https://github.com/linkedin/TonY) runtime to allow users to use the
> same
> > > interface.
> > >
> > > From what I can see, there're several issues of putting Submarine under
> > > the yarn-applications directory and having the same release cycle as Hadoop:
> > >
> > > 1) We started 3.2.0 release at Sep 2018, but the release is done at Jan
> > > 2019. Because of non-predictable blockers and security issues, it got
> > > delayed a lot. We need to iterate submarine fast at this point.
> > >
> > > 2) We also see a lot of requirements to use Submarine on older Hadoop
> > > releases such as 2.x. Many companies may not upgrade Hadoop to 3.x in a
> > > short time, but the requirement to run deep learning is urgent to
> them. We
> > > should decouple Submarine from Hadoop version.
> > >
> > > And why do we want to keep it within Hadoop? First, Submarine included
> some
> > > innovation parts such as enhancements of user experiences for YARN
> > > services/containerization support which we can add it back to Hadoop
> later
> > > to address common requirements. In addition to that, we have a big
> overlap
> > > in the community developing and using it.
> > >
> > > There're several proposals we have gone through during the Ozone merge to
> trunk
> > > discussion:
> > >
> https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
> > >
> > > I propose to adopt Ozone model: which is the same master branch,
> different
> > > release cycle, and different release branch. It is a great example to
> show
> > > agile release we can do (2 Ozone releases after Oct 2018) with less
> > > overhead to setup CI, projects, etc.
> > >
> > > *Links:*
> > > - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> > > - Design doc
> > > <
> https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit
> >
> > > - User doc
> > > <
> https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html
> >
> > > (3.2.0
> > > release)
> > > - Blogposts, {Submarine} : Running deep learning workloads on Apache
> Hadoop
> > > <
> https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/
> >,
> > > (Chinese Translat

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-02-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/

[Jan 31, 2019 3:55:29 AM] (sunilg) YARN-9099. 
GpuResourceAllocator#getReleasingGpus calculates number of
[Jan 31, 2019 1:51:31 PM] (elek) HDDS-956. MultipartUpload: List Parts for a 
Multipart upload key.
[Jan 31, 2019 6:06:05 PM] (inigoiri) HADOOP-16084. Fix the comment for getClass 
in Configuration. Contributed
[Jan 31, 2019 7:24:15 PM] (gifuma) YARN-9191. Add cli option in DS to support 
enforceExecutionType in
[Feb 1, 2019 12:07:24 AM] (weichiu) HDFS-14187. Make warning message more clear 
when there are not enough




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.util.TestReadWriteDiskValidator 
   hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.yarn.server.nodemanager.nodelabels.TestConfigurationNodeAttributesProvider 
   hadoop.yarn.server.resourcemanager.rmapp.attempt.TestRMAppAttemptTransitions 

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/diff-compile-javac-root.txt
  [336K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/diff-checkstyle-root.txt
  [17M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/diff-patch-hadolint.txt
  [8.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/diff-patch-pylint.txt
  [88K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/diff-patch-shelldocs.txt
  [12K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/whitespace-eol.txt
  [9.3M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/whitespace-tabs.txt
  [1.1M]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-hdds_client.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-hdds_container-service.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-hdds_framework.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-hdds_server-scm.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-hdds_tools.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-ozone_client.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-ozone_common.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-ozone_objectstore-service.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-ozone_ozone-manager.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-ozone_ozonefs.txt
  [20K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-ozone_s3gateway.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/branch-findbugs-hadoop-ozone_tools.txt
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/diff-javadoc-javadoc-root.txt
  [752K]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [164K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [328K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1034/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-ya

[jira] [Created] (YARN-9270) Minor cleanup in TestFpgaDiscoverer

2019-02-01 Thread Peter Bacsko (JIRA)
Peter Bacsko created YARN-9270:
--

 Summary: Minor cleanup in TestFpgaDiscoverer
 Key: YARN-9270
 URL: https://issues.apache.org/jira/browse/YARN-9270
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Peter Bacsko


Let's do some cleanup in this class.

* {{testLinuxFpgaResourceDiscoverPluginConfig}} - this test should be split 
into 5 different tests, because it tests 5 different scenarios.
* remove {{setNewEnvironmentHack()}} - too complicated. We can introduce a 
{{Function}} in the plugin class like {{Function<String, String> envProvider = 
System::getenv}} plus a setter method which allows the test to modify 
{{envProvider}}. Much simpler and more straightforward (see the sketch below).
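
A minimal sketch of that injection, with illustrative class and method names 
(this is not the actual patch):

{code:java}
import java.util.function.Function;

public class FpgaDiscoverer {
  // Defaults to the real process environment; tests can swap it out.
  private Function<String, String> envProvider = System::getenv;

  // For tests only: supply a fake environment without touching the real one
  // (this is what would replace setNewEnvironmentHack()).
  void setEnvProvider(Function<String, String> envProvider) {
    this.envProvider = envProvider;
  }

  String getEnv(String name) {
    return envProvider.apply(name);
  }
}
{code}

A test would then simply call {{discoverer.setEnvProvider(name -> 
"/opt/intel/fpga_sdk")}} instead of mutating the process environment.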



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-9269) Minor cleanup in FpgaResourceAllocator

2019-02-01 Thread Peter Bacsko (JIRA)
Peter Bacsko created YARN-9269:
--

 Summary: Minor cleanup in FpgaResourceAllocator
 Key: YARN-9269
 URL: https://issues.apache.org/jira/browse/YARN-9269
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Peter Bacsko


Some issues that we observed:

* {{addFpga()}} - we check for duplicate devices, but we don't print any 
error/warning if there is one.
* {{findMatchedFpga()}} should be called {{findMatchingFpga()}}. Also, is this 
method even needed? We already receive a {{FpgaDevice}} instance in 
{{updateFpga()}} which I believe is the same one that we're looking up.
* variable {{IPIDpreference}} is confusing
* {{availableFpga}} / {{usedFpgaByRequestor}} are instances of 
{{LinkedHashMap}}. What's the rationale behind this? Doesn't a simple 
{{HashMap}} suffice?
* {{usedFpgaByRequestor}} should be renamed, the naming is a bit unclear
* {{allowedFpgas}} should be an immutable list (see the sketch below)
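
A rough sketch of the first and last points, assuming the existing 
{{FpgaDevice}} type has a proper {{equals()}}/{{hashCode()}} (names below are 
taken from the description, not from the actual code):

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class FpgaResourceAllocator {
  private static final Logger LOG =
      LoggerFactory.getLogger(FpgaResourceAllocator.class);

  private final List<FpgaDevice> allowedFpgas = new ArrayList<>();

  public synchronized void addFpga(String type, List<FpgaDevice> devices) {
    for (FpgaDevice device : devices) {
      if (allowedFpgas.contains(device)) {
        // Duplicates used to be skipped silently; warn about them instead.
        LOG.warn("Duplicate FPGA device {} reported for type {}, ignoring it",
            device, type);
        continue;
      }
      allowedFpgas.add(device);
    }
  }

  public synchronized List<FpgaDevice> getAllowedFpgas() {
    // Read-only view: callers can no longer mutate the internal list.
    return Collections.unmodifiableList(allowedFpgas);
  }
}
{code}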



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-9268) Various fixes are needed in FpgaDevice

2019-02-01 Thread Peter Bacsko (JIRA)
Peter Bacsko created YARN-9268:
--

 Summary: Various fixes are needed in FpgaDevice
 Key: YARN-9268
 URL: https://issues.apache.org/jira/browse/YARN-9268
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Peter Bacsko


Need to fix the following in the class {{FpgaDevice}}:

* It implements {{Comparable}}, but not {{Comparable<FpgaDevice>}}, so we have a raw 
type warning. It also returns 0 in every case. There is no natural ordering 
among FPGA devices, perhaps "acl0" comes before "acl1", but this seems too 
forced and unnecessary. We think this class should not implement Comparable at 
all, at least not like that.
* Stores unnecessary fields: devName, busNum, temperature, power usage. For 
one, these are never needed in the code. Secondly, temp and power usage change 
constantly. It's pointless to store these in this POJO.
* serialVersionUID is 1L - let's generate a number for this
* Use int instead of Integer - don't allow nulls. If the major/minor numbers 
uniquely identify the card, then let's require them in the constructor and don't 
store Integers that can be null. (A sketch follows below.)
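
A rough sketch of the slimmed-down POJO (field names and the generated 
serialVersionUID are illustrative only):

{code:java}
import java.io.Serializable;
import java.util.Objects;

public final class FpgaDevice implements Serializable {
  // Illustrative generated value instead of 1L
  private static final long serialVersionUID = -7270510562258021598L;

  private final String type;          // FPGA type reported by the vendor plugin
  private final String aliasDevName;  // e.g. "acl0"
  private final int major;            // required, primitive, never null
  private final int minor;            // required, primitive, never null

  public FpgaDevice(String type, int major, int minor, String aliasDevName) {
    this.type = Objects.requireNonNull(type, "type must not be null");
    this.major = major;
    this.minor = minor;
    this.aliasDevName = aliasDevName;
  }

  public String getType() { return type; }
  public String getAliasDevName() { return aliasDevName; }
  public int getMajor() { return major; }
  public int getMinor() { return minor; }

  // Identity is the major/minor pair; no Comparable, no temperature or power
  // usage fields.
  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof FpgaDevice)) {
      return false;
    }
    FpgaDevice other = (FpgaDevice) o;
    return major == other.major && minor == other.minor;
  }

  @Override
  public int hashCode() {
    return Objects.hash(major, minor);
  }
}
{code}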



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-9267) Various fixes are needed in FpgaResourceHandlerImpl

2019-02-01 Thread Peter Bacsko (JIRA)
Peter Bacsko created YARN-9267:
--

 Summary: Various fixes are needed in FpgaResourceHandlerImpl
 Key: YARN-9267
 URL: https://issues.apache.org/jira/browse/YARN-9267
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Peter Bacsko


Fix some problems in FpgaResourceHandlerImpl:

* preStart() does not reconfigure the card with the same IP - we see it as a 
problem. If you recompile the FPGA application, you must rename the aocx file 
because the card will not be reprogrammed. Suggestion: instead of storing a 
Node<->IPID mapping, store a Node<->IPID hash (like the SHA-256 of the localized 
file); see the sketch below.
* Switch to slf4j from Apache Commons Logging
* Remove some unused imports
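
A small sketch of the hash idea; the helper name is made up, the point is that 
the "already programmed" check would be keyed on the content of the localized 
file rather than on the IPID alone:

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class IpFileDigest {
  private IpFileDigest() {
  }

  /** SHA-256 of the localized aocx file, as a lowercase hex string. */
  public static String sha256Of(Path aocxFile) throws IOException {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      try (InputStream in = Files.newInputStream(aocxFile)) {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
          digest.update(buffer, 0, read);
        }
      }
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 is not available", e);
    }
  }
}
{code}

preStart() would then reprogram the card whenever the stored hash for the node 
differs from the hash of the newly localized aocx, even if the IPID stays the 
same.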



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-9265) FPGA plugin fails to recognize Intel PAC card

2019-02-01 Thread Peter Bacsko (JIRA)
Peter Bacsko created YARN-9265:
--

 Summary: FPGA plugin fails to recognize Intel PAC card
 Key: YARN-9265
 URL: https://issues.apache.org/jira/browse/YARN-9265
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.1.0
Reporter: Peter Bacsko


The plugin cannot autodetect Intel FPGA PAC (Processing Accelerator Card).

There are two major issues.

Problem #1

The output of aocl diagnose:
{noformat}

Device Name:
acl0
 
Package Pat:
/home/pbacsko/inteldevstack/intelFPGA_pro/hld/board/opencl_bsp
 
Vendor: Intel Corp
 
Physical Dev Name   StatusInformation
 
pac_a10_f20 PassedPAC Arria 10 Platform (pac_a10_f20)
  PCIe 08:00.0
  FPGA temperature = 79 degrees C.
 
DIAGNOSTIC_PASSED

 
Call "aocl diagnose " to run diagnose for specified devices
Call "aocl diagnose all" to run diagnose for all devices
{noformat}

This generates the following error message:
{noformat}
2019-01-25 06:46:02,834 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaResourcePlugin:
 Using FPGA vendor plugin: 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin
2019-01-25 06:46:02,943 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.FpgaDiscoverer:
 Trying to diagnose FPGA information ...
2019-01-25 06:46:03,085 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerModule:
 Using traffic control bandwidth handler
2019-01-25 06:46:03,108 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl:
 Initializing mounted controller cpu at /sys/fs/cgroup/cpu,cpuacct/yarn
2019-01-25 06:46:03,139 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceHandlerImpl:
 FPGA Plugin bootstrap success.
2019-01-25 06:46:03,247 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
 Couldn't find (?i)bus:slot.func\s=\s.*, pattern
2019-01-25 06:46:03,248 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
 Couldn't find (?i)Total\sCard\sPower\sUsage\s=\s.* pattern
2019-01-25 06:46:03,251 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.fpga.IntelFpgaOpenclPlugin:
 Failed to get major-minor number from reading /dev/pac_a10_f30
2019-01-25 06:46:03,252 ERROR 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to 
bootstrap configured resource subsystems!
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException:
 No FPGA devices detected!
{noformat}

Problem #2

The plugin assumes that the file name under {{/dev}} can be derived from the 
"Physical Dev Name". This is not the case: for example, it thinks that the 
device file is {{/dev/pac_a10_f30}}, while the actual file 
is {{/dev/intel-fpga-port.0}}.
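
To illustrate problem #1, a tiny standalone snippet (not a proposed fix) 
showing that the pattern from the warning log above never matches the PAC 
diagnose output, while a looser PCIe-based pattern would:

{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PacDiagnoseParseDemo {
  public static void main(String[] args) {
    String pacOutput =
        "pac_a10_f20 Passed PAC Arria 10 Platform (pac_a10_f20)\n"
        + "  PCIe 08:00.0\n"
        + "  FPGA temperature = 79 degrees C.";

    // The pattern the plugin currently searches for (copied from the log).
    Pattern current = Pattern.compile("(?i)bus:slot.func\\s=\\s.*");
    System.out.println(current.matcher(pacOutput).find());    // false

    // One possible, more tolerant pattern -- an assumption, not the actual patch.
    Pattern pcie = Pattern.compile("(?i)PCIe\\s+(\\p{XDigit}+:\\p{XDigit}+\\.\\d+)");
    Matcher m = pcie.matcher(pacOutput);
    if (m.find()) {
      System.out.println(m.group(1));                         // 08:00.0
    }
  }
}
{code}

Problem #2 similarly cannot be solved by string manipulation on the physical 
dev name; the plugin would have to discover whatever device file the driver 
actually creates (here {{/dev/intel-fpga-port.0}}).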



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-9266) Various fixes are needed in IntelFpgaOpenclPlugin

2019-02-01 Thread Peter Bacsko (JIRA)
Peter Bacsko created YARN-9266:
--

 Summary: Various fixes are needed in IntelFpgaOpenclPlugin
 Key: YARN-9266
 URL: https://issues.apache.org/jira/browse/YARN-9266
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Peter Bacsko


Problems identified in this class:

* InnerShellExecutor ignores the timeout parameter
* configureIP() uses printStackTrace() instead of logging
* configureIP() does not log the output of aocl if the exit code != 0
* parseDiagnoseInfo() is too heavyweight -- it should be in its own class for 
better testability
* downloadIP() uses contains() for the file name check -- this can really surprise 
users in some cases (e.g. you want to use hello.aocx but hello2.aocx also 
matches); see the sketch below
* method name downloadIP() is misleading -- it actually tries to find the 
file. Everything is downloaded (localized) at this point.
* @VisibleForTesting methods should be package private
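
For the contains() point, a minimal sketch of an exact-name lookup (class and 
method names are illustrative):

{code:java}
import java.io.File;

public final class AocxFileLocator {
  private AocxFileLocator() {
  }

  /** Returns the localized file whose name equals ipFileName, or null. */
  public static File findIpFile(File localizedDir, String ipFileName) {
    File[] candidates = localizedDir.listFiles();
    if (candidates == null) {
      return null;
    }
    for (File f : candidates) {
      // Exact match: asking for "hello.aocx" no longer picks up "hello2.aocx".
      if (f.isFile() && f.getName().equals(ipFileName)) {
        return f;
      }
    }
    return null;
  }
}
{code}

Renaming downloadIP() to something like findIpFile() would also address the 
misleading-name point.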



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-9264) [Umbrella] Follow-up on IntelOpenCL FPGA plugin

2019-02-01 Thread Peter Bacsko (JIRA)
Peter Bacsko created YARN-9264:
--

 Summary: [Umbrella] Follow-up on IntelOpenCL FPGA plugin
 Key: YARN-9264
 URL: https://issues.apache.org/jira/browse/YARN-9264
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.1.1
Reporter: Peter Bacsko


The Intel FPGA resource type support was released in Hadoop 3.1.0.

Right now the plugin implementation has some deficiencies that need to be 
fixed. This JIRA lists all problems that need to be resolved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-9263) TestConfigurationNodeAttributesProvider fails after Mockito updated

2019-02-01 Thread Weiwei Yang (JIRA)
Weiwei Yang created YARN-9263:
-

 Summary: TestConfigurationNodeAttributesProvider fails after 
Mockito updated
 Key: YARN-9263
 URL: https://issues.apache.org/jira/browse/YARN-9263
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Weiwei Yang
Assignee: Weiwei Yang


This UT is failing after HADOOP-14178



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org