Re: disposing all ndarray in a given context

2017-10-18 Thread Joern Kottmann
Have a look at this code:
https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/optimizer/AdaDelta.scala

There they have the same problem and use disposeDepsExcept to release resources.
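
As far as I understand it, the idea behind that helper is that every result remembers the temporaries it was computed from, so they can be released in one call while the arrays you still need are spared. A minimal Java sketch of that idea follows; `TrackedArray` and everything in it is a hypothetical stand-in written for illustration, not the MXNet implementation, and `dispose()` here only flips a flag where the real code would free a native handle.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a native-backed array; not MXNet code. */
final class TrackedArray {
    private final String name;
    // Every array that was consumed, directly or indirectly, to produce this one.
    private final Set<TrackedArray> deps = new HashSet<>();
    private boolean disposed;

    TrackedArray(String name, TrackedArray... inputs) {
        this.name = name;
        for (TrackedArray in : inputs) {
            deps.add(in);
            deps.addAll(in.deps);
        }
    }

    /** Stands in for releasing the native handle. */
    void dispose() {
        disposed = true;
    }

    boolean isDisposed() {
        return disposed;
    }

    /** Disposes all recorded dependencies except the ones listed in keep. */
    TrackedArray disposeDepsExcept(TrackedArray... keep) {
        Set<TrackedArray> keepSet = new HashSet<>(Arrays.asList(keep));
        for (TrackedArray dep : deps) {
            if (!keepSet.contains(dep)) {
                dep.dispose();
            }
        }
        // Only the spared arrays remain tracked afterwards.
        deps.retainAll(keepSet);
        return this;
    }

    @Override
    public String toString() {
        return name + (disposed ? " (disposed)" : " (live)");
    }

    public static void main(String[] args) {
        TrackedArray weight = new TrackedArray("weight");
        TrackedArray grad = new TrackedArray("grad");
        TrackedArray squared = new TrackedArray("grad*grad", grad, grad);
        TrackedArray sum = new TrackedArray("squared+weight", squared, weight);
        TrackedArray result = new TrackedArray("result", sum).disposeDepsExcept(weight, grad);
        // The two temporaries are released; weight, grad and result stay live.
        System.out.println(Arrays.asList(weight, grad, squared, sum, result));
    }
}
```

This does not give you "dispose everything in a context", but it keeps the bookkeeping next to the computation that created the temporaries.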

Jörn

On Tue, Oct 17, 2017 at 4:18 PM, TongKe Xue  wrote:
> Following up to this:
>
> I see that the Scala API, when creating ndarray, uses:
>
> https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/NDArray.scala#L114
>
> which calls
>
> https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/LibInfo.scala#L42
>
> to get a "handle" from the given context.
>
>
> I've looked through the LibInfo.scala file -- and it's not clear to me
> if there is a way to:
>
> 1) nuke all handles in a Context OR
> 2) get a list of all handles in a Context (so I can manually call dispose)
>
> Is either of these things possible?
>
> Thanks!
>
>
> On Mon, Oct 16, 2017 at 4:15 PM, TongKe Xue  wrote:
>> Quoting: 
>> https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/NDArray.scala#L545-L546
>>
>> * WARNING: it is your responsibility to clear this object through dispose().
>> * NEVER rely on the GC strategy
>>
>> Is there a way to say "dispose all ndarrays of this context" ?


Adam Update and its inputs/outputs

2017-10-12 Thread Joern Kottmann
Hello all,

while working on the MNIST example for the Java API I noticed that
adam_update takes four NDArrays as input but declares only one
output, even though it changes three of the NDArrays.

The inputs are weight, grad, mean and var. And the output is weight,
but I had the impression it should be weight, mean and var.
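
To make the point concrete, here is a minimal Java sketch of one Adam step on plain double arrays. It assumes the update rule as I understand it from the operator documentation (no bias correction inside the operator; gradient rescaling, clipping and weight decay are left out). The class and method names are made up for illustration and are not part of any MXNet API. What matters is that mean and var are written to alongside weight.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not MXNet code. */
final class AdamStepSketch {

    /**
     * Applies one Adam step in place.
     * weight, mean and var are all modified; grad is only read.
     */
    static void adamUpdate(double[] weight, double[] grad,
                           double[] mean, double[] var,
                           double lr, double beta1, double beta2,
                           double epsilon) {
        for (int i = 0; i < weight.length; i++) {
            // First moment: running mean of the gradient.
            mean[i] = beta1 * mean[i] + (1.0 - beta1) * grad[i];
            // Second moment: running mean of the squared gradient.
            var[i] = beta2 * var[i] + (1.0 - beta2) * grad[i] * grad[i];
            // Parameter step computed from the two freshly updated moments.
            weight[i] -= lr * mean[i] / (Math.sqrt(var[i]) + epsilon);
        }
    }

    public static void main(String[] args) {
        double[] weight = {1.0};
        double[] grad = {0.5};
        double[] mean = {0.0};
        double[] var = {0.0};
        adamUpdate(weight, grad, mean, var, 0.1, 0.9, 0.999, 1e-8);
        // All three state arrays now differ from their initial values.
        System.out.println("weight=" + Arrays.toString(weight));
        System.out.println("mean=" + Arrays.toString(mean));
        System.out.println("var=" + Arrays.toString(var));
    }
}
```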

Would it make sense to change that?

Jörn


Re: Apache MXNet build failures are mostly valid - verify before merge

2017-09-29 Thread Joern Kottmann
It also makes sense to block PRs that are too old from merging, because
their test results are outdated and the build might fail after the PR
gets merged.
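
One way to express "too old" is to compare the master commit a PR was tested against with the current head of master, which is roughly what GitHub's "require branches to be up to date before merging" option enforces. The following Java sketch shows that rule in isolation; the types and names are hypothetical and do not correspond to any Jenkins or GitHub API.

```java
/** Hypothetical merge gate; illustrates the rule only. */
final class MergeGateSketch {

    /** Outcome of the most recent CI run for a pull request. */
    static final class CheckResult {
        final boolean passed;
        final String testedBaseCommit; // master commit the run was based on

        CheckResult(boolean passed, String testedBaseCommit) {
            this.passed = passed;
            this.testedBaseCommit = testedBaseCommit;
        }
    }

    /** A PR may be merged only if its checks passed against the current master head. */
    static boolean mayMerge(CheckResult lastRun, String currentMasterHead) {
        boolean upToDate = lastRun.testedBaseCommit.equals(currentMasterHead);
        return lastRun.passed && upToDate;
    }

    public static void main(String[] args) {
        CheckResult run = new CheckResult(true, "a1b2c3");
        System.out.println(mayMerge(run, "a1b2c3")); // tested against current master
        System.out.println(mayMerge(run, "d4e5f6")); // master moved on since the run
    }
}
```

This is the strictest variant; a softer rule could allow a bounded age instead of requiring an exact match.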

Jörn

On Thu, Sep 28, 2017 at 9:14 PM, Zha, Sheng  wrote:
> +1 on protected branch.
>
> Best regards,
> -sz
>
> On 9/28/17, 11:48 AM, "Kumar, Gautam"  wrote:
>
> Hi Guys,
>
>  Let’s focus on specific issue here.
>
> Marking the master branch protected which involves “Only merge if checks 
> has passed, and yes it will run the complete build”.
>
> We can’t afford to degrade the quality and keep debugging the build 
> failure forever. If it’s slow down the development at the cost of quality I 
> will vote for the quality.
> We can work on improving the infrastructure to improve the overall speed. 
>  If you have any specific concerns on availability of Jenkins please point 
> out.
>
> -Gautam
>
>
> On 9/28/17, 11:38 AM, "Chris Olivier"  wrote:
>
> -1000 on that. :)
>
> On Thu, Sep 28, 2017 at 11:33 AM Naveen Swamy wrote:
>
> > PR->Sanity test/Linux build/test->reviewer/committer approves the
> > change->Comment "Build Now" (Or trigger on at least one approval from a
> > committer other than author)->*Full build*->*passes build*->Enable Merge
> >
> > Let us take this particular topic to a separate thread or discuss offline
> > if further clarification is needed.
> >
> > On Thu, Sep 28, 2017 at 11:24 AM, Chris Olivier wrote:
> >
> > > I understand the proposal.  How to trigger a build in that case?
> > >
> > >
> > > On Thu, Sep 28, 2017 at 10:54 AM Madan Jampani wrote:
> > >
> > > > Chris,
> > > > I don't think Naveen is suggesting that a merge happen without full
> > > > verification i.e. all tests across all platforms pass.
> > > > If a PR has some back and forth and results in multiple revisions (which is
> > > > arguably more common than a random unit test failing), we simply delay full
> > > > verification until the owner/reviewer have settled on a mutually acceptable
> > > > state.
> > > >
> > > > On Thu, Sep 28, 2017 at 10:25 AM, Chris Olivier wrote:
> > > >
> > > > > -1 for running only partial tests.  Most failing unit tests that get
> > > > > through fail only for certain platforms/configurations.  I personally
> > > > > prefer to be assured the build and test is good before merge.  Most PR
> > > > > merges aren't in a huge hurry.
> > > > >
> > > > > On Thu, Sep 28, 2017 at 9:54 AM, Naveen Swamy wrote:
> > > > >
> > > > > > +1 to make it protected. Here is what I am thinking for PR builds
> > > > > > on a PR run Sanity Tests + build/test one platform->committer reviews the
> > > > > > code and issues "Build Now", a full build is run->Github checks that the
> > > > > > full build checks succeed before it can be merged.
> > > > > >
> > > > > > I agree with Madan that PR should be approved by one another committer.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Sep 28, 2017 at 9:37 AM, Madan Jampani <madan.jamp...@gmail.com> wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > At a minimum I'd like to see the following two happen:
> > > > > > > - Option to merge is disabled until all required checks pass.
> > > > > > > - Code is reviewed and given +1 by at least one other committer (no self
> > > > > > > review).
> > > > > > >
> > > > > > > On Wed, Sep 27, 2017 at 11:15 PM, Gautam wrote:
> > > > > > >
> > > > > > > > Hi Chris,
> > > > > > > >
> > > > > > > >   Here 
>  > > branches/
> > > > >
> > > > > is
> > > > > > > > user
> > > > > > > > document on semantics of protected branch.
> > > > > > > > In short when a branch is protected following applies 
> to that
> > > > branch.
> > > > > > > >
> > > > > > > >- Can't be force pushed
> > > > > > > >- Can't be deleted
> > > > > > > >- Can't have changes merged into it until required 
> status
> > > checks
> > > > > > > > > > status-checks>
> > > > > > pass
> > > > > > > >- Can't have changes merged into it until required 
> reviews
>   

Re: Apache MXNet build failures are mostly valid - verify before merge

2017-09-28 Thread Joern Kottmann
At Apache OpenNLP we just established among committers that you check
that the status indicator is green before you merge. If it wasn't, we
would ask the committer to take responsibility and repair things. This
works very well; our build is never broken.

We also strongly prefer that each PR gets reviewed by another committer.

Overall this works quite well. We don't use any of the protections
against merging. It is important that you can trust each of the
committers not to break things. If there are problems, it is better to
resolve them by talking to each other rather than by enforcing green
status checks.

Jörn

On Thu, Sep 28, 2017 at 8:21 AM, Chris Olivier  wrote:
> +1 on that
>
> On Wed, Sep 27, 2017 at 11:15 PM Gautam  wrote:
>
>> Hi Chris,
>>
>>   Here  is
>> user
>> document on semantics of protected branch.
>> In short when a branch is protected following applies to that branch.
>>
>>- Can't be force pushed
>>- Can't be deleted
>>- Can't have changes merged into it until required status checks
>> pass
>>- Can't have changes merged into it until required reviews are approved
>><
>> https://help.github.com/articles/approving-a-pull-request-with-required-reviews
>> >
>>- Can't be edited or have files uploaded to it from the web
>>- Can't have changes merged into it until changes to files that
>> have a designated
>>code owner  have
>>been approved by that owner
>>
>>  I am sure many of us might not want to have all these but we can debate on
>> it. My main motive was to "*Can't have changes merged into it until
>> required status checks pass*"
>>
>>
>> -Gautam
>>
>>
>>
>> On Wed, Sep 27, 2017 at 11:09 PM, Chris Olivier 
>> wrote:
>>
>> > What does that mean? "Protected"? Protected from what?
>> >
>> > On Wed, Sep 27, 2017 at 11:08 PM Gautam  wrote:
>> >
>> > > Hi Chris,
>> > >
>> > >I mean make "master branch protected" of  MXNet.
>> > >
>> > > -Gautam
>> > >
>> > > On Wed, Sep 27, 2017 at 11:04 PM, Chris Olivier > >
>> > > wrote:
>> > >
>> > > > What does this mean? "Mx-net branch protected"?
>> > > >
>> > > > On Wed, Sep 27, 2017 at 9:59 PM Tsuyoshi OZAWA <
>> > ozawa.tsuyo...@gmail.com
>> > > >
>> > > > wrote:
>> > > >
>> > > > > +1,
>> > > > >
>> > > > > While I'm checking the recent build failures, and I think the
>> > decision
>> > > > > of making the mx-net branch protected is necessary for stable
>> > > > > building.
>> > > > > Thanks Kumar for resuming important discussion.
>> > > > >
>> > > > > Best regards
>> > > > > - Tsuyoshi
>> > > > >
>> > > > > On Thu, Sep 28, 2017 at 12:56 PM, Kumar, Gautam wrote:
>> > > > > > Reviving the discussion.
>> > > > > >
>> > > > > > At this point of time we have couple of stable builds
>> > > > > >
>> > > > > > https://builds.apache.org/view/Incubator%20Projects/job/incubator-mxnet/job/master/448/
>> > > > > >
>> > > > > > https://builds.apache.org/view/Incubator%20Projects/job/incubator-mxnet/job/master/449/
>> > > > > >
>> > > > > > Should we have a quick discussion or polling on making the mx-net branch
>> > > > > > protected? If you still think we shouldn’t make it protected please provide
>> > > > > > a reason to support your claim.
>> > > > > >
>> > > > > > Few of us have concern over Jenkin’s stability. If I look two weeks
>> > > > > > back, after upgrading Linux slave to g2.8x and new windows AMI, we have not
>> > > > > > seen any case where instance died due to high memory usage or any process
>> > > > > > got killed due to high cpu usage or any other issue with windows slaves.
>> > > > > >
>> > > > > > Going forward we are also planning that if we add any new slave we will
>> > > > > > not enable the main load immediately, but rather will do ‘test build’ to
>> > > > > > make sure that new slaves are not causing any infrastructure issue and
>> > > > > > capable to perform as good as existing slaves.
>> > > > > >
>> > > > > > -Gautam
>> > > > > >
>> > > > > > On 8/31/17, 5:27 PM, "Lupesko, Hagay" wrote:
>> > > > > >
>> > > > > > @madan looking into some failures – you’re right… there’s multiple
>> > > > > > issues going on, some of them intermittent, and we want to be able to merge
>> > > > > > fixes in.
>> > > > > > Agreed that we can wait with setting up protected mode until build
>> > > > > > stabilizes.
>> > > > > >
>> > > > > > On 8/31/17, 11:41, "Madan Jampani" wrote:
>> > > > > >
>> > > > > > @hagay: we agree on the end state. I'm not too particular about how we get
>> > > > > > there. If you think enabling it now and fixes regression later is doable,
>> > > > > > I'm fine with. I see a bit of a chicken and egg problem. We

Re: CI problems

2017-09-28 Thread Joern Kottmann
GitHub shows red warnings for PRs that didn't pass all the tests. You
should never merge PRs which are red, or not current anymore (this
could also be a red status indicator).
If the failing tests can't be resolved quickly it might be worth
splitting the tests between stable / unstable and have different
status indicators for these tests.
Then people can see easily if the PR is breaking the stable tests.
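
A minimal Java sketch of such a split, assuming test results are available as a name-to-outcome map and that the list of known-unstable tests is maintained by hand. The class, the enum and the second test name are hypothetical; test_batchnorm_training is used only because it is named in this thread as an intermittently failing test.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical summary of one CI run into two separate indicators. */
final class SplitStatusSketch {

    enum Status { GREEN, RED }

    /**
     * Returns a two-element array: index 0 is the status of the stable tests,
     * index 1 the status of the tests listed as unstable.
     */
    static Status[] summarize(Map<String, Boolean> passedByTest, Set<String> unstableTests) {
        Status stable = Status.GREEN;
        Status unstable = Status.GREEN;
        for (Map.Entry<String, Boolean> entry : passedByTest.entrySet()) {
            if (entry.getValue()) {
                continue; // passing tests never change an indicator
            }
            if (unstableTests.contains(entry.getKey())) {
                unstable = Status.RED;
            } else {
                stable = Status.RED;
            }
        }
        return new Status[] {stable, unstable};
    }

    public static void main(String[] args) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("test_batchnorm_training", false);
        results.put("test_example_stable", true);
        Set<String> unstable = new HashSet<>(Arrays.asList("test_batchnorm_training"));
        Status[] status = summarize(results, unstable);
        System.out.println("stable=" + status[0] + " unstable=" + status[1]);
    }
}
```

With this split, a failure in a known-flaky test no longer hides whether the PR broke anything in the stable set.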

Jörn

On Wed, Sep 27, 2017 at 8:12 PM, Gautam  wrote:
> Hi Chris,
>
> There could be possibility that someone might have merged the changes
> without having all checks running, or overlooked the result. Currently we
> don't have any mechanism where git farm can reject such merge where CI/Unit
> test fails. I will try to explore any such possibilities. Meanwhile I would
> rather disable those fail tests with a git hub issue so that build can be
> clean for now. We can revisit those issue later.
>
>
>
>
> On Wed, Sep 27, 2017 at 11:02 AM, kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
>
>> I share in your frustration Chris. I've also spent a fair amount of time in
>> the past few days digging through console logs to try and see if there was
>> anything actionable.  I haven't noticed any tests that were failing
>> consistently, maybe you can post an issue with some specific tests?  For me
>> the larger issue is Sanity Check failures and segfaults at the end of test
>> runs (after all tests pass).  I'm assuming that people are working on the
>> Sanity Check issues.  If there's anything external contributors can do
>> please let us know.
>>
>> On Wed, Sep 27, 2017 at 5:48 PM, Chris Olivier wrote:
>>
>> > By the way, I am not referring to a few tests that are known to fail 1%-10%
>> > or so of the time (ie test_batchnorm_training) and are being actively
>> > worked on. I am referring to tests that fail 100% of the time and are still
>> > merged into master, and thus propagate to all branches when sync'd from
>> > master.
>> >
>> > On Wed, Sep 27, 2017 at 8:43 AM, Chris Olivier wrote:
>> >
>> > > How are so many broken unit tests getting into master?  Is stuff being
>> > > merged without passing CI/unit testing?  I have been trying to get three
>> > > PR's to build for over a week now.  Each time it's some broken test or
>> > > another that has nothing to do with my code changes.  It's extremely
>> > > frustrating -- I waste whole days on this, trying to figure out why my code
>> > > is breaking strange things only to realize later it's broken in all
>> > > branches.
>> > >
>> >
>>
>
>
>
> --
> Best Regards,
> Gautam Kumar


Re: What's everyone working on?

2017-09-25 Thread Joern Kottmann
Hello all,

I am working on the Java API and frequently update my jvm-package branch here:
https://github.com/kottmann/mxnet/commits/jvm-package

Currently I focus on NDArray and Symbol/Executor; my short-term goal
is to get the MNIST sample running.

Is anyone interested in helping out?

There are many more APIs that have to be implemented, and we need to
find some way to test effectively in the non-Python APIs.

Jörn


On Mon, Sep 25, 2017 at 8:23 PM, Seb Kiureghian  wrote:
> Hey dev@,
>
> In the spirit of bringing more activity to the mailing lists and growing
> the community, can everyone who is working on MXNet please share what
> you're working on?
>
> I'm working on
> -Redesigning the website
> .
> -Setting up a forum for user support.
>
> Seb Kiureghian


Re: MXNet: Run PR builds on Apache Jenkins only after the commit is reviewed

2017-09-12 Thread Joern Kottmann
I'm not sure how this works with Jenkins, but other CI servers can look
at the commit message and skip the CI run based on certain commands in
it.

This might make sense for small changes such as documentation updates,
half-done PRs, etc.
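
Travis CI, for example, skips a build when the commit message contains [ci skip] or [skip ci]. A minimal Java sketch of such a check follows. It is a hypothetical pre-build filter and not a Jenkins API, and matching the markers case-insensitively is my own choice.

```java
import java.util.Locale;

/** Hypothetical pre-build filter based on the commit message. */
final class CiSkipSketch {

    /** Returns true if the commit message asks for the CI run to be skipped. */
    static boolean shouldSkipCi(String commitMessage) {
        if (commitMessage == null) {
            return false;
        }
        String message = commitMessage.toLowerCase(Locale.ROOT);
        return message.contains("[ci skip]") || message.contains("[skip ci]");
    }

    public static void main(String[] args) {
        System.out.println(shouldSkipCi("Fix typo in install docs [ci skip]"));
        System.out.println(shouldSkipCi("Change operator registration"));
    }
}
```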

Jörn

On Tue, Sep 12, 2017 at 11:17 AM, Larroy, Pedro  wrote:
> Hi
>
> I would like to integrate our CI system for devices to make sure PRs build on 
> ARM / android etc. Who has admin rights on the repository so we can install 
> the necessary hooks to trigger our builds?
>
>
> Kind regards.
> --
>
> Pedro
>
> On 12/09/17 02:50, "Meghna Baijal"  wrote:
>
> Hi All,
> We would like to initiate a change in the way the PR builds are being 
> triggered. At the moment, every time a Pull Request is created, a build gets 
> triggered on Jenkins. Additional builds also get triggered due to changes to 
> the same PR.
> Too many PR builds leads to resource starvation and very long queues and 
> long build times. Hence we would like to add some checks where a human 
> reviewer manually marks it to something like “ok to build” before a PR build 
> is triggered.
>
> Do you think this approach would be helpful and we should move forward 
> with it?
>
> Thanks,
> Meghna Baijal


Re: Jira and/or feature collision

2017-09-11 Thread Joern Kottmann
A good way to coordinate is to use the dev list and/or to state on an
issue that you are interested in it and are working on it.

We have used Jira for years at Apache OpenNLP for all the issues we deal
with. At GitHub we use the Pull Request feature and synchronize all
comments with the corresponding Jira issue.

My opinion is that Jira does much more than we actually use/need, and
is sometimes slightly unresponsive (e.g. it takes 1 or 2 seconds to load
a new page). On the other hand, the GitHub issue tracker is too simple
and is missing a few useful features.

Jörn

On Mon, Sep 11, 2017 at 10:18 AM, Henri Yandell  wrote:
> Yes, Apache JIRA is free to use.
>
> My observations of GitHub are that roadmaps/wishlist features need better
> separation from bug reports. Ideally you want a nice big list of ideas for
> future work, and a list of bug reports and smaller contributions that
> you're always driving down to zero. One way to do that could be to put the
> new features in JIRA, while keeping GitHub for bug reports, not sure if
> that's what you were getting to Chris with the question.
>
> Hen
>
>
> On Sun, Sep 10, 2017 at 9:23 PM, sandeep krishnamurthy <
> sandeep.krishn...@gmail.com> wrote:
>
>> +1
>> Thanks Chris for bringing up this important topic.
>>
>> I would really like to prioritize this topic and request users and mentors
>> to come up a process or suggestions on how to:
>> 1. Request for contributions from the community.
>> 2. A community member raising feature requests.
>> 3. A community member ready to contribute a feature or bug fix.
>> 4. A community member actively proposing and driving a big new feature for
>> the project.
>>
>> Projects in Github, Tagging Github issues with Call for Contributions may
>> seem very straight forward approach. But, is there any other suggestions or
>> standard practice to drive such efforts?
>>
>> This will go a long way in keeping community members informed about what
>> next in the project, how can they be part and how they can set future
>> directions in the project. Also, saving the time and effort in duplication
>> of efforts.
>>
>> Regards,
>> Sandeep
>>
>> On Sat, Sep 9, 2017 at 4:48 AM, Chris Olivier wrote:
>>
>> > Is Apache JIRA free to use? What do most projects use? While it's natural
>> > that some companies have internal priorities which drive their development
>> > plans, how do other Apache projects avoid having the same feature developed
>> > independently by more than one party, because they didn't know the other was
>> > working on it already?  Or coordinate forces (so to speak) on a large
>> > feature or initiative?
>> >
>> > -Chris
>> >
>>
>>
>>
>> --
>> Sandeep Krishnamurthy
>>


Re: [VOTE] Release MXNet version 0.11.0.rc2

2017-08-17 Thread Joern Kottmann
I downloaded the src distribution file and noticed the following things:
- There are many LICENSE files, and the top-level file doesn't contain
all licenses. You should consolidate all those LICENSE files and
place only one at the top level of the source tree; see here for
instructions [1].
- The distribution contains a .DS_Store file
- The distribution contains the .git folder
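
The last two points are easy to check automatically before a release candidate goes out. Below is a minimal Java sketch that walks an unpacked source tree and reports entries named .DS_Store or .git. The class name and the choice of names to flag are assumptions for illustration; it is not part of any existing release tooling.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical pre-release check for files that should not be shipped. */
final class ReleaseTreeCheckSketch {

    private static final Set<String> UNWANTED =
            new HashSet<>(Arrays.asList(".DS_Store", ".git"));

    /** Returns every file or directory under root whose name is on the unwanted list. */
    static List<Path> findUnwanted(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(p -> p.getFileName() != null
                            && UNWANTED.contains(p.getFileName().toString()))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the unpacked distribution directory as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findUnwanted(root)) {
            System.out.println("unwanted entry: " + p);
        }
    }
}
```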

Jörn

[1] http://www.apache.org/dev/licensing-howto.html#assembling-license-and-notice

On Thu, Aug 17, 2017 at 10:11 AM, Joern Kottmann  wrote:
> The release tag can be found here (and should be included in the vote mail):
> https://github.com/apache/incubator-mxnet/tree/0.11.0.rc2
>
> Jörn
>
> On Thu, Aug 17, 2017 at 2:38 AM, Meghna Baijal
>  wrote:
>> This is the vote to release Apache MXNet (incubating) version 0.11.0.
>> Voting will start now (Thursday, August 17, 2017 12:37 AM UTC) and
>> close Monday, August 21, 2017 12:37 AM UTC.
>>
>> Link to release notes:
>> https://cwiki.apache.org/confluence/display/MXNET/v0.11.0+Release+Notes 
>>
>> Link to release candidate 0.11.0.rc2:
>> https://dist.apache.org/repos/dist/dev/incubator/mxnet/0.11.0.rc2/ 
>>
>> View this page and scroll down to “Build from Source” to build this project:
>> http://mxnet.incubator.apache.org/get_started/install.html 
>>
>> Changes between rc1 and rc2:
>> 1. Remove WaitToRead in dist-kvstore
>>
>> Major Features in v0.11:
>> 1. CoreML Converter: 
>> https://github.com/apache/incubator-mxnet/blob/master/tools/coreml/README.md 
>> 2. Keras 1.2.2 Support: https://github.com/dmlc/keras/wiki/Installation 
>>
>>
>>
>> Please make sure you TEST before you vote accordingly:
>>
>> +1 = approve
>>
>> +0 = no opinion
>>
>> -1 = disapprove (provide reason)


Re: [VOTE] Release MXNet version 0.11.0.rc2

2017-08-17 Thread Joern Kottmann
The release tag can be found here (and should be included in the vote mail):
https://github.com/apache/incubator-mxnet/tree/0.11.0.rc2

Jörn

On Thu, Aug 17, 2017 at 2:38 AM, Meghna Baijal
 wrote:
> This is the vote to release Apache MXNet (incubating) version 0.11.0.
> Voting will start now (Thursday, August 17, 2017 12:37 AM UTC) and
> close Monday, August 21, 2017 12:37 AM UTC.
>
> Link to release notes:
> https://cwiki.apache.org/confluence/display/MXNET/v0.11.0+Release+Notes 
>
> Link to release candidate 0.11.0.rc2:
> https://dist.apache.org/repos/dist/dev/incubator/mxnet/0.11.0.rc2/ 
>
> View this page and scroll down to “Build from Source” to build this project:
> http://mxnet.incubator.apache.org/get_started/install.html 
>
> Changes between rc1 and rc2:
> 1. Remove WaitToRead in dist-kvstore
>
> Major Features in v0.11:
> 1. CoreML Converter: 
> https://github.com/apache/incubator-mxnet/blob/master/tools/coreml/README.md 
> 2. Keras 1.2.2 Support: https://github.com/dmlc/keras/wiki/Installation 
>
>
>
> Please make sure you TEST before you vote accordingly:
>
> +1 = approve
>
> +0 = no opinion
>
> -1 = disapprove (provide reason)


Re: Java API for MXNet

2017-08-16 Thread Joern Kottmann
By Java API I mean a set of classes I can use from Java. I tried
this with the current Scala API but wasn't very successful. If you
know a bit about Scala internals you can probably figure it all out,
but this makes it rather unpleasant to use. You don't necessarily need
to write Java code to build a Java API; you can also write Scala code
and stick to certain rules to make it callable from Java code
without magic tricks.

So yeah, maybe we should just take a look at the Scala API, come up
with a list of things that are difficult when used from Java code and
see how it can be improved. That approach probably at least gives you
the advantages mentioned here before, quick to do, no duplication,
etc.

Afterwards we could still work on an approach for Java which goes
beyond "build a Scala API wrapper".

If you look at the quick wins, maybe a good approach would just be to
give the following advice to people who need to access MXNet from Java
code:
- Integrate MXNet with custom Scala code
- Use a Maven/Gradle build to create a module from that integration
which can be called from your Java code
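
Seen from the Java side, such a module could expose a small contract that uses only Java types, so that no Scala collections or implicits leak into the calling code. The sketch below is purely hypothetical: the interface, its method and the stub are invented for illustration, and the stub stands in for an implementation that would live in the Scala module.

```java
/** Runs the sketch without MXNet or Scala on the classpath. */
final class FacadeDemo {
    public static void main(String[] args) {
        try (DigitClassifier classifier = new LargestValueStub()) {
            System.out.println(classifier.classify(new float[] {0.1f, 0.9f, 0.3f}));
        }
    }
}

/** Hypothetical Java-facing contract; a Scala class in the module could implement it. */
interface DigitClassifier extends AutoCloseable {

    /** Takes a plain Java array rather than a Scala Seq or an NDArray. */
    int classify(float[] input);

    /** Releases native resources; declared without a checked exception. */
    @Override
    void close();
}

/** Pure-Java stub: returns the index of the largest input value instead of running a model. */
final class LargestValueStub implements DigitClassifier {
    private boolean closed;

    @Override
    public int classify(float[] input) {
        if (closed) {
            throw new IllegalStateException("classifier is closed");
        }
        int best = 0;
        for (int i = 1; i < input.length; i++) {
            if (input[i] > input[best]) {
                best = i;
            }
        }
        return best;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

Extending AutoCloseable lets Java callers use try-with-resources, which fits the explicit dispose() requirement of the native arrays.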

Jörn



On Wed, Aug 16, 2017 at 10:02 PM, Nan Zhu  wrote:
> Hi, Joern,
>
> when you say "Java API " it's sharing scala impl or not?
>
> Best,
>
> Nan
>
> On Wed, Aug 16, 2017 at 12:46 PM, Joern Kottmann  wrote:
>
>> Seems like we all agree on the idea to add a Java API.
>>
>> Maybe it is just me, but it wouldn't at all make sense for me (OpenNLP
>> use case) to use the Java API when it requires a Scala dependency,
>> because at that point I would be better off just using the Scala API
>> and ensuring that the things I build are compatible with Java.
>>
>> So if I don't want to add Scala as a dependency then I am better off
>> building something on top of a generated JNI layer. As far as I can
>> tell from my tests with the scala-package you can get quite far with
>> MXNet using NDArray and the Symbol API.
>>
>> Maybe we could work on this from two sides as described by Pracheer.
>> If we have a well defined Java API you could look at the work I have
>> done by then and see how it can be plugged in or what can be learnt
>> from it.
>>
>> Jörn
>>
>> On Wed, Aug 16, 2017 at 9:05 PM, Nan Zhu  wrote:
>> > +1 for Sandeep's suggestion
>> >
>> > On Wed, Aug 16, 2017 at 11:21 AM, YiZhi Liu  wrote:
>> >
>> >> Agree with Sandeep, while I guess the performance won't change. But
>> >> yes, benchmark talks.
>> >>
>> >> Moreover, in Scala package we use macros to generate operators
>> >> automatically, which will require more efforts if we switch to pure
>> >> Java.
>> >>
>> >> 2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <
>> >> sandeep.krishn...@gmail.com>:
>> >> > The fastest way to get Java binding is through building Java native
>> >> > wrappers on Scala package.
>> >> > Disadvantages would be:
>> >> > * *Bloated library size:* May not be suitable for users planning to use
>> >> > Java APIs in Android of such smaller systems.
>> >> > * *Performance:* Performance may not be as good as building directly
>> >> > over JNI and implementing ground up. For example, taking NDArray dimensions
>> >> > as Java ArrayList then converting it to Scala Seq to adapt for Scala
>> >> > NDArray API and more such adapters.
>> >> >
>> >> > However, building ground up from JNI would be a huge effort without
>> >> > actually getting feedback from users early.
>> >> >
>> >> > *My Plan:*
>> >> > 1. Build Java interface on top of Scala package.
>> >> > 2. Get early feedback from users. It may turn out Java is not a great
>> >> > candidate for DL training jobs.
>> >> > 3. Solidify the interface (APIs) for Java users.
>> >> > 4. Do performance benchmarks to see Scala Native / Java interface. This
>> >> > gives us comparable numbers on performance in Java.
>> >> > 5. Over a period of time replace underlying Scala usage with JNI base and
>> >> > native Java implementation. Provided feedback from users is positive.
>> >> >
>> >> > Comments/Suggestion?
>> >> >
>> >> > Regards,
>> >> > Sandeep
>> >> >
>> >> >
>> >> > On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu 
>> wrote:
>>

Re: Java API for MXNet

2017-08-16 Thread Joern Kottmann
Seems like we all agree on the idea to add a Java API.

Maybe it is just me, but it wouldn't at all make sense for me (OpenNLP
use case) to use the Java API when it requires a Scala dependency,
because at that point I would be better off just using the Scala API
and ensuring that the things I build are compatible with Java.

So if I don't want to add Scala as a dependency then I am better off
building something on top of a generated JNI layer. As far as I can
tell from my tests with the scala-package you can get quite far with
MXNet using NDArray and the Symbol API.

Maybe we could work on this from two sides as described by Pracheer.
If we have a well defined Java API you could look at the work I have
done by then and see how it can be plugged in or what can be learnt
from it.

Jörn

On Wed, Aug 16, 2017 at 9:05 PM, Nan Zhu  wrote:
> +1 for Sandeep's suggestion
>
> On Wed, Aug 16, 2017 at 11:21 AM, YiZhi Liu  wrote:
>
>> Agree with Sandeep, while I guess the performance won't change. But
>> yes, benchmark talks.
>>
>> Moreover, in Scala package we use macros to generate operators
>> automatically, which will require more efforts if we switch to pure
>> Java.
>>
>> 2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <
>> sandeep.krishn...@gmail.com>:
>> > The fastest way to get Java binding is through building Java native
>> > wrappers on Scala package.
>> > Disadvantages would be:
>> > * *Bloated library size:* May not be suitable for users planning to use
>> > Java APIs in Android of such smaller systems.
>> > * *Performance:* Performance may not be as good as building directly
>> > over JNI and implementing ground up. For example, taking NDArray dimensions
>> > as Java ArrayList then converting it to Scala Seq to adapt for Scala
>> > NDArray API and more such adapters.
>> >
>> > However, building ground up from JNI would be a huge effort without
>> > actually getting feedback from users early.
>> >
>> > *My Plan:*
>> > 1. Build Java interface on top of Scala package.
>> > 2. Get early feedback from users. It may turn out Java is not a great
>> > candidate for DL training jobs.
>> > 3. Solidify the interface (APIs) for Java users.
>> > 4. Do performance benchmarks to see Scala Native / Java interface. This
>> > gives us comparable numbers on performance in Java.
>> > 5. Over a period of time replace underlying Scala usage with JNI base and
>> > native Java implementation. Provided feedback from users is positive.
>> >
>> > Comments/Suggestion?
>> >
>> > Regards,
>> > Sandeep
>> >
>> >
>> > On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu  wrote:
>> >
>> >> What Nan and I worried about is the re-implementation of something like
>> >> https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Model.scala#L246,
>> >> and the executorManager, NDArray, KVStore ... it uses.
>> >>
>> >> the C API stays at the very low level. If this is the purpose, we can
>> >> simply move ml.dmlc.mxnet.LibInfo to 'java' folder and compile without
>> >> scala, no need to introduce JavaCPP. But I don't think this is what
>> >> users want.
>> >>
>> >> 2017-08-17 1:41 GMT+08:00 Joern Kottmann :
>> >> > There will be a new scala version one day, and the story we had with
>> >> > going from 2.10 to 2.11 might just repeat. In the end if you make a
>> >> > dependency using scala you just end up making it for the currently
>> >> > popular scala versions. And that might be ok for projects with
>> >> > developers who are familiar with these issues, but it is not ok for
>> >> > java projects, where people might not expect it or know about these
>> >> > problems. It just makes it harder to use.
>> >> >
>> >> > To me it looks like the C API is very stable and used by all/most
>> >> > other APIs. If we have a Java API - accessing the C API via JavaCPP -
>> >> > then we should end up with a pretty stable solution, and a lot of the code
>> >> > that is duplicated with the Scala API is the generated code.
>> >> >
>> >> > I think we should explore this possible way of implementing it with a
>> >> > proof-of-concept.
>> >> >
>> >> > And if we have a well made Java API it might be something which maybe

Re: Java API for MXNet

2017-08-16 Thread Joern Kottmann
There will be a new Scala version one day, and the story we had with
going from 2.10 to 2.11 might just repeat. In the end, if you build a
dependency using Scala you just end up building it for the currently
popular Scala versions. That might be ok for projects with
developers who are familiar with these issues, but it is not ok for
Java projects, where people might not expect it or know about these
problems. It just makes it harder to use.

To me it looks like the C API is very stable and used by all/most
other APIs. If we have a Java API - accessing the C API via JavaCPP -
then we should end up with a pretty stable solution, and a lot of the code
that is duplicated with the Scala API is the generated code.

I think we should explore this possible way of implementing it with a
proof-of-concept.

And if we have a well made Java API it might be something which maybe
wouldn't need a lot of additions to be pleasurable to use from scala.

Jörn

On Wed, Aug 16, 2017 at 6:45 PM, Nan Zhu  wrote:
> I don't think there will be problems under "11", did the user see concrete
> errors?
>
> Best,
>
> Nan
>
>
>
> On Wed, Aug 16, 2017 at 9:30 AM, YiZhi Liu  wrote:
>
>> Hi Nan,
>>
>> Users have 2.11, but with a different minor version, will it cause
>> conflicts?
>>
>> 2017-08-17 0:19 GMT+08:00 Nan Zhu :
>> > Hi, Yizhi,
>> >
>> > You mean users have 2.10 env while we assemble 2.11 in it?
>> >
>> > Best,
>> >
>> > Nan
>> >
>> > On Wed, Aug 16, 2017 at 9:08 AM, YiZhi Liu  wrote:
>> >
>> >> Hi Joern,
>> >>
>> >> The point is that the frontend is not a simple wrapper around c_api.h
>> >> of the kind you mentioned, which could easily be produced with JavaCPP.
>> >>
>> >> I have noticed the potential conflicts between the assembled Scala
>> >> library and the one in users' environment. Can we remove the Scala
>> >> library from the assembly jar? @Nan It wouldn't be a problem since the
>> >> Scala libraries with the same major version are compatible.
>> >>
>> >> 2017-08-16 23:49 GMT+08:00 Joern Kottmann :
>> >> > Hello,
>> >> >
>> >> > I personally had quite a few issues with Scala dependencies in
>> >> > different versions together with Spark, where one version is not
>> >> > compatible with the other. You then need to debug the dependency tree
>> >> > to find the places where the versions don't match. Every project that
>> >> > would like to use MXNet then has to depend on Scala and might also get
>> >> > conflicts if other dependencies depend on different Scala versions.
>> >> > That will probably cause issues for some of your users.
>> >> > Users who want to use Java might not be familiar with Scala dependency
>> >> > problems and may have a hard time resolving them when all they get are
>> >> > strange error messages.
>> >> >
>> >> > The JNI layer could be generated with JavaCPP; then we would not need
>> >> > to write and maintain the C and JVM sides of it ourselves.
>> >> > A good example of JavaCPP and Scala usage is Apache Mahout [1].
>> >> >
>> >> > Even if we don't use JavaCPP, it should be easy to get the JNI layer
>> >> > into a state where both APIs can share it. The LibInfo classes of the
>> >> > current Scala JNI layer could be converted to Java classes, which would
>> >> > in most cases require only minor changes in the Scala code.
>> >> >
>> >> > Jörn
>> >> >
>> >> > [1] https://github.com/apache/mahout/tree/master/viennacl/src/main
>> >> >
>> >> > On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu  wrote:
>> >> >> I agree with Yizhi
>> >> >>
>> >> >> My major concern is the duplicate implementations, which are usually one of
>> >> >> the major sources of bugs, especially with two languages which are
>> >> >> naturally interactive (OK, calling Scala from Java might need some more
>> >> >> effort). It is just like providing the C++ and C APIs of MXNet in two
>> >> >> separate packages.
>> >> >>
>> >> >> About the dependency problem, when you say "As far as I see this has the great
>> >> >> disadvantage that the Java API would force Scala as a dependency onto the
>> >> >> java users.", would you please give a concrete example causing critical
>> >> >> issues?

Re: Java API for MXNet

2017-08-16 Thread Joern Kottmann
Hello,

I personally had quite a few issues with Scala dependencies in
different versions together with Spark, where one version is not
compatible with the other. You then need to debug the dependency tree
to find the places where the versions don't match. Every project that
would like to use MXNet then has to depend on Scala and might also get
conflicts if other dependencies depend on different Scala versions.
That will probably cause issues for some of your users.
Users who want to use Java might not be familiar with Scala dependency
problems and may have a hard time resolving them when all they get are
strange error messages.

The JNI layer could be generated with JavaCPP; then we would not need
to write and maintain the C and JVM sides of it ourselves.
A good example of JavaCPP and Scala usage is Apache Mahout [1].

Even if we don't use JavaCPP, it should be easy to get the JNI layer
into a state where both APIs can share it. The LibInfo classes of the
current Scala JNI layer could be converted to Java classes, which would
in most cases require only minor changes in the Scala code.
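
As a rough idea of what a shared layer could look like: a plain Java
class that only declares native methods can be called from Java and
Scala alike. The sketch below is an assumption-laden illustration, not
the existing LibInfo; the class name, the method names and the long[]
out-parameter are all hypothetical. It deliberately loads no native
library, so it runs as-is and only lists what it declares:

```java
// Hypothetical sketch: a plain-Java holder for JNI declarations that a Java
// and a Scala frontend could both call. All names are illustrative only.
final class SharedLibInfo {

    // Each declaration stands for one C function and returns its status code.
    native int mxNDArrayCreateNone(long[] outHandle);
    native int mxNDArrayFree(long handle);
    native String mxGetLastError();

    // Loads the native library from an explicit path; nothing loads implicitly.
    static void load(String absolutePath) {
        System.load(absolutePath);
    }

    // Lists the declared native methods; works without the native library
    // because native methods are only linked when they are first called.
    static java.util.List<String> nativeMethodNames() {
        java.util.List<String> names = new java.util.ArrayList<>();
        for (java.lang.reflect.Method m : SharedLibInfo.class.getDeclaredMethods()) {
            if (java.lang.reflect.Modifier.isNative(m.getModifiers())) {
                names.add(m.getName());
            }
        }
        java.util.Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(nativeMethodNames());
    }
}
```

Since Scala calls Java classes directly, the Scala frontend would keep
its public API and only swap the class it delegates to.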

Jörn

[1] https://github.com/apache/mahout/tree/master/viennacl/src/main

On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu  wrote:
> I agree with Yizhi
>
> My major concern is the duplicate implementations, which are usually one of
> the major sources of bugs, especially with two languages which are
> naturally interactive (OK, calling Scala from Java might need some more
> effort). It is just like providing the C++ and C APIs of MXNet in two
> separate packages.
>
> About the dependency problem, when you say "As far as I see this has the great
> disadvantage that the Java API would force Scala as a dependency onto the
> java users.", would you please give a concrete example causing critical
> issues?
>
> Best,
>
> Nan
>
>
>
> On Wed, Aug 16, 2017 at 8:19 AM, YiZhi Liu  wrote:
>
>> Hi,
>>
>> If we build the Java API from the very beginning, i.e. from the JNI
>> part, we have to rewrite the code for training, predict, inferShape,
>> etc. It would be too heavy to maintain a totally new frontend language.
>>
>> As far as I can see, the Scala library dependency would not be a big
>> problem in most cases, unless we are going to use it on embedded
>> devices. Could you illustrate some use cases where you cannot include
>> Scala dependencies?
>>
>> 2017-08-16 22:13 GMT+08:00 Joern Kottmann :
>> > Hello,
>> >
>> > the approach which is taken by Spark is described here [1].
>> >
>> > As far as I see this has the great disadvantage that the Java API
>> > would force Scala as a dependency onto the java users.
>> > For a library it is always a great advantage if it has few
>> > dependencies, or none at all. In our case it could be quite
>> > realistic to have a thin wrapper around the C API without needing any
>> > other dependencies (or only dependencies which can't be avoided).
>> >
>> > The JNI layer could easily be shared between the Java and Scala APIs.
>> > As far as I understand, the JNI layer in the Scala API is private
>> > anyway, so a change to it wouldn't require changing the public part
>> > of the Scala API.
>> >
>> > What do you think?
>> >
>> > Jörn
>> >
>> > [1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals
>> >
>> > On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu  wrote:
>> >> Hi Joern,
>> >>
>> >> I suggest building the Java API as a wrapper around the Scala API and
>> >> re-using most of the procedures, as the Java API in Apache Spark does.
>> >>
>> >> 2017-08-16 18:21 GMT+08:00 Joern Kottmann :
>> >>> Hello all,
>> >>>
>> >>> I would like to propose the addition of a Java API to MXNet.
>> >>>
>> >>> There has been some previous work done for the Scala API, and it makes
>> >>> sense to at least share the JNI layer between the two.
>> >>>
>> >>> The Java API should probably be aligned with the Python API (and
>> >>> the others which already exist), with a few changes to give it a
>> >>> native Java feel.
>> >>>
>> >>> As far as I understand, there are multiple people interested in
>> >>> working on this, and it would be good to come up with a written
>> >>> proposal on how things should be.
>> >>>
>> >>> My motivation is to get a Java API which can be used by Apache OpenNLP
>> >>> to solve various NLP tasks using Deep Learning based approaches, and I
>> >>> am also interested in working on MXNet.
>> >>>
>> >>> Jörn
>> >>
>> >>
>> >>
>> >> --
>> >> Yizhi Liu
>> >> DMLC member
>> >> Technical Manager
>> >> Qihoo 360 Inc, Shanghai, China
>>
>>
>>
>> --
>> Yizhi Liu
>> DMLC member
>> Technical Manager
>> Qihoo 360 Inc, Shanghai, China
>>


Re: Java API for MXNet

2017-08-16 Thread Joern Kottmann
Hello,

the approach which is taken by Spark is described here [1].

As far as I see this has the great disadvantage that the Java API
would force Scala as a dependency onto the java users.
For a library it is always a great advantage if it has few
dependencies, or none at all. In our case it could be quite
realistic to have a thin wrapper around the C API without needing any
other dependencies (or only dependencies which can't be avoided).

The JNI layer could easily be shared between the Java and Scala APIs.
As far as I understand, the JNI layer in the Scala API is private
anyway, so a change to it wouldn't require changing the public part
of the Scala API.
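
To illustrate the split between a private low-level layer and a thin
public wrapper, here is a minimal sketch. It assumes the low-level
functions report success as status 0 and expose the last error message
separately. Every name in it (LowLevel, NativeArray, FakeLib, ...) is
hypothetical, and FakeLib is an in-memory stand-in so the example runs
without any native library:

```java
// Hypothetical names throughout; this is not the MXNet API.
final class ThinWrapperSketch {

    // The private low-level layer: one method per C function, status code out.
    interface LowLevel {
        int createArray(long[] outHandle);
        int freeArray(long handle);
        String lastError();
    }

    // Unchecked exception carrying the message reported by the low-level layer.
    static final class NativeException extends RuntimeException {
        NativeException(String message) { super(message); }
    }

    // Turns a non-zero status code into an exception.
    static void check(LowLevel lib, int status) {
        if (status != 0) {
            throw new NativeException(lib.lastError());
        }
    }

    // The thin public type: owns one handle and frees it once on close().
    static final class NativeArray implements AutoCloseable {
        private final LowLevel lib;
        private final long handle;
        private boolean closed;

        NativeArray(LowLevel lib) {
            this.lib = lib;
            long[] out = new long[1];
            check(lib, lib.createArray(out));
            this.handle = out[0];
        }

        long handle() { return handle; }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                check(lib, lib.freeArray(handle));
            }
        }
    }

    // In-memory stand-in for the native side; counts live handles.
    static final class FakeLib implements LowLevel {
        private long next = 1;
        int live = 0;
        public int createArray(long[] out) { out[0] = next++; live++; return 0; }
        public int freeArray(long h) { if (live == 0) return -1; live--; return 0; }
        public String lastError() { return "no live arrays"; }
    }

    public static void main(String[] args) {
        FakeLib lib = new FakeLib();
        try (NativeArray a = new NativeArray(lib)) {
            System.out.println("handle=" + a.handle() + " live=" + lib.live);
        }
        System.out.println("live after close=" + lib.live);
    }
}
```

Only NativeArray would be public in such a design; LowLevel could change
(hand-written JNI, JavaCPP, ...) without touching the public part of
either API.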

What do you think?

Jörn

[1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals

On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu  wrote:
> Hi Joern,
>
> I suggest building the Java API as a wrapper around the Scala API and
> re-using most of the procedures, as the Java API in Apache Spark does.
>
> 2017-08-16 18:21 GMT+08:00 Joern Kottmann :
>> Hello all,
>>
>> I would like to propose the addition of a Java API to MXNet.
>>
>> There has been some previous work done for the Scala API, and it makes
>> sense to at least share the JNI layer between the two.
>>
>> The Java API should probably be aligned with the Python API (and
>> the others which already exist), with a few changes to give it a
>> native Java feel.
>>
>> As far as I understand, there are multiple people interested in
>> working on this, and it would be good to come up with a written
>> proposal on how things should be.
>>
>> My motivation is to get a Java API which can be used by Apache OpenNLP
>> to solve various NLP tasks using Deep Learning based approaches, and I
>> am also interested in working on MXNet.
>>
>> Jörn
>
>
>
> --
> Yizhi Liu
> DMLC member
> Technical Manager
> Qihoo 360 Inc, Shanghai, China


Java API for MXNet

2017-08-16 Thread Joern Kottmann
Hello all,

I would like to propose the addition of a Java API to MXNet.

There has been some previous work done for the Scala API, and it makes
sense to at least share the JNI layer between the two.

The Java API should probably be aligned with the Python API (and
the others which already exist), with a few changes to give it a
native Java feel.

As far as I understand, there are multiple people interested in
working on this, and it would be good to come up with a written
proposal on how things should be.

My motivation is to get a Java API which can be used by Apache OpenNLP
to solve various NLP tasks using Deep Learning based approaches, and I
am also interested in working on MXNet.

Jörn