[NOTIFICATION] CI Restart

2020-01-31 Thread Anirudh Subramanian
Hi,

We had to restart the master to mitigate an issue related to Jenkins slaves
being down.
You may have to retrigger some of your in-progress PRs. Apologies for the
inconvenience caused.

Anirudh


[NOTIFICATION] CI Upgrade

2020-01-30 Thread Anirudh Subramanian
Hi,

I had to upgrade the CI to obtain some important security fixes:
https://jenkins.io/security/advisory/2020-01-29/ . You may have to
retrigger some of your in-progress PRs. Apologies for the inconvenience
caused.

Anirudh


Re: [apache/incubator-mxnet] [RFC] MXNet Multithreaded Inference Interface (#16431)

2019-12-05 Thread Anirudh Subramanian
Thanks for the thoughtful and valuable comments @arcadiaphy.

> I've deployed many models with scala API, and run them in multiple threads. 
> The whole system has run smoothly in production environment for more than 2 
> months.

> The backend of inference is graph executor, which is created for each thread 
> with shared model parameters. The executors can be dynamically reshaped in 
> each thread independently according to the shape of the data input.

Yes, if I am not mistaken this is very similar to how the C Predict API 
supports multi-threaded inference today.

> Like what's mentioned above, the dependency engine is not thread safe, so if 
> you run it in threaded engine, dead lock and core dump will happen. 
> Therefore, naive engine is the only option left. Without the dependency 
> scheduling, any write dependency on model parameters is likely to be executed 
> simultaneously and mess the internal data. If mkldnn is used to accelerate 
> inference, you will get non-deterministic results per inference because mxnet 
> stealthily reorder the data in ndarray (write dependency involved) for mkldnn 
> operators. I've used a temporary method to address this issue which is not 
> suitable for an official PR.

This is a very useful point. In my proposal, I was concentrating mostly on 
ThreadedEngine and not NaiveEngine. However, I recently added tests for 
NaiveEngine in my PR and everything seemed to be working fine. So far I have 
not been able to reproduce the correctness issue that you mention with MKLDNN 
(hidden write) and NaiveEngine, but it could be because the Reorder doesn't 
happen in the spawned thread. Here is my test: 
https://github.com/apache/incubator-mxnet/pull/16654/files#diff-1335fbaf3930b1438d9be18edb07a1a6R1384
 . I am not sure whether something changed with MKLDNN 1.0 or my test doesn't 
catch that use case; I will dig more into this. 


> Multithreaded inference should be used with caution. Sharing model parameters 
> can reduce the memory footprint in your program, but a lot of memory usage is 
> consumed by global resources (temporary workspace, random number generator, 
> ...) or op cache for mkldnn which are stored in static thread_local 
> variables. So thread number is the most important factor for memory 
> footprint, any thread involving mxnet operation, be it any trivial imperative 
> invoking of operators, will incur memory overhead by creating its own set of 
> thread_local variables. I've spent so much time tracking down memory leak and 
> the best solution is to limit thread number.

> A new method to do multithreaded inference by threaded engine is much 
> welcomed here. It will solve the above issues automatically and ensure result 
> correctness by enforcing dependency checking.

Yes, the earlier approach, which has one graph executor per thread, may consume 
a lot of memory for global resources. Sharing the cached op will alleviate the 
pain. As you know, we still have a lot of customers using the graph executor as 
the backend. It would be a great addition if you are interested in contributing 
towards making the graph executor thread safe for inference use cases as well.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-mxnet/issues/16431#issuecomment-562335146

Re: [apache/incubator-mxnet] [RFC] MXNet Multithreaded Inference Interface (#16431)

2019-10-23 Thread Anirudh Subramanian
@ptrendx I am trying to open a PR by Friday. On the status: the two 
prerequisite issues https://github.com/dmlc/dmlc-core/pull/573 and 
https://github.com/apache/incubator-mxnet/issues/16434 have been better 
understood and fixed/worked around. I have made the C API and backend changes 
and am currently still testing them. 

Because of time and resource constraints I won't be able to add the CPP 
frontend changes (which have been mentioned in this RFC as targeted for 1.6) 
in this proposal, but only the C API changes, backend changes and 
tests/verification.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-mxnet/issues/16431#issuecomment-545741769

Re: Join the dev community

2019-10-18 Thread Anirudh Subramanian
Hi Akash,

Welcome to the project! https://mxnet.apache.org/community/contribute is a
good place to start.

Anirudh

On Fri, Oct 18, 2019 at 6:37 AM AKASH S M  wrote:

> Hello,
>  I'm Akash S M, an undergraduate from Indian Institute of
> Technology, Roorkee. I'd like to join the developer community and
> contribute to the project.
>
> regards,
> Akash S M
>


Re: [apache/incubator-mxnet] [RFC] MXNet Multithreaded Inference Interface (#16431)

2019-10-10 Thread Anirudh Subramanian
Thanks @marcoabreu!

> Will the new C-API functions be threadsafe in general? Speak, I can invoke 
> them at any point in time from any thread without the need of a lock, 
> sticky-thread or a thread hierarchy? (I'm thinking of the thread-safety being 
> done on the backend level)

The issue I found with C API thread safety, especially for the cached op use 
case, was the ThreadLocalStore. If we fix this issue, then the C APIs related 
to CreateCachedOp and InvokeCachedOp should be thread safe.

>  Will this also support the GPU use-case? Speak, the parameters are only 
> copied into GPU memory once in the same fashion as you're describing for the 
> CPU?

This should still support the single-GPU use case for 1.6. The multi-GPU 
inference use case requires more verification at the cached op level.

> Do you think there's a path forward to make all inference-related C-APIs 
> threadsafe instead of splitting off another execution branch?

I don't think we have such a strict split between inference and training APIs 
at the C API level. For example, for the Gluon cached op we call InvokeCachedOp 
for both training and inference.

But let me rephrase your question as: will I be able to do multi-threaded 
inference from every frontend API which I can use to do inference today?
Right now, I am targeting only Gluon, since most users have been directed 
towards Gluon. The other ways are using Module, Symbolic and the C Predict 
API. Supporting these frontend APIs requires the graph executor to be thread 
safe. This would definitely be a great addition for MXNet, since it would 
ensure that users can do multi-threaded inference from any of these APIs, but 
it is not something I have planned for currently.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-mxnet/issues/16431#issuecomment-540834971

[apache/incubator-mxnet] [RFC] MXNet Multithreaded Inference Interface (#16431)

2019-10-10 Thread Anirudh Subramanian
Thanks to @nswamy for his inputs and design discussions related to this project 
and @frankfliu for explaining the requirements and the use case from a customer 
perspective.

# Problem Statement

One of the big uncatered-for use cases in MXNet is loading a model and running 
parallel inference on it from multiple threads while sharing the parameters. 
There are multiple user requests for this 
[[1]](https://github.com/apache/incubator-mxnet/issues/3946). There has also 
been a lot of confusion around the current state of MXNet with respect to 
thread safety.

This doc attempts to address three things:

1. Clarify the current state of MXNet with respect to thread safety.
2. Give an idea of the benefits to expect from adding this feature.
3. Solve the problem of parallel inference by providing a multi-threaded 
inference API (C APIs and frontend APIs in CPP and Python).

# Current State of MXNet Thread Safety

## MXNet Dependency Engine Thread Safety

Examining the MXNet dependency engine code, it looks like it was designed to be 
thread safe. I tried to push a Convolution op from multiple threads into the 
MXNet engine, to see if there are any issues with thread safety, using the CPP 
package for the same. The script is provided here: 
https://github.com/anirudh2290/mxnet/tree/multithreaded_inference_poc/cpp-package/example/multithreading_engine_push_mxnet_op.cpp

```
./build/cpp-package/example/multithreading_engine_push_mxnet_op 2
```

The script pushes the Convolution op to the engine from multiple threads. You 
can verify the correctness of the op with this script: 
https://github.com/anirudh2290/mxnet/tree/multithreaded_inference_poc/test_cached_op_ts_check.py

```
python3 test_cached_op_ts_check.py
```

## MXNet Graph Executor Thread Safety

I removed the NaiveEngine-only restriction for the C Predict API and tried to 
run multi-threaded inference with the C Predict API using ThreadedEngine by 
commenting out the check: 
https://github.com/anirudh2290/mxnet/tree/multithreaded_inference_poc/src/c_api/c_predict_api.cc

When running this example, the program core dumps with memory leaks in Graph 
Executor Bind. This shows that the graph executor is not thread safe. 

## Cached Op (Gluon Backend) Thread Safety

I tried to create a cached op in the main thread and spawn multiple threads to 
invoke the same cached op inside each of the threads. Here is the script which 
does this: 
https://github.com/anirudh2290/mxnet/tree/multithreaded_inference_poc/cpp-package/example/multithreading_engine_push_cached_op.cpp

```
# Usage
./build/cpp-package/example/multithreading_engine_push_cached_op <num_threads> <context> <device_id>

# Example: uses the cached op available in master
./build/cpp-package/example/multithreading_engine_push_cached_op 20 cpu 0
```

I see multiple failures when I run this: one is in the dmlc ThreadLocalStore 
[[2]](https://github.com/dmlc/dmlc-core/issues/571), the other is in 
MXPlanMemory when retrieving the forward_ref_count attribute. These errors are 
caused by race conditions w.r.t. reading and writing of shared state in 
CachedOp.

# Proposed Solution

### Additions (Prioritized for 1.6)

I propose to add a minimal thread-safe cached op for inference, which will be 
the following:
1. Similar to cached op, except that it supports only inference use cases. 
2. Doesn't support inlining, dynamic shapes, bulking or static alloc. 
3. Uses static thread_local variables for GraphInfo (which maintains the 
fwd_graph state), for buff (which maintains all ndarray states) and for 
op_states. [There is scope for additional optimization here w.r.t. separating 
the buffers for inputs and params.]
4. The above addition means that we can instantiate only one thread-safe 
cached op per process. The frontend API for SymbolBlockThreadSafe needs to be 
a singleton because of this limitation.

### C API Changes (Prioritized for 1.6)

Add a new thread_safe flag for MXCreateCachedOpEx. When set to true, this 
should create a thread-safe cached op instead of a regular cached op.

```
  /*!
   * \brief create cached operator
   */
  MXNET_DLL int MXCreateCachedOpEx(SymbolHandle handle,
                                   int num_flags,
                                   const char** keys,
                                   const char** vals,
                                   CachedOpHandle *out,
                                   bool thread_safe = false);
```

Add a similar thread_safe flag to Invoke and Free to invoke the thread-safe 
cached op versions instead of the default versions. 

```
  /*!
   * \brief invoke a cached op
   * \param handle the handle to the cached op
   * \param num_inputs number of input NDArrays
   * \param inputs input NDArrays
   * \param num_outputs number of output NDArrays
   * \param outputs output NDArrays
   * \param out_stypes output ndarrays' stypes
   * \param thread_safe whether to invoke thread safe version of cached op.
   * \return 0 when success, -1 when failure happens
   */

  
  MXNET_DLL int
```

Re: ONNX Support

2019-10-07 Thread Anirudh Acharya
Hi Sam, Lin and Chaitanya,

I am sorry I am not aware of anyone who is willing to actively maintain the
ONNX module. The last commit was made by https://github.com/vandanavk. I am
not sure how much time vandanavk@ can dedicate to this.

I am okay with whatever the community collectively decides on these tests
(enabling or disabling). The purpose of my previous mail was to let the
community know that there are users of the ONNX module and that there is
some activity regarding code changes in that module.


Thanks
Anirudh


On Mon, Oct 7, 2019 at 1:19 PM Skalicky, Sam 
wrote:

> Hi Chai,
>
> If there is no one maintaining MXNet-ONNX support (or no one currently
> available to help debug issues), then we shouldn’t block forward progress
> because of failing ONNX tests.
>
> It would be great if someone wanted to work with Chai to debug the failing
> tests. But I do not see any forward plans/proposals to continue to develop
> or even just maintain the current ONNX support.
>
> Anirudh, if you can point those who are willing to maintain the ONNX
> support to the issue Chai mentioned that would be a good place to start.
> But if not, we should help Chai continue the great work he’s doing by
> disabling the failing tests (like we normally do for any failing/flaky
> tests already)
>
> Sam
>
> > On Oct 7, 2019, at 12:45 PM, Anirudh Acharya 
> wrote:
> >
> > Hi Chaitanya,
> >
> > The last I checked( a couple of months back) there are a few
> > customers/users of MXNet in Amazon who use ONNX in production.
> >
> > The last commit for ONNX module was on Aug 29th
> > - b7cca015553d707cd1c4ce292826d7311309419c
> >
> > So IMO disabling any of the tests is not a good idea.
> >
> >
> > Thanks
> > Anirudh
> >
> >
> > On Mon, Oct 7, 2019 at 12:27 PM Chaitanya Bapat 
> > wrote:
> >
> >> Hello MXNet community,
> >>
> >> I wanted to know if MXNet should continue support for ONNX. Is there
> anyone
> >> actively working on MXNet ONNX or maintaining it?
> >>
> >> If not, can we skip/disable the ONNX tests from the CI.
> >> Reason - Whilst working on a Transpose operator PR [1], I encountered
> >> failure for ONNX [2]. Given operator passes rest of the CI pipeline
> tests.
> >> I am able to reproduce the error. However, the root cause for ONNX model
> >> failure couldn't be found. Moreover, there seems to be near zero
> activity
> >> as far as PR check-ins are concerned.
> >>
> >> How does ONNX fit in for MXNet going forward?
> >> Thank you
> >> Chai
> >>
> >>
> >> [1] https://github.com/apache/incubator-mxnet/pull/16104
> >> [2]
> >>
> >>
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-16104/14/pipeline
> >>
> >> --
> >> *Chaitanya Prakash Bapat*
> >> *+1 (973) 953-6299*
> >>
> >>
>
>


Re: ONNX Support

2019-10-07 Thread Anirudh Acharya
Hi Chaitanya,

The last I checked (a couple of months back), there are a few
customers/users of MXNet in Amazon who use ONNX in production.

The last commit for ONNX module was on Aug 29th
- b7cca015553d707cd1c4ce292826d7311309419c

So IMO disabling any of the tests is not a good idea.


Thanks
Anirudh


On Mon, Oct 7, 2019 at 12:27 PM Chaitanya Bapat 
wrote:

> Hello MXNet community,
>
> I wanted to know if MXNet should continue support for ONNX. Is there anyone
> actively working on MXNet ONNX or maintaining it?
>
> If not, can we skip/disable the ONNX tests from the CI.
> Reason - Whilst working on a Transpose operator PR [1], I encountered
> failure for ONNX [2]. Given operator passes rest of the CI pipeline tests.
> I am able to reproduce the error. However, the root cause for ONNX model
> failure couldn't be found. Moreover, there seems to be near zero activity
> as far as PR check-ins are concerned.
>
> How does ONNX fit in for MXNet going forward?
> Thank you
> Chai
>
>
> [1] https://github.com/apache/incubator-mxnet/pull/16104
> [2]
>
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-16104/14/pipeline
>
> --
> *Chaitanya Prakash Bapat*
> *+1 (973) 953-6299*
>
>


Re: [Announcement] New Committer - Anirudh Acharya

2019-09-29 Thread Anirudh Acharya
Thank you, everyone.

-
Anirudh

On Sun, Sep 29, 2019 at 4:27 AM Kshitij Kalambarkar <
kshitijkalambar...@gmail.com> wrote:

> Congrats Anirudh!
>
> On Sun, Sep 29, 2019, 11:13 Sheng Zha  wrote:
>
> > Congrats! Now Anirudh is officially the most popular name among the MXNet
> > committers :P
> >
> > -sz
> >
> > On 2019/09/27 22:57:22, Furkan KAMACI  wrote:
> > > Hi,
> > >
> > > Congrats Anirudh!
> > >
> > > Kind Regards,
> > > Furkan KAMCI
> > >
> > > On Sat, Sep 28, 2019 at 1:34 AM Marco de Abreu <
> > > marco.g.ab...@gmail.com> wrote:
> > >
> > > > Welcome!
> > > >
> > > >
> > > > On Sat, Sep 28, 2019 at 12:06 AM Chaitanya Bapat <
> chai.ba...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Congratulations Anirudh! Well deserved!
> > > > >
> > > > > On Thu, 26 Sep 2019 at 10:10, Chris Olivier <
> cjolivie...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Please join me in welcoming Anirudh Acharya as a new committer of
> > > > Apache
> > > > > > MXNet (incubating)!
> > > > > >
> > > > > > Anirudh Acharya has been contributing to the MXNet project for a
> > year
> > > > and
> > > > > > half now and has made several improvements to the MXNet R project
> > and
> > > > > > continues to contribute by adding optimizers, fixing tests and
> > actively
> > > > > > providing feedback on the PRs and has good understanding of
> > building
> > > > > > operators in MXNet and architecture in general.
> > > > > >
> > > > > > Welcome, Anirudh!
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Chaitanya Prakash Bapat*
> > > > > *+1 (973) 953-6299*
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: new website, docs code freeze

2019-09-26 Thread Anirudh Acharya
Hi,

In the operator tutorial
(http://mxnet.incubator.apache.org/api/faq/add_op_in_backend), there are
sections that do not render properly, for example the forward function,
backward function and shape inference sections.
backward function and shape inference.


Thanks
Anirudh




On Wed, Sep 25, 2019 at 7:53 AM Aaron Markham 
wrote:

> I'm seeing GA code is -1 not -11 in the analytics admin console. 11
> was for beta.mxnet.io.
> Either way, the Jekyll prod config file was missing the GA code, so I
> added it with this PR:
> https://github.com/apache/incubator-mxnet/pull/16271
>
> Reindexing of the site is being tracked here:
> https://issues.apache.org/jira/browse/INFRA-19144
>
> .htaccess testing was hampered by it not working on staging. This was
> tracked here, and it looks like infra just patched staging so we can
> resume redirect testing:
> https://issues.apache.org/jira/browse/INFRA-19075
> I have a CI pipeline for beta testing. If anyone wants to contribute
> to working on the redirects, you can use this pipeline to publish to
> the beta staging site.
> http://jenkins.mxnet-ci.amazon-ml.com/job/restricted-website-publish-beta/
> I've distilled this information in this issue:
> https://github.com/apache/incubator-mxnet/issues/16273
> I'd much rather have another contributor work on this since it will
> teach testing changes on the website, testing CI deployments to
> staging using your fork, previewing on staging, and finally deploying
> it to prod. I'm happy to help & guide along the way.
>
> (echoing Thomas) Please be sure to raise new issues in the repo, so we
> don't lose them in this thread. Also, more people can work on them. It
> would great if others can jump in and get familiar with the new site
> and start contributing patches.
>
> Cheers,
> Aaron
>
> On Wed, Sep 25, 2019 at 3:15 AM Thomas DELTEIL
>  wrote:
> >
> > @Philip Yes we're looking at link redirects for older links that might be
> > hosted externally (using htaccess is my preferred way to handle it for
> now
> > as you suggested) and we'll use a broken link checker to update the links
> > that are hosted internally. We'll update the 404 to add an explanation on
> > the website update. Google indexes will slowly update across the week so
> > the google search issues will be less of a problem.
> >
> > If you find any such links yourself, or missing tutorials, please
> consider
> > stepping up and helping fixing them. The more people get familiar with
> the
> > new website architecture, the least likely it is to fall in a state of
> > stalled updates like the previous one.
> >
> > For the sphinx issues in the python mini-website, missing API classes, if
> > anybody is familiar with it, I'd love for us to bring back the automatic
> > doc generation for each package so at least we have a list of all
> available
> > classes in each sub package rather than relying on manual insertion of
> each
> > class, which is brittle and not future proof. @Lin, Haibin
> >  if you have experience with it, could we sync up
> > offline on how you suggest to do that based on your gluon-nlp experience?
> >
> > @Marco, I'm currently traveling for ICDAR in Sydney, and Aaron is on PTO
> in
> > Europe, I'll try make time today to help with the fixes since it is
> > impacting a lot of users.
> >
> > In the meanwhile, any help is appreciated, and more than the value of the
> > fixes, let me repeat that there is tremendous value in having more people
> > familiar with the website build pipelines. Aaron is the main owner for
> the
> > docs but he is already super busy with all his other responsibilities.
> I'm
> > available to help if anybody is stuck. I believe Aaron has updated the
> > READMEs on how to test the websites locally, if they're not clear, feel
> > free to contribute your own explanations or ask for help directly to me
> by
> > email or on the discuss forum.
> >
> > Good hunting!
> >
> > Thomas
> >
> >
> >
> > Le mer. 25 sept. 2019 à 10:10, Marco de Abreu 
> a
> > écrit :
> >
> > > Good catch, Mu! Also good idea, Philip!
> > >
> > > Aaron and Thomas, are you going to work on this?
> > >
> > > -Marco
> > >
> > > On Wed, Sep 25, 2019 at 1:28 AM Mu Li  wrote:
> > >
> > > > The questions I found are:
> > > >
> > > > 1. Not every page contains it, especially the homepage:
> > > >
> > > >
> > >
> http://mxnet.incubator.apache.org/api/python/docs/_static/google_analytics.js
> > > > 2. The correct tracking id is UA-96378503-1 instead of
> UA-96378503-11 in
>

Re: mxnet ctrl-c

2019-09-23 Thread Anirudh Subramanian
That would be great. Thanks Chris!

On Mon, Sep 23, 2019 at 12:11 PM Chris Olivier 
wrote:

> thanks for the response. i trying to write to a “gpu” (sort of) with mxnet
> and sometimes takes a long time and having no way to interrupt it
> gracefully is “bad”. I will try to experiment with chaining back to the
> python through normal signal channels. if i can get it to work i’ll post a
> PR.
>
> On Mon, Sep 23, 2019 at 12:00 PM Anirudh Subramanian <
> anirudh2...@gmail.com>
> wrote:
>
> > Currently I don't see any special handling in the code base for this. We
> > have atexit.register which invokes MXInvokeShutdown from python but that
> > doesn't work for signals.
> >
> > Anirudh
> >
> > On Sun, Sep 22, 2019 at 7:30 PM Chris Olivier 
> > wrote:
> >
> > > question: how does gluon handle ctrl-c during a “long” imperative
> > operation
> > > where the GLI hasn’t been released yet? is it supposed to be caught in
> > c++
> > > or python or no special handling for it at the moment?
> > >
> >
>


Re: [DISCUSS] Assigning Issues

2019-09-12 Thread Anirudh Subramanian
+1

On Thu, Sep 12, 2019 at 1:15 PM Zach Kimberg 
wrote:

> We had a discussion a while back about trying to improve the way we handle
> issues by assigning them to users who are working on them. However, the
> discussion ended because issues could only be assigned to those with write
> access (committers).
>
> I just came across a new Github feature where issues can also be assigned
> to any user who comments on an issue [
> https://github.blog/2019-06-25-assign-issues-to-issue-commenters/].
> Committers can then assign anyone from the community who wants to work on
> the issue so we can track which issues are assigned and which ones are not.
> Assigned community members still have an "Unassign me" button if they no
> longer wish to work on an issue. It is also possible to assign up to 10
> people to an issue (or PR).
>
> Given this, I think we should try to assign issues when possible to those
> working on them. What does everyone think?
>
> Zach
>


Re: [DISCUSS] Remove amalgamation

2019-09-10 Thread Anirudh Subramanian
Hi Pedro,

I don't see anything "destructive" with Chris asking for justification for
you calling something "hacky". The only email in this thread where I see ad
hominems and disrespectful comments is your email.

On Sat, Sep 7, 2019, 10:18 PM Pedro Larroy 
wrote:

> Apache mentors should have a look at these reincident harassment and
> destructive behaviors which demotivate contributions and take action. It
> takes only one bad apple to ruin a community.
>
> The mobile solution that is known to work as of know is cross compiling
> with "ci/build.py -p build.android_armv8" or "build.android_armv7". The
> only advantage of amalgamation is to provide a smaller binary that we could
> accomplish with the C preprocessor.
>
> My technical contributions speak for themselves, including porting MXNet to
> Android and ARM and helping many users run MXNet in Jetson, Raspberry Pi
> and Android amongst many other topics. I have never been disrespectful to
> anyone. I'm entitled to my own technical opinions about amalgamation or any
> other piece of code whatsoever, that's no personal disrespect to anyone and
> perfectly valid. If you are not interested in this project anymore, do us
> all a favor and stop trolling and being toxic. If you want my respect, step
> up your technical contributions, be positive and encourage others, this
> including commits, I haven't seen for many months, please be positive and
> constructive. This scorched-earth attitude is only reflecting bad on you.
> I'm certainly not interested in your ad-hominems or unasked for technical
> advice, which to be honest,  showing poor judgment and ignorance. Myself
> and others have come up with numbers, graphs, metrics and arguments and
> have been met with dismissal, trolling and sea-lioning. I have recieved
> your insults via public and private channels (such as linkedin) as have
> others. This is not ok and has to stop. If you have something personal
> against me or against your former employer, this is not the right place or
> forum.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Sep 6, 2019 at 3:56 PM Chris Olivier 
> wrote:
>
> > Hi Pedro,
> >
> > While I was not involved with amalgamation or its development in any way,
> > can you please refrain from referring to the work of others as a "hacky
> > solution"?  This is derogatory slang and the statement was not supported
> > with any justification for such name-calling.  Someone spent a good deal
> of
> > time on this solution at some point in time and I am sure it worked for
> its
> > purpose at that time -- I think it was used in the original javascript
> port
> > as well, actually -- and it is disrespectful to call their efforts
> > "hacky".  Please respect what came before.
> >
> > Thanks for understanding,
> >
> > -Chris
> >
> >
> > On Fri, Sep 6, 2019 at 3:07 PM Pedro Larroy <
> pedro.larroy.li...@gmail.com>
> > wrote:
> >
> > > Hi
> > >
> > > I would like to propose to remove amalgamation from MXNet and CI, users
> > > have reported that they couldn't use it successfully in Android, and
> > > instead they were able to use the cross compiled docker build
> > successfully.
> > >
> > > Any reason why we shouldn't remove this hacky solution?
> > >
> > > Pedro.
> > >
> >
>


Re: [Discuss] MXNet Python 2 Support Deprecation

2019-07-18 Thread Anirudh Acharya
+1

On Thu, Jul 18, 2019 at 11:03 AM Marco de Abreu 
wrote:

> +1
>
> -Marco
>
> Sheng Zha  schrieb am Do., 18. Juli 2019, 19:59:
>
> > Dear MXNet community,
> >
> > I'd like to reopen the discussion on deprecating python2 support. This
> > would help modernize the design and engineering practice in MXNet to help
> > improve speed and quality.
> >
> > For this purpose, I reopened the issue on this here:
> > https://github.com/apache/incubator-mxnet/issues/8703
> >
> > If the consensus is towards the direction of dropping python2 support, I
> > suggest we announce our plan to drop python2 support in the next release,
> > and actually drop the support in the next major version. Thanks.
> >
> > -sz
> >
>


Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc1

2019-06-20 Thread Anirudh Subramanian
Hi Lai,

I have opened an issue:
https://github.com/apache/incubator-mxnet/issues/15297
I came to know about this issue only today and I have not been monitoring
sockeye.
I jumped onto this issue to make sure it wasn't caused by the dlpack
changes.
Also, I don't think the sockeye CI checks against master; it is using 1.4.1.

Anirudh


On Thu, Jun 20, 2019 at 6:17 PM Lai Wei  wrote:

> Hi,
>
> Could you share which test failed and what’s the crash? How to reproduce
> it?
>
> I was able to install sockeye and run all tests passed. Using
> python setup.py test
>
> I have tested both nightly pip package and 1.5.0.rc1
>
> It would be great to create an issue with reproducible steps and move the
> discussion there.
>
> Also I see sockeye nightly build[1] has been failing for some time, if it’s
> due to MXNet change, please raise this early so we can track and solve it
> in time rather than block the release during vote time.
>
> [1] https://travis-ci.org/awslabs/sockeye
>
>
> On Fri, Jun 21, 2019 at 7:01 AM Anirudh Subramanian  >
> wrote:
>
> > I was able to reproduce a crash with the commit
> > 09202f7f261954383aa387144524d38f83f18d06 but not with the commit
> > a862270beb2d796c1ba311183f7f4a766a18ad6c.
> >
> > Anirudh
> >
> > On Thu, Jun 20, 2019 at 3:53 PM Lai Wei  wrote:
> >
> > > Hi Przemyslaw,
> > >
> > > Is there an issue with more details to track the problem?
> > >
> > >
> > > On Fri, Jun 21, 2019 at 6:04 AM Przemysław Trędak 
> > > wrote:
> > >
> > > > -1
> > > >
> > > > There is a crash in sockeye unit test (python setup.py test) observed
> > > > starting with nightly 1.5 build from 6/13 and still occuring in
> > 1.5rc1. I
> > > > don't yet have the exact commit that is responsible for it, but it is
> > > > either a862270beb2d796c1ba311183f7f4a766a18ad6c (dlpack related) or
> > > > 09202f7f261954383aa387144524d38f83f18d06 (cached op optimization).
> > > >
> > > > On 2019/06/20 06:36:22, Lai Wei  wrote:
> > > > > Dear MXNet community,
> > > > >
> > > > > This is the 3-day vote to release Apache MXNet (incubating) version
> > > > 1.5.0.
> > > > > Voting on dev@ will start June 19, 23:59:59(PST)  and close on
> June
> > > 22,
> > > > > 23:59:59.
> > > > >
> > > > > 1) Link to release notes:
> > > > >
> > https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Notes
> > > > >
> > > > >
> > > > > 2) Link to release candidate:
> > > > >
> > > > > https://github.com/apache/incubator-mxnet/releases/tag/1.5.0.rc1
> > > > >
> > > > >
> > > > > 3) Link to source and signatures on apache dist server:
> > > > >
> > > > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.5.0.rc1/
> > > > >
> > > > >
> > > > > Please remember to TEST first before voting accordingly:
> > > > >
> > > > > +1 = approve
> > > > > +0 = no opinion
> > > > > -1 = disapprove (provide reason)
> > > > > --
> > > > > Best Regards
> > > > >
> > > > > Lai
> > > > >
> > > >
> > > --
> > > Best Regards
> > >
> > > Lai
> > >
> >
> --
> Best Regards
>
> Lai
>


Re: [VOTE] Release Apache MXNet (incubating) version 1.5.0.rc1

2019-06-20 Thread Anirudh Subramanian
I was able to reproduce a crash with the commit
09202f7f261954383aa387144524d38f83f18d06 but not with the commit
a862270beb2d796c1ba311183f7f4a766a18ad6c.

Anirudh

On Thu, Jun 20, 2019 at 3:53 PM Lai Wei  wrote:

> Hi Przemyslaw,
>
> Is there an issue with more details to track the problem?
>
>
> On Fri, Jun 21, 2019 at 6:04 AM Przemysław Trędak 
> wrote:
>
> > -1
> >
> > There is a crash in sockeye unit test (python setup.py test) observed
> > starting with nightly 1.5 build from 6/13 and still occurring in 1.5rc1. I
> > don't yet have the exact commit that is responsible for it, but it is
> > either a862270beb2d796c1ba311183f7f4a766a18ad6c (dlpack related) or
> > 09202f7f261954383aa387144524d38f83f18d06 (cached op optimization).
> >
> > On 2019/06/20 06:36:22, Lai Wei  wrote:
> > > Dear MXNet community,
> > >
> > > This is the 3-day vote to release Apache MXNet (incubating) version
> > 1.5.0.
> > > Voting on dev@ will start June 19, 23:59:59(PST)  and close on June
> 22,
> > > 23:59:59.
> > >
> > > 1) Link to release notes:
> > > https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Notes
> > >
> > >
> > > 2) Link to release candidate:
> > >
> > > https://github.com/apache/incubator-mxnet/releases/tag/1.5.0.rc1
> > >
> > >
> > > 3) Link to source and signatures on apache dist server:
> > >
> > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.5.0.rc1/
> > >
> > >
> > > Please remember to TEST first before voting accordingly:
> > >
> > > +1 = approve
> > > +0 = no opinion
> > > -1 = disapprove (provide reason)
> > > --
> > > Best Regards
> > >
> > > Lai
> > >
> >
> --
> Best Regards
>
> Lai
>


Re: CUDA / CUDNN support revisited

2019-06-18 Thread Anirudh Subramanian
+1, I agree this should be done for both CUDA and cuDNN versions. At most,
CUDA version N and CUDA version N - 1 should be supported in CI.

My question is what happens when we are on CUDA version N and have removed
support for CUDA version N - 1, and shortly afterward Nvidia releases a CUDA
patch version N + 1 in which some perf regressions and some bugs have been
fixed. Should we just move to N + 1, since version N will have all these
issues for users and may also slow us down on CI?

I am facing an issue with CUDA 10 and CUDA 10.1 which also seems to be
causing intermittent CI failures:
https://github.com/apache/incubator-mxnet/issues/15273 . There is already a
PR to bump up Nvidia version to 10.1 (
https://github.com/apache/incubator-mxnet/pull/14986/files).

I think for situations where there is a quick follow-up release like 10.1
and MXNet users are impacted by certain issues, we should just bump up the
version and stop support for 10.0.
I would like to hear more from the Nvidia folks (on this particular case of
CUDA 10.0 vs CUDA 10.1, and what the recommendations are for existing
customers).

Anirudh

On Mon, Jun 3, 2019 at 4:21 PM Dick Carter  wrote:

> Actually, I tried to say that support *doesn't necessarily* include N-1.
> I'm proposing that the supported versions are 1) covered by CI and 2) have
> been available in a usable form long enough that a semi-motivated user has
> been able to transition to it.  That might mean only N (e.g. per my
> proposal, only cuDNN v7).
>
> Regarding precedent for N / N-1,  when a new CUDA version comes out, users
> will transition to it at their own pace, thereby creating a N / N-1 support
> situation for some period.
>
>
> On 2019/06/03 22:43:20, Pedro Larroy 
> wrote:
> > Your proposal of having support for N and N-1 makes a lot of sense to
> > me. Are there use cases for supporting older CUDA versions?
> >
> >
> > Thanks.
> >
> > On Mon, Jun 3, 2019 at 3:06 PM Dick Carter  wrote:
> > >
> > > I'd like to revisit the discussion of:
> https://lists.apache.org/thread.html/27b84e4fc0e0728f2e4ad8b6827d7f996635021a5a4d47b5d3f4dbfb@%3Cdev.mxnet.apache.org%3E
> now that a year has passed.
> > >
> > > My motivation is:
> > >
> > > 1.  There's a lot of hard-to-read  '#if CUDNN_MAJOR' code referencing
> cuDNN versions back as far as v4(!?).  We need to clean this out before it
> hampers our ability to nimbly move the codebase forward.
> > >
> > > 2.  There seems to be a difference of opinion on whether we should be
> supporting version 'N-1' (e.g. cuDNN6).  Our current MXNet 1.5 candidate
> does not compile against cuDNN v6, so this should be either fixed or be
> up-front stated to the user community.  The breaking PR was
> https://github.com/apache/incubator-mxnet/pull/14476.
> > >
> > > Having read the prior discussion, my take on it is:
> > >
> > > - Users should be given an ample time period (1 year?) to move to a
> new CUDA/cuDNN version once it becomes 'usable.'
> > >
> > > - We should not claim to support a given version if it is no longer
> part of the MXNet CI.  Users should be warned of an impending dropping of
> this 'testing support.'
> > >
> > > So these statements do not necessarily promise 'N-1' support.  I could
> see a transitioning of the CI from CUDA9-only -> CUDA9&10 -> CUDA10 only.
> Some period before CUDA9 is dropped from CI, the user community is warned.
> After that time, CUDA10 might be the only version tested by CI, and hence
> the only version supported (until the next CUDA version came around).
> > >
> > > Let me propose as a 'strawman' that we claim to support CUDA version 9
> and 10, with cuDNN version 7 only.  Those versions have been out for over
> 1.5 years.  So no CUDA 8 or cuDNN v6 support - both over 1.5 years old with no
> coverage by our CI.
> > >
> > > -Dick
> >
>
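Dick's two-part criterion above (covered by CI, and generally available long enough for users to migrate) can be restated as a small, purely illustrative check. This is not actual MXNet tooling; the release dates below are approximate and the one-year window is only the example figure from the thread.

```python
from datetime import date, timedelta

# Illustrative only: a version is "supported" iff it is covered by CI and
# has been generally available long enough for users to migrate to it.
def supported_versions(releases, ci_covered, today, min_age=timedelta(days=365)):
    """releases: dict mapping version tuple -> general-availability date."""
    return sorted(v for v, released in releases.items()
                  if v in ci_covered and today - released >= min_age)

# Approximate GA dates, for illustration.
cuda_releases = {(8, 0): date(2016, 9, 27),
                 (9, 0): date(2017, 9, 25),
                 (10, 0): date(2018, 9, 19)}

# CUDA 8 is dropped (no CI coverage); 9 and 10 both pass the age test here.
print(supported_versions(cuda_releases, ci_covered={(9, 0), (10, 0)},
                         today=date(2019, 10, 1)))  # → [(9, 0), (10, 0)]
```

Under this sketch, dropping N - 1 is simply a matter of removing it from the CI set once the warning period has elapsed.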


Re: Making new operators and AMP lists

2019-05-30 Thread Anirudh Subramanian
> HOWEVER, as I was writing this reply I realized that due to pure luck
this is not actually what happens - optimizers could in fact be in the
FP32_FUNCS list. That is because, as AMP's assumption is that the model
being changed is FP32 model at the start, all weights (and so all the
gradients just before application as well) are stored in FP32 already (and
so the casting would not actually occur). It would therefore be possible to
make the FP32 list to be implicit (although relying on this lucky
coincidence makes me slightly uncomfortable).

This is under the assumption that customers use the default lists. If
customers use their own lists, it can happen that some weights have to be in
FP16 and they may have to do some heavy lifting. My understanding is that
this is the use case that makes you uncomfortable? If not, can you expand?
The common use case will be customers using the default lists, and the latter
use case may be rare.

Anirudh


On Thu, May 30, 2019 at 3:25 PM Sheng Zha  wrote:

> Given that we're switching from the practice of failing the AMP related
> test to warning, I intend to merge #15085 soon if no objection.
>
> -sz
>
> > On 2019/05/30 19:07:07, Przemysław Trędak  wrote:
> > Hi Sam and Zhennan,
> >
> > the problem is not how to implicitly produce a list of all operators not
> in any other list - that is easy and the code Zhennan provided would work.
> The problem is that such list would not be actually correct in all cases -
> you do NOT want optimizers to land in FP32_FUNCS list, because then the
> tensor being updated could potentially be the FP32 copy and not the actual
> tensor you care about. That is why the first bullet point in the error
> message in the test is along the lines of: "if you do an optimizer or
> anything else that does not go in the computational graph, put it in the
> FP16_FP32_FUNCS list" - that list does not do any wrapping.
> >
> > HOWEVER, as I was writing this reply I realized that due to pure luck
> this is not actually what happens - optimizers could in fact be in the
> FP32_FUNCS list. That is because, as AMP's assumption is that the model
> being changed is FP32 model at the start, all weights (and so all the
> gradients just before application as well) are stored in FP32 already (and
> so the casting would not actually occur). It would therefore be possible to
> make the FP32 list to be implicit (although relying on this lucky
> coincidence makes me slightly uncomfortable).
> >
> > Przemek
> >
> >
> >
> > On 2019/05/30 08:04:05, "Qin, Zhennan"  wrote:
> > > How about change the below line in amp.py:164
> > >
> > > wrap_list = fp32_ops if fp32_ops is not None else
> > > lists.symbol.FP32_FUNCS
> > >
> > > to be
> > >
> > > plist = ctypes.POINTER(ctypes.c_char_p)()
> > > size = ctypes.c_uint()
> > >
> > > check_call(_LIB.MXListAllOpNames(ctypes.byref(size),
> > >  ctypes.byref(plist)))
> > > op_names = []
> > > for i in range(size.value):
> > > op_names.append(py_str(plist[i]))
> > >
> > > wrap_list = []
> > > fp16_op_list = lists.symbol.FP16_FUNCS +
> > > lists.symbol.WIDEST_TYPE_CASTS + lists.symbol.FP16_FP32_FUNCS +
> > > list(map(lambda x: x[0], lists.symbol.CONDITIONAL_FP32_FUNCS))
> > > for op_name in op_names:
> > > if not op_name.startswith("_backward_") and not
> > > op_name.startswith("_contrib_backward_") and op_name not in
> > > fp16_op_list:
> > > wrap_list.append(op_name)
> > > wrap_list = fp32_ops if fp32_ops is not None else wrap_list
> > >
> > >
> > > I've checked, the changed code can produce an identical wrap_list to
> > > lists.symbol.FP16_FP32_FUNCS.
> > >
> > > The check code is,
> > >
> > > print("op that in wrap_list but not in FP32_FUNCS:")
> > > for i in wrap_list:
> > > if i not in lists.symbol.FP32_FUNCS:
> > > print(i)
> > >
> > > print("op that in FP32_FUNCS but not in wrap_list:")
> > > for i in lists.symbol.FP32_FUNCS:
> > > if i not in wrap_list:
> > > print(i)
> > >
> > > The output is,
> > >
> > > op that in infered_fp32_op_list but not in FP32_FUNCS:
> > > op that in FP32_FUNCS but not in infered_fp32_op_list:
> > > op that in infered_fp32_op_list but not in FP32_FUNCS:
> > > op that in FP32_FUNCS but not in infered_fp32_op_list:
> &
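For reference, the line-wrapped snippet quoted above can be restated as a self-contained sketch. The MXNet C-API call (`MXListAllOpNames`) and the real AMP lists are replaced here by stand-in data (the op names below are made up for illustration), but the logic is the same: every registered op that is not a backward op and not already on an FP16-related list is implicitly treated as FP32.

```python
# Stand-ins for lists.symbol.* in mxnet/contrib/amp/lists/symbol.py.
FP16_FUNCS = ["Convolution"]
FP16_FP32_FUNCS = ["Activation"]
WIDEST_TYPE_CASTS = ["broadcast_add"]
CONDITIONAL_FP32_FUNCS = [("Softmax", "temperature", ["0.5"])]

def infer_fp32_list(all_op_names):
    # Everything already handled by an FP16-related list is excluded...
    fp16_related = set(FP16_FUNCS) | set(FP16_FP32_FUNCS) | set(WIDEST_TYPE_CASTS)
    fp16_related |= {name for name, _, _ in CONDITIONAL_FP32_FUNCS}
    # ...as are backward ops; the remainder is the implicit FP32 list.
    return [op for op in all_op_names
            if not op.startswith(("_backward_", "_contrib_backward_"))
            and op not in fp16_related]

ops = ["Convolution", "Activation", "Softmax", "broadcast_add",
       "BatchNorm", "_backward_Convolution"]
print(infer_fp32_list(ops))  # → ['BatchNorm']
```

As Przemek notes in the reply above, the catch is not computing this list but whether an implicit FP32 default is actually correct for ops like optimizers.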

Re: Making new operators and AMP lists

2019-05-28 Thread Anirudh Subramanian
The assumption is that the AMP requirement has a steep learning curve.
Developers may get confused by the name, but the question the developer
essentially has to answer is (and this can be added to the error message):
1. If the operator can run in FP16 and FP32 modes, put it in
FP16_FP32_FUNCS.
2. If you want to always run the op in FP16 because of the benefits, put it
in FP16.
3. Not sure, Don't know or dont care: FP32 list.

Any operator developer who wrote InferType logic for their operator won't
take much time to choose one of these options.
It may, however, take a long time before the developer realizes
that this has to be done at all;
that's why I suggested we need to reduce the time it takes for them to
realize that something was missed.

Anirudh

On Tue, May 28, 2019 at 4:57 PM Sheng Zha  wrote:

> This is driving people away exactly because they don't know this is what's
> asked of them, and why they are asked of this AMP requirement in the first
> place. To someone who's already familiar with the context, this is little
> to be worried about. It's now the process that requires everyone to become
> familiarized with the AMP requirement that becomes costly. More
> importantly, this is not a question of whether it's too much, but whether
> it should be there in the first place. If it's merely a question of cost I
> imagine you'd have no trouble stepping up and support all of the future
> operators for AMP :)
>
> -sz
>
> On 2019/05/28 23:49:52, Marco de Abreu  wrote:
> > I'm having trouble seeing how adding the name of an operator to a single
> > file is too much to expect from somebody and how this is driving people
> > away.
> >
> > If somebody adds a tutorial, they also have to add the tutorial to a
> > specific file. As far as I can tell, this has not resulted in people not
> > wanting to write tutorials anymore or it being considered as such a big
> > burden.
> >
> > So far, I'm really not following why adding a single line to a single
> file
> > is considered such a big deal. Considering how long this guide [1]
> already
> > is, what's the harm in adding this as an additional instruction? (AMP is
> > not mentioned there yet, it would be great if that could be follow up)
> >
> > [1]
> >
> https://mxnet.incubator.apache.org/versions/master/faq/add_op_in_backend.html
> >
> > -Marco
> >
> > On Wed, May 29, 2019 at 1:42 AM Sheng Zha  wrote:
> >
> > > AMP is in contrib so there's no guarantee that the API is final.
> Adopting
> > > the test as-is is harmful because operator authors should not be
> required
> > > to invest in an experimental feature that they are not aware of.
> > >
> > > I'm all for openness and welcoming, but think about whether you'd like
> to
> > > turn away developers who just want to write a CPU-only operator. The
> more
> > > you impose on the developers the less likely they will make the
> > > contribution through.
> > >
> > > Having an unfamiliar operator in AMP as a warning could let everyone
> know
> > > what the support state is whenever that feature is used. For those who
> care
> > > about this, they would see the warning and add the support to get the
> speed
> > > benefit of not casting to fp32. In this case, rather than imposing it
> to
> > > developers who don't know about AMP, the one who actually uses AMP and
> > > cares about this feature would drive the work forward.
> > >
> > > -sz
> > >
> > > On 2019/05/28 23:25:43, Marco de Abreu 
> wrote:
> > > > While AMP might be an experimental feature, I rather would like to
> put
> > > the
> > > > focus on the maturity of its interfaces. If the interfaces and the
> > > actions
> > > > developers have to do aren't finalized yet, I'd agree with disabling
> the
> > > > test. But if the API is final and easy to use, I don't see why
> adopting
> > > > early on would be harmful. But from what I can see, the output of the
> > > test
> > > > is very meaningful and explicit, easily understandable and offers the
> > > > developer a clear list of action items that they can follow.
> > > >
> > > > If people actually start commenting "CI test failure seems unrelated
> to
> > > my
> > > > change, please proceed and merge", we should advise them to please
> open
> > > the
> > > > result tab, which will directly show the clear list of action items.
> > > > Committers should support these contributors who are not that
> familiar
> > &

Re: Making new operators and AMP lists

2019-05-28 Thread Anirudh Subramanian
Hi,

I agree with Marco that there are some easy wins to be had, since many new
GPU operators come with FP16 support.
I think we can explore the overhead to the developer and try to reduce the
feedback time for the developer, so
that the cost associated with adding support for the AMP feature is minimized.
Also, this will be very important once we move the feature out of contrib.

Anirudh

On Tue, May 28, 2019 at 3:52 PM Marco de Abreu 
wrote:

> Hi,
>
> I'm generally in favour of these kind of tests since they make developers
> aware of changes they have to make which they would usually not be aware
> of. We have a similar test for tutorials, for example. Whenever somebody
> adds a tutorial, there's a validation that assures that all contraints in
> our testing environment are met and that they are properly tied into the
> system. This AMP test fits into the same category in my opinion and we
> never heard bad feedback about these kind of checks.
>
> What seems to be bothering people is the fact that the feedback time is too
> high. Thus, I'd like to propose to move the test into the sanity-test stage
> instead of doing it as part of the unit tests which take quite a bit of
> time until they're actually executed. The sanity checks run immediately and
> give a response within about 1 minute.
>
> While I understand that this might increase the amount of work a developer
> has to do if they develop a new operator, I think that this is the right
> thing to do. Developers won't know of every single feature other people
> worked on and thus might simply miss adding the support for it. This kind
> of test on the other hand makes them aware of it. If they'd like to opt
> out, it's one single line they would have to change and then they're
> totally fine. On the other hand, this might motivate them to add the
> support since the kernel would be the last piece and everything else would
> already be implemented.
>
> Considering how often a PR gets declined because of linting errors, I'd say
> that these kind of errors are WAY more frequent that AMP telling somebody
> to add their operator to a list. Considering that this would only have to
> be done once per operator, that's work of about one minute. Add that to the
> waiting time of the sanity check and you're left with about five "wasted"
> minutes.
>
> I'm opposed to adding a warning or treating them as float32 by default
> since the operator author wouldn't notice. What will happen is that people
> won't know about AMP and simply forget about low precision in general until
> they're actively reminded. This check will remind them actively and thus
> bring more attention to the feature. I know that the feature is still
> experimental, but we have just started with the 1.6 branch and thus there's
> enough time to make the experimental features production ready. Adding this
> test early on will allow others to add the support for AMP during the early
> stage of the 1.6 branch instead of asking them in the last few weeks before
> a release. The result would only be that stuff is rushed or forgotten.
>
> To sum it up: I think this test is good and it should be kept as error, but
> it should be moved to sanity checks.
>
> -Marco
>
> On Wed, May 29, 2019 at 12:21 AM Sheng Zha  wrote:
>
> > Thanks for initiating the discussion.
> >
> > The premise for adding the test was to make sure that AMP feature is "not
> > broken", but that's IMO not the right view. AMP is not supposed to
> support
> > a new operator it hasn't seen before in the first place. There's no way
> for
> > it to know whether the fp32 cast should happen or not. So AMP feature
> > cannot provide the guarantee that it works for all future operators.
> Thus,
> > adding new operators to AMP list should be considered new feature instead
> > of fixing existing feature.
> >
> > The AMP test that breaks upon the addition of new operator is thus
> > equivalent to forcing developers of the new operator to add the new
> support
> > for AMP. This feels wrong. Especially given that AMP is an experimental
> > feature in contrib namespace (i.e. no semver guarantee), this practice
> > should be stopped immediately. We cannot force new developers to invest
> > into experimental feature this way.
> >
> > I'd suggest the following changes:
> > - for new operators that aren't registered in AMP, cast to float32 by
> > default and print one-time warning. People using AMP who want to avoid
> > casting can register it in the AMP's list.
> > - change the test to print warning about the operators that are not
> listed
> > so that it's easy to track the problem.
> >
> > -sz
> >
> > On 

Re: Making new operators and AMP lists

2019-05-28 Thread Anirudh Subramanian
Hi all,

I had a discussion with Przemyslaw about this offline. There are two options
we can pursue to make the developer experience better (since currently
developers have to wait for CI to complete):

1. Obtain the current lists and check that the length of the combined lists
matches that of MXListAllOpNames, which gets triggered during "import mxnet".
This provides much earlier feedback to the developer instead of compiling,
testing and pushing code and waiting for it to fail in the CI.
2. Option 1 still has the problem that developers still have to add the op
name to the lists when adding a new alias. This pain point can be reduced by
adding an additional attr for an operator (also suggested by Haibin earlier)
which will tell whether to force a cast to FP32, FP16, or no cast. This way
adding a new alias won't place an additional burden on the developer, and the
attr, if not set, can still be caught early at the statement "import mxnet".

Thoughts?

Anirudh
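A minimal sketch of what option 1 might look like, assuming the check runs at import time. The op registry and the lists here are stand-ins, not the real `mxnet.contrib.amp` data; in MXNet the registry would come from `MXListAllOpNames` and the lists from `mxnet/contrib/amp/lists/symbol.py`.

```python
# Hypothetical import-time coverage check: fail fast if any registered op
# is missing from every AMP list, instead of waiting for a full CI run.
def check_amp_coverage(all_ops, fp16_funcs, fp32_funcs, fp16_fp32_funcs):
    covered = set(fp16_funcs) | set(fp32_funcs) | set(fp16_fp32_funcs)
    missing = sorted(op for op in all_ops if op not in covered)
    if missing:
        raise RuntimeError(
            "Operators missing from AMP lists: %s. Add each one to "
            "FP16_FUNCS, FP32_FUNCS, or FP16_FP32_FUNCS." % ", ".join(missing))

# With every op listed, the check passes silently:
check_amp_coverage(["Convolution", "softmax", "sgd_update"],
                   ["Convolution"], ["softmax"], ["sgd_update"])
```

A new unlisted op would raise immediately at import, which is the earlier feedback loop option 1 is after.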

On Tue, May 28, 2019 at 2:32 PM Przemysław Trędak 
wrote:

> Dear Community,
>
> One of the recently merged features of the 1.5 release, AMP (Automatic
> Mixed Precision) support (PR [1], design doc [5]), introduced a requirement
> that every new operator added to MXNet would need to be present in 1 of the
> lists (in [2]). To make sure that this requirement is not broken when
> somebody adds a new operator and does not know about AMP's existence, a
> test was added to CI ([3]).
>
> A few people reached out to me (the original author of the feature) saying
> this test increases a burden on a developer of new operators and should not
> be an actual error, but just warning (PR for that change [4]). That is why
> I would like to present a motivation for it and discuss with the wider
> audience why I feel it was necessary.
>
> First, for people who do not know the details of what AMP is - it is a
> solution that tries to automatically apply best practices of training in
> lower precision (FP16) to user's FP32 model in order to fully utilize
> capabilities of modern GPUs (and potentially other hardware in the future).
> It does so by casting to lower precision inputs to operators benefitting
> from it, while casting to full precision inputs of operators that are
> unsafe to run in lower precision or just do not support it.
>
> The first iteration of AMP kept 2 main lists of operators - operators that
> are beneficial and safe to do in fp16 and operators that need to be cast to
> FP32. The problem (raised in review of the PR [6], [8]) is how to make sure
> that the feature works as intended and is not inadvertently broken by
> somebody adding a new operator. The failure scenario here is adding a new
> operator that does not support FP16 and so should be cast to FP32, but AMP
> does not know about its existence and so does not do the casting. The
> solution proposed in the review was to implicitly treat all of the unknown
> operators as FP32-only and keep the list of operators that work fine in
> both FP16 and FP32. This solution however does not really work, because
> there are multiple operators (most notably optimizers) where introducing
> additional casting of the input to FP32 would break the operator.
>
> That is why after discussion with a few members of the community, I
> decided to proceed with all lists being explicit and introducing the test
> that would fail when somebody added an operator without classifying it into
> 1 of the categories, and explain clearly how to do it [7]. It is not ideal
> solution, as it introduces some burden on the developers who are not aware
> about AMP, however in the typical case of adding at most a few operators to
> MXNet the inconvenience is I think pretty minor while important for the
> feature correctness going forward.
>
> I would like to gather Community feedback and ideas how to handle this
> situation.
>
> [1] https://github.com/apache/incubator-mxnet/pull/14173
> [2]
> https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/contrib/amp/lists/symbol.py
> [3]
> https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_amp.py
> [4] https://github.com/apache/incubator-mxnet/pull/15085
> [5]
> https://docs.google.com/document/d/1sQzMoPEwux0WXSWirY07us1POD_6Y8pLYq--b9Fvd1o/edit?usp=sharing
> [6]
> https://github.com/apache/incubator-mxnet/pull/14173#discussion_r270728019
> [7]
> https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_amp.py#L62-L80
> [8]
> https://github.com/apache/incubator-mxnet/pull/14173#pullrequestreview-235846341
>
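The casting behavior Przemek describes can be sketched as a per-operator decision. The lists and op names below are illustrative stand-ins, not the actual AMP lists.

```python
# Stand-in lists for illustration only.
FP16_FUNCS = {"Convolution", "FullyConnected"}
FP32_FUNCS = {"softmax", "norm"}

def amp_input_dtype(op_name):
    """Which dtype (if any) AMP would cast this op's inputs to."""
    if op_name in FP16_FUNCS:
        return "float16"   # benefits from and is safe in half precision
    if op_name in FP32_FUNCS:
        return "float32"   # unsafe in half precision, force full precision
    return None            # no cast inserted; run with the incoming dtype

assert amp_input_dtype("Convolution") == "float16"
assert amp_input_dtype("norm") == "float32"
assert amp_input_dtype("sgd_update") is None  # optimizer: must not be wrapped
```

The third branch is the crux of the thread: an optimizer must return `None` here, which is why an implicit "everything unknown is FP32" rule would break it, and why the explicit lists plus the CI test were chosen.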


Re: [DISCUSS] 1.5.0 Release Plan

2019-05-15 Thread Anirudh Subramanian
Hi Lai,

From the discussion I had with Nvidia offline, they are targeting pushing
the required changes today.
Since this is an important feature for the release, if it gets delayed and
cannot be merged by 05/17/2019,
the code freeze date may need to be changed.

Anirudh

On Wed, May 15, 2019 at 1:23 AM Lv, Tao A  wrote:

> Hi dev,
>
> We see there are several github issues [1][2][3][4] about mxnet windows
> build experience. The team is working intensively [5][6][7] on that to fix
> some problems of MKL-DNN build on windows. We hope these fixes can catch
> the code freeze and finally enter the 1.5.0 release.
>
> The PR against mshadow (#374) was already merged and MXNet PR #14877 is
> under review - great thanks to CI team for helping on the MKL installation
> request. PR #14952 is document change according to build logic changes in
> PR #14877. So I think these two PRs should be merged simultaneously.
> Currently #14877 is experiencing a CI response problem.
>
> Please take your time to have a look at these two PRs. Your comments and
> suggestions are highly appreciated.
>
> Thanks,
> -tao
>
> [1] https://github.com/apache/incubator-mxnet/issues/14670
> [2] https://github.com/apache/incubator-mxnet/issues/14335
> [3] https://github.com/apache/incubator-mxnet/issues/14203
> [4] https://github.com/apache/incubator-mxnet/issues/14085
> [5] https://github.com/apache/incubator-mxnet/pull/14877
> [6] https://github.com/dmlc/mshadow/pull/374
> [7] https://github.com/apache/incubator-mxnet/pull/14952
>
> -Original Message-
> From: Lai Wei [mailto:roywei...@gmail.com]
> Sent: Wednesday, May 15, 2019 2:57 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: [DISCUSS] 1.5.0 Release Plan
>
> Hi Anirudh,
>
> I see there was an offline discussion
> <
> https://github.com/apache/incubator-mxnet/pull/14173#pullrequestreview-235846341
> >
> and I have updated the AMP feature and your project on the release tracker
> <
> https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Plan+and+Status
> >
> ,
> Please let me know if you have any updates.
>
> Hi @dev,
> This is a gentle reminder that  the code freeze for 1.5.0 release is on
> 05/17/2019, please let us know if you have any WIP pull requests aiming for
> 1.5.0 that needs attention.
> Please understand we already have around 650 commits in master that need
> to be released in time. We understand TensorRT test in CI is failing and
> are trying to fix it. Meanwhile please update the tracker if there is any
> change:
>
> https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Plan+and+Status
>
> Thanks!
>
> Lai
>
>
> On Wed, May 8, 2019 at 11:58 AM Anirudh Subramanian  >
> wrote:
>
> > Hi Sheng,
> >
> > I had a discussion with nvidia folks offline today (@ptrendx et. al.).
> > I strongly feel that the AMP feature should be included as part of the
> > release: https://github.com/apache/incubator-mxnet/pull/14173 .
> > The PR is aimed for completion for next week but reviews and RFC
> > discussions may take some time. I would request to extend the release
> > code freeze by 2 weeks.
> > Also, I would like to include
> >
> > https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32
> > +to+Mixed+Precision+Models
> > which
> > depends on the AMP PR.
> > I am also aiming for adding a PR by this week end or early next week,
> > but reviews will take longer than May 17th.
> >
> > Anirudh
> >
> >
> > On Mon, May 6, 2019 at 11:49 PM Sheng Zha  wrote:
> >
> > > Hi,
> > >
> > > While 1.4.1 vote on general@incubator is still on going, I’d like to
> > > propose that we start preparing 1.5.0 release.
> > >
> > > 1.5.0 will include changes that dates back to last year and there
> > > has
> > been
> > > a lot of new features and improvements in it, so it will likely time
> > > us more time to prepare than 1.4.1. I propose the following timeline:
> > > - Cut release branch: release branch already cut. Will sync with
> > > master branch on 5/15/2019 EOD.
> > > - Code freeze: 5/17/2019. No more changes unless the release branch
> > > is in a broken state.
> > > - Tag and vote: 5/20/2019 onward.
> > >
> > > Lai Wei (roywei@) expressed to me offline that he’s willing to help
> > drive
> > > this release as release manager, and I’m happy to help again as
> > committer.
> > >
> > > If you have features in progress that you’d like to include in 1.5.0:
> > > - Add your feature to the scope:
> > >
> > https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Plan+a
> > nd+Status
> > > - Indicate in this thread:
> > >   - how confident you are about making it happen before the code
> freeze.
> > > If not confident, provide estimate for a more manageable code freeze
> > > date so that people can discuss whether to extend the deadline or to
> > > skip one release for it.
> > > - whether your PR requires more attention to make it happen.
> > >
> > > Thanks for your attention. Comments and suggestions are also welcome.
> > >
> > > -sz
> >
>


Re: [Proposal] New operator graph for MXNet

2019-05-15 Thread Anirudh Subramanian
Hi Junru,

Overall, I appreciate the points you made about the proposal.

Having said that, I would like to remind the Apache Code of Conduct :
https://www.apache.org/foundation/policies/conduct.
"Be empathetic, welcoming, friendly and patient".

I find your tone condescending. Clearly you understand what he meant from
the context, whether you prefer to call it IR as in compilers or dataflow as
in distributed systems. You could very well say "let's use this terminology
to have a common understanding" instead of saying "go learn the basic
concepts."
Before building a cool brand, its important to build a healthy community.

Anirudh


On Wed, May 15, 2019 at 12:03 AM Junru Shao  wrote:

> Hi Pedro,
>
> I really appreciate that a diligent and talented engineer eagerly wants to
> improve our system, and am very thankful that you have done so much for our
> community. However, I do want to mention some points that I believe I
> should mention.
>
> While I agree with Tianqi that every design has its pros and cons, I would
> love to emphasize that a *good taste* of system design is to optimize the
> bottleneck, enhance expressiveness (and usability), i.e. to do what needs
> doing, rather than *trivial nits* that are irrelevant to either performance
> or expressiveness. Generally speaking, typed or untyped, shared_ptr or
> unique_ptr, won't affect the overall performance when it comes to deep
> learning workload, specially when we have an async scheduler that does good
> latency hiding in MXNet - to me, these are not major issues that are worth
> re-designing our entire system.
>
> To benefit users - real-world ML practitioners, the most thing I would love
> to mention is that dataflow graph-based representation is increasingly
> incapable of modern neural networks, because the increasingly appeared
> structures like arbitrary control flow (w/ continue, break, etc),
> recursion, type conjunction and disjunction, etc. These issues will be our
> priority to address, which is brought by Relay, which addresses all these
> pain points.
>
> Another minor thing I would love to humbly mention is that, for the sake of
> our brand, it is our responsibility to be professional about terminology when
> writing an official proposal on Confluence. As one of numerous examples, the
> title of the proposal, something like "operators graph", shocked me for a
> while. Educate me if I am wrong, but the compiler community would prefer the
> term "intermediate representation", and the distributed systems community
> would prefer "dataflow graph". If you don't have knowledge in these fields, a
> better way to communicate efficiently is to first familiarize yourself with
> the most basic concepts and then discuss. This is a way to save your own
> valuable time as well.
>
> Again, thank you so much for your hard work, and hope that we could work
> together to win customers in the future :-)
>
> Thanks,
> Junru
>
>
> On Tue, May 14, 2019 at 8:03 PM Tianqi Chen 
> wrote:
>
> > The core part of the proposal is to move the graph to a much more
> > strongly typed template class.
> > I think this is mainly a point of engineering taste, and both sides have
> > pros and cons, let me list them before I share my thoughts on this issue:
> >
> > - Typed fields certainly enjoy more compile-time type checking; on the
> > other hand, it is hard to expose
> >   templates of explosive possibilities to frontend languages.
> > - More type-erased fields provide runtime flexibility to store
> polymorphic
> > types as well as extensible attributes for graph optimization
> >   - It is hard to use a virtual class to expose every possible attribute
> > that an operator might have, such as inlining, storage pattern, gradient
> > etc..
> >   - The nature of supporting a growing set of operator attributes
> > requires a type-erased attrs field.
> > - In contrast to your argument (typing is a blocker to features),
> > type-erased or typed code can both get to the same feature, except that
> >   typed code gets more errors at compile time while type-erased code gets
> > some of them at runtime.
> > - Templatized data structures will likely introduce additional mental
> > burdens to developers and are not really suitable as a core data structure
> >   - Because they imply an explosive number of possible data structures,
> > while the core data structure should be a single one.
> >
> > Now my view (as an MXNet PMC member) on typed vs type-erased style: if
> > MXNet were a pure C++ project, I might lean more toward the typed approach.
> > However, MXNet itself is a project that takes python/scala/cl

Re: Requesting slack access

2019-05-08 Thread Anirudh Subramanian
Sent invite!

On Wed, May 8, 2019 at 6:43 AM Sem  wrote:

> Requesting slack access
>
>


Re: [DISCUSS] 1.5.0 Release Plan

2019-05-08 Thread Anirudh Subramanian
Hi Sheng,

I had a discussion with NVIDIA folks offline today (@ptrendx et al.). I
strongly feel that the AMP feature should be included as part of the
release: https://github.com/apache/incubator-mxnet/pull/14173 .
The PR is aimed at completion next week, but reviews and RFC
discussions may take some time. I would request extending the release code
freeze by 2 weeks.
Also, I would like to include
https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32+to+Mixed+Precision+Models
which
depends on the AMP PR.
I am also aiming to add a PR by this weekend or early next week, but
reviews will take longer than May 17th.

Anirudh


On Mon, May 6, 2019 at 11:49 PM Sheng Zha  wrote:

> Hi,
>
> While 1.4.1 vote on general@incubator is still on going, I’d like to
> propose that we start preparing 1.5.0 release.
>
> 1.5.0 will include changes that date back to last year, and there have been
> a lot of new features and improvements in it, so it will likely take us
> more time to prepare than 1.4.1. I propose the following timeline:
> - Cut release branch: release branch already cut. Will sync with master
> branch on 5/15/2019 EOD.
> - Code freeze: 5/17/2019. No more changes unless the release branch is in
> a broken state.
> - Tag and vote: 5/20/2019 onward.
>
> Lai Wei (roywei@) expressed to me offline that he’s willing to help drive
> this release as release manager, and I’m happy to help again as committer.
>
> If you have features in progress that you’d like to include in 1.5.0:
> - Add your feature to the scope:
> https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Plan+and+Status
> - Indicate in this thread:
>   - how confident you are about making it happen before the code freeze.
> If not confident, provide an estimate for a more manageable code freeze date
> so that people can discuss whether to extend the deadline or to skip one
> release for it.
> - whether your PR requires more attention to make it happen.
>
> Thanks for your attention. Comments and suggestions are also welcome.
>
> -sz


Re: [VOTE] Release Apache MXNet (incubating) version 1.4.1.rc0

2019-05-04 Thread Anirudh Subramanian
No worries, maybe it's just something with my setup.
Moving my vote to +0, pending someone else check.

On Fri, May 3, 2019 at 11:39 PM Junru Shao  wrote:

> Hi Anirudh,
>
> Thanks for reporting this!
>
> I verified on my EC2 machine for the second time. It perfectly builds with
> your commands. It is a bit weird...I noticed that there is a subtle
> difference that my ninja progress bar is like "[xxx/506]", while yours is
> "[xxx/488]". I am not sure if there is anything different between our
> settings.
>
> My understanding is that cmake should work because it is tested in our CI
> system under "ci/jenkins/incubator-mxnet" (
>
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/v1.4.x/201/pipeline
> ).
>
> It will be much appreciated if someone could help confirm whether cmake
> works on their side.
>
> Thanks,
> Junru
>
>
> On Fri, May 3, 2019 at 9:43 PM Anirudh Subramanian 
> wrote:
>
> > Hi Junru,
> >
> > I am on v1.4.x , and my dmlc-core commit is this one :
> >
> >
> https://github.com/dmlc/dmlc-core/tree/0a0e8addf92e1287fd7a25c6314016b8c0138dee
> >
> > Anirudh
> >
> > On Fri, May 3, 2019 at 8:30 PM Junru Shao 
> wrote:
> >
> > > Hey Anirudh,
> > >
> > > Although the vote has been closed, I am very interested in digging into
> > > this issue.
> > >
> > > I build on my EC2 machine using your instructions, and it seems that
> > > everything is working fine...
> > >
> > > Then, I noticed that your issue seems to be related to unittests in
> > > dmlc-core, not in mxnet. Could you kindly check the submodule git hash?
> > > Also, could you check if you are testing on v1.4.x branch?
> > >
> > > Thanks,
> > > Junru
> > >
> > >
> > >
> > > On Fri, May 3, 2019 at 4:33 PM Anirudh Subramanian <
> > anirudh2...@gmail.com>
> > > wrote:
> > >
> > > > -1 (binding)
> > > >
> > > > Is the cmake build failing for the 1.4.1 release tag ? Is this a
> known
> > > > issue ?
> > > >
> > > > Did the following:
> > > >
> > > > cd build && cmake VERBOSE=1 -DUSE_CUDA=ON -DUSE_CUDNN=ON
> > -DUSE_OPENMP=ON
> > > > -DCMAKE_BUILD_TYPE=Debug -DUSE_DIST_KVSTORE=0 -DUSE_OPENCV=1
> > > > -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DCUDNN_ROOT=/usr/local/cuda
> > > > -DUSE_MKLDNN=1 -DUSE_MKL_IF_AVAILABLE=1 -DUSE_MKLML_MKL=1
> -DUSE_ASAN=0
> > > > -GNinja -DUSE_OPERATOR_TUNING=1 -DUSE_CPP_PACKAGE=0
> > -DCUDA_ARCH_NAME=Auto
> > > > .. && ninja -v
> > > >
> > > > [272/488] : && /usr/bin/c++   -Wall -Wno-unknown-pragmas -fPIC -g -O0
> > > > -msse2 -std=c++11 -fopenmp -g  -pthread
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_lockfree.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_param.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_parser.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_array_view.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_any.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_config.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_serializer.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer_exc_handling.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_inputsplit.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o
> > > >
> > > >
> > >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFile

Re: [VOTE] Release Apache MXNet (incubating) version 1.4.1.rc0

2019-05-03 Thread Anirudh Subramanian
Hi Junru,

I am on v1.4.x , and my dmlc-core commit is this one :
https://github.com/dmlc/dmlc-core/tree/0a0e8addf92e1287fd7a25c6314016b8c0138dee

Anirudh

On Fri, May 3, 2019 at 8:30 PM Junru Shao  wrote:

> Hey Anirudh,
>
> Although the vote has been closed, I am very interested in digging into
> this issue.
>
> I build on my EC2 machine using your instructions, and it seems that
> everything is working fine...
>
> Then, I noticed that your issue seems to be related to unittests in
> dmlc-core, not in mxnet. Could you kindly check the submodule git hash?
> Also, could you check if you are testing on v1.4.x branch?
>
> Thanks,
> Junru
>
>
>
> On Fri, May 3, 2019 at 4:33 PM Anirudh Subramanian 
> wrote:
>
> > -1 (binding)
> >
> > Is the cmake build failing for the 1.4.1 release tag ? Is this a known
> > issue ?
> >
> > Did the following:
> >
> > cd build && cmake VERBOSE=1 -DUSE_CUDA=ON -DUSE_CUDNN=ON -DUSE_OPENMP=ON
> > -DCMAKE_BUILD_TYPE=Debug -DUSE_DIST_KVSTORE=0 -DUSE_OPENCV=1
> > -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DCUDNN_ROOT=/usr/local/cuda
> > -DUSE_MKLDNN=1 -DUSE_MKL_IF_AVAILABLE=1 -DUSE_MKLML_MKL=1 -DUSE_ASAN=0
> > -GNinja -DUSE_OPERATOR_TUNING=1 -DUSE_CPP_PACKAGE=0 -DCUDA_ARCH_NAME=Auto
> > .. && ninja -v
> >
> > [272/488] : && /usr/bin/c++   -Wall -Wno-unknown-pragmas -fPIC -g -O0
> > -msse2 -std=c++11 -fopenmp -g  -pthread
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_lockfree.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_param.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_parser.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_array_view.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_any.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_config.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_serializer.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer_exc_handling.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_inputsplit.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_json.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_optional.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_main.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_env.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_thread_group.cc.o
> > -o 3rdparty/dmlc-core/test/unittest/dmlc_unit_tests  -rdynamic
> > lib/libgtestd.a 3rdparty/dmlc-core/libdmlc.a -lpthread && :
> > FAILED: : && /usr/bin/c++   -Wall -Wno-unknown-pragmas -fPIC -g -O0
> -msse2
> > -std=c++11 -fopenmp -g  -pthread
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_lockfree.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_param.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_parser.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_array_view.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_any.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_config.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_serializer.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer_exc_handling.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_inputsplit.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o
> >
> >
> 3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_json.cc.o
> >
>

Re: [VOTE] Release Apache MXNet (incubating) version 1.4.1.rc0

2019-05-03 Thread Anirudh Subramanian
-1 (binding)

Is the cmake build failing for the 1.4.1 release tag? Is this a known
issue?

Did the following:

cd build && cmake VERBOSE=1 -DUSE_CUDA=ON -DUSE_CUDNN=ON -DUSE_OPENMP=ON
-DCMAKE_BUILD_TYPE=Debug -DUSE_DIST_KVSTORE=0 -DUSE_OPENCV=1
-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DCUDNN_ROOT=/usr/local/cuda
-DUSE_MKLDNN=1 -DUSE_MKL_IF_AVAILABLE=1 -DUSE_MKLML_MKL=1 -DUSE_ASAN=0
-GNinja -DUSE_OPERATOR_TUNING=1 -DUSE_CPP_PACKAGE=0 -DCUDA_ARCH_NAME=Auto
.. && ninja -v

[272/488] : && /usr/bin/c++   -Wall -Wno-unknown-pragmas -fPIC -g -O0
-msse2 -std=c++11 -fopenmp -g  -pthread
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_lockfree.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_param.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_parser.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_array_view.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_any.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_config.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_serializer.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer_exc_handling.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_inputsplit.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_json.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_optional.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_main.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_env.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_thread_group.cc.o
-o 3rdparty/dmlc-core/test/unittest/dmlc_unit_tests  -rdynamic
lib/libgtestd.a 3rdparty/dmlc-core/libdmlc.a -lpthread && :
FAILED: : && /usr/bin/c++   -Wall -Wno-unknown-pragmas -fPIC -g -O0 -msse2
-std=c++11 -fopenmp -g  -pthread
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_lockfree.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_param.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_parser.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_array_view.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_any.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_config.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_serializer.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer_exc_handling.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_inputsplit.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_json.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_optional.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_main.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_env.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_thread_group.cc.o
-o 3rdparty/dmlc-core/test/unittest/dmlc_unit_tests  -rdynamic
lib/libgtestd.a 3rdparty/dmlc-core/libdmlc.a -lpthread && :
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o:
In function `Logging_basics_Test::TestBody()':
/home/ubuntu/experimentals/master_mxnet/build/../3rdparty/dmlc-core/test/unittest/unittest_logging.cc:19:
undefined reference to `testing::internal::DeathTest::Create(char const*,
testing::internal::RE const*, char const*, int,
testing::internal::DeathTest**)'
collect2: error: ld returned 1 exit status

Anirudh

On Fri, May 3, 2019 at 8:04 AM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> No problem Damien, glad to have you helping us validate the release.
> Just wanted to make sure we have enough votes to pass the general vote (the
> next release step), and with Sheng I think we should.
>
> On Fri, May 3, 2019 at 7:52 AM Damien Stanton 
> wrote:
>
> > Ah, I misunderstood the binding/non-binding distinction. I am not a PPMC
> > member, so my vote is non-binding.
> >
> > Best,
> > Damien
> >
> > On Fri, May 3, 2019 at 3:19 AM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > Hi Junru could you give a quick summary of the binding / non-binding

Re: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-30 Thread Anirudh Subramanian
Hi Tao,

I noted in the doc that it is specifically about inference. I can add
another section to the FAQ explaining why INT8 quantization is not included.

Anirudh

On Tue, Apr 30, 2019 at 7:59 AM Lv, Tao A  wrote:

> Thank you Anirudh! I'm just a little surprised that when we talk about a
> mixed precision model we don't talk about training, and when we talk about
> inference, INT8 quantization is not mentioned~
>
> -Original Message-
> From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> Sent: Tuesday, April 30, 2019 8:27 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: Proposal for Conversion from FP32 to Mixed Precision Models
>
> Hi Zach,
>
> I checked the QuantizeGraph pass and I think it can probably benefit from
> the CSE pass to eliminate additional quantize/quantize_v2 nodes. Having said
> that, I think it may still be overkill to add another NNVM pass for
> a generic common subexpression elimination. Currently, this
> elimination logic takes only an additional 3 to 6 lines of code in each of
> the two NNVM passes. Also, a generic common subexpression elimination has
> its own associated maintenance costs. I think it is better to continue with
> the current approach and revisit this need in the future as we add more NNVM
> passes.
>
> Anirudh
>
> On Mon, Apr 29, 2019 at 2:22 PM Anirudh Subramanian  >
> wrote:
>
> > Hi Zach,
> >
> > You raise an interesting point. Thank you for the pointer!
> >
> > Incorporating CSE pass comes with its own cost, and the advantage it
> > brings is to make the ReducePrecision nnvm pass more lightweight.
> > Since the amortized cost of the ReducePrecision pass is O(1) it
> > shouldn't matter much whether we  add it or not from performance point
> of view.
> >
> > From maintenance point of view, I would agree that separating these
> > two logics can be helpful if we have other such workflows which
> > require the original Pass followed by CSE pass. Currently, as far as I
> > know only the ReducePrecision pass is using it. I will check to see if
> > CSE pass can benefit other NNVM pass also like quantization pass apart
> > from ReducePrecision, and will get back.
> >
> > Anirudh
> >
> > On Mon, Apr 29, 2019 at 11:18 AM Zach Kimberg
> > 
> > wrote:
> >
> >> I have one suggestion. In the current design, there are the
> >> additional maps from each input entry to each target casted entry
> >> dtype in order to avoid creating duplicate casts. Instead of creating
> >> these, another option is to use a general purpose Common
> >> Subexpression Elimination (CSE) [1] pass to apply afterwards. So, you
> >> would run the mixed precision pass which creates the duplicates and
> >> then the CSE pass which would remove all duplicates.
> >>
> >> This design is common in existing compilers like LLVM because
> >> maintaining and testing the passes is much easier when they are kept
> >> as simple as possible. The CSE can also be reused as necessary for
> >> other passes that could create duplicates or to remove duplicate
> expressions in general.
> >> This
> >> tutorial [2] talks about it a bit.
> >>
> >> Zach
> >>
> >> [1] - https://en.wikipedia.org/wiki/Common_subexpression_elimination
> >> [2] - https://blog.regehr.org/archives/1603
> >>
> >> On Mon, Apr 29, 2019 at 9:26 AM Anirudh Subramanian <
> >> anirudh2...@gmail.com>
> >> wrote:
> >>
> >> > Hi Tao,
> >> >
> >> > Thanks for raising this question! I thought about the existing
> >> quantization
> >> > workflow and whether it can be included with the AMP API. Although
> >> > quantization can be considered as mixed precision, there are
> >> differences.
> >> > For example, only a small number of operators can be quantized
> >> > compared
> >> to
> >> > the operators that can run in FP16 precision. Thus, overriding the
> >> > operators to run in original dtype vs target dtype doesn't make much
> >> sense
> >> > for quantization.
> >> >
> >> > Also, quantization workflow may require a calibration dataset to
> >> calibrate
> >> > the min and max and calib_mode.
> >> > Arriving at a common API, for quantization with calibration and
> >> > mixed precision inference (FP16 and BF16) may make the API too
> >> > complicated and not very easy to use. I understand that this may
> >> > cause some confusion as people may try to use target_dtype of in

Re: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-30 Thread Anirudh Subramanian
Hi Zach,

I checked the QuantizeGraph pass and I think it can probably benefit from
the CSE pass to eliminate additional quantize/quantize_v2 nodes. Having said
that, I think it may still be overkill to add another NNVM pass for
a generic common subexpression elimination. Currently, this
elimination logic takes only an additional 3 to 6 lines of code in each of
the two NNVM passes. Also, a generic common subexpression elimination has
its own associated maintenance costs. I think it is better to continue with
the current approach and revisit this need in the future as we add more NNVM
passes.

Anirudh
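
As a toy illustration of the post-hoc approach discussed in this thread (not NNVM code; the dict-based node representation and op names here are invented), a common subexpression elimination over an operator graph can be sketched as:

```python
# Toy CSE pass over a DAG of {"op", "inputs", "attrs"} nodes. Two nodes
# are duplicates when their op, attributes, and already-deduplicated
# inputs match -- e.g. two identical amp_cast nodes fed by the same input.

def cse(outputs):
    seen = {}   # structural key -> canonical (deduplicated) node
    canon = {}  # id(original node) -> canonical node

    def visit(node):
        if id(node) in canon:
            return canon[id(node)]
        new_inputs = [visit(i) for i in node["inputs"]]
        # Canonical inputs are shared objects, so their ids identify them.
        key = (node["op"], tuple(id(i) for i in new_inputs),
               tuple(sorted(node["attrs"].items())))
        rep = seen.setdefault(key, {"op": node["op"],
                                    "inputs": new_inputs,
                                    "attrs": dict(node["attrs"])})
        canon[id(node)] = rep
        return rep

    return [visit(o) for o in outputs]

# Two identical fp16 casts of the same input collapse into one node.
x = {"op": "var", "inputs": [], "attrs": {"name": "x"}}
c1 = {"op": "amp_cast", "inputs": [x], "attrs": {"dtype": "float16"}}
c2 = {"op": "amp_cast", "inputs": [x], "attrs": {"dtype": "float16"}}
a, b = cse([c1, c2])
assert a is b
```

The appeal of this design, as noted in the thread, is that the mixed-precision pass can stay naive about duplicates and the cleanup logic lives in one reusable place.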

On Mon, Apr 29, 2019 at 2:22 PM Anirudh Subramanian 
wrote:

> Hi Zach,
>
> You raise an interesting point. Thank you for the pointer!
>
> Incorporating CSE pass comes with its own cost, and the advantage it
> brings is to make the ReducePrecision nnvm pass more lightweight. Since the
> amortized cost of the ReducePrecision pass is O(1) it shouldn't matter much
> whether we  add it or not from performance point of view.
>
> From maintenance point of view, I would agree that separating these two
> logics can be helpful if we have other such workflows which require the
> original Pass followed by CSE pass. Currently, as far as I know only the
> ReducePrecision pass is using it. I will check to see if the CSE pass can benefit
> other NNVM pass also like quantization pass apart from ReducePrecision, and
> will get back.
>
> Anirudh
>
> On Mon, Apr 29, 2019 at 11:18 AM Zach Kimberg 
> wrote:
>
>> I have one suggestion. In the current design, there are the additional
>> maps
>> from each input entry to each target casted entry dtype in order to avoid
>> creating duplicate casts. Instead of creating these, another option is to
>> use a general purpose Common Subexpression Elimination (CSE) [1] pass to
>> apply afterwards. So, you would run the mixed precision pass which creates
>> the duplicates and then the CSE pass which would remove all duplicates.
>>
>> This design is common in existing compilers like LLVM because maintaining
>> and testing the passes is much easier when they are kept as simple as
>> possible. The CSE can also be reused as necessary for other passes that
>> could create duplicates or to remove duplicate expressions in general.
>> This
>> tutorial [2] talks about it a bit.
>>
>> Zach
>>
>> [1] - https://en.wikipedia.org/wiki/Common_subexpression_elimination
>> [2] - https://blog.regehr.org/archives/1603
>>
>> On Mon, Apr 29, 2019 at 9:26 AM Anirudh Subramanian <
>> anirudh2...@gmail.com>
>> wrote:
>>
>> > Hi Tao,
>> >
>> > Thanks for raising this question! I thought about the existing
>> quantization
>> > workflow and whether it can be included with the AMP API. Although
>> > quantization can be considered as mixed precision, there are
>> differences.
>> > For example, only a small number of operators can be quantized compared
>> to
>> > the operators that can run in FP16 precision. Thus, overriding the
>> > operators to run in original dtype vs target dtype doesn't make much
>> sense
>> > for quantization.
>> >
>> > Also, quantization workflow may require a calibration dataset to
>> calibrate
>> > the min and max and calib_mode.
>> > Arriving at a common API, for quantization with calibration and mixed
>> > precision inference (FP16 and BF16) may make the API too complicated and
>> > not very easy to use. I understand that this may cause some confusion as
>> > people may try to use target_dtype of int8 but I think it's still better
>> > than causing user confusion with the API usage.
>> >
>> > Also, when we move quantize_model APIs outside contrib we can consider
>> > adding them under AMP namespace. The challenge would then be to educate
>> > users on difference between "quantize" and "convert".
>> >
>> > Anirudh
>> >
>> > On Mon, Apr 29, 2019 at 7:45 AM Lv, Tao A  wrote:
>> >
>> > > Thank you for the explanation. Sorry I didn't realize the proposal is
>> for
>> > > inference only.
>> > >
>> > > Then how do you think the amp_cast and amp_multicast in this proposal
>> can
>> > > work with the existing INT8 quantization workflow which I think should
>> > also
>> > > be considered as 'mixed precision'.
>> > >
>> > > -Original Message-
>> > > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
>> > > Sent: Monday, April 29, 2019 10:25 PM
>> > > To: dev@

Re: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Anirudh Subramanian
Hi Zach,

You raise an interesting point. Thank you for the pointer!

Incorporating a CSE pass comes with its own cost, and the advantage it brings
is to make the ReducePrecision NNVM pass more lightweight. Since the
amortized cost of the ReducePrecision pass is O(1), it shouldn't matter much
whether we add it or not from a performance point of view.

From a maintenance point of view, I would agree that separating these two
logics can be helpful if we have other such workflows which require the
original pass followed by a CSE pass. Currently, as far as I know, only the
ReducePrecision pass is using it. I will check to see if the CSE pass can
benefit other NNVM passes, like the quantization pass, apart from
ReducePrecision, and will get back.

Anirudh
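
The alternative kept in the current design - maps from each input entry to its casted entry, so duplicate casts are never created in the first place - might look roughly like this (a hedged sketch with an invented node representation, not the actual ReducePrecision pass):

```python
# Memoize casts per (input, dtype) pair: the first consumer creates the
# amp_cast node, later consumers reuse it, so the lookup is amortized
# O(1) and no duplicate cast nodes are ever produced.

def cast_inputs(consumers, target_dtype="float16"):
    cast_of = {}  # (id(input node), dtype) -> shared amp_cast node

    def get_cast(inp):
        key = (id(inp), target_dtype)
        if key not in cast_of:
            cast_of[key] = {"op": "amp_cast", "inputs": [inp],
                            "attrs": {"dtype": target_dtype}}
        return cast_of[key]

    for node in consumers:
        node["inputs"] = [get_cast(i) for i in node["inputs"]]
    return consumers

x = {"op": "var", "inputs": [], "attrs": {"name": "x"}}
n1 = {"op": "FullyConnected", "inputs": [x], "attrs": {}}
n2 = {"op": "Activation", "inputs": [x], "attrs": {}}
cast_inputs([n1, n2])
assert n1["inputs"][0] is n2["inputs"][0]  # one shared cast node
```

This is the memoization that the "additional 3 to 6 lines" in each pass amount to; a standalone CSE pass would instead clean up duplicates afterwards.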

On Mon, Apr 29, 2019 at 11:18 AM Zach Kimberg 
wrote:

> I have one suggestion. In the current design, there are the additional maps
> from each input entry to each target casted entry dtype in order to avoid
> creating duplicate casts. Instead of creating these, another option is to
> use a general purpose Common Subexpression Elimination (CSE) [1] pass to
> apply afterwards. So, you would run the mixed precision pass which creates
> the duplicates and then the CSE pass which would remove all duplicates.
>
> This design is common in existing compilers like LLVM because maintaining
> and testing the passes is much easier when they are kept as simple as
> possible. The CSE can also be reused as necessary for other passes that
> could create duplicates or to remove duplicate expressions in general. This
> tutorial [2] talks about it a bit.
>
> Zach
>
> [1] - https://en.wikipedia.org/wiki/Common_subexpression_elimination
> [2] - https://blog.regehr.org/archives/1603
>
> On Mon, Apr 29, 2019 at 9:26 AM Anirudh Subramanian  >
> wrote:
>
> > Hi Tao,
> >
> > Thanks for raising this question! I thought about the existing
> quantization
> > workflow and whether it can be included with the AMP API. Although
> > quantization can be considered as mixed precision, there are differences.
> > For example, only a small number of operators can be quantized compared
> to
> > the operators that can run in FP16 precision. Thus, overriding the
> > operators to run in original dtype vs target dtype doesn't make much sense
> > for quantization.
> >
> > Also, quantization workflow may require a calibration dataset to
> calibrate
> > the min and max and calib_mode.
> > Arriving at a common API, for quantization with calibration and mixed
> > precision inference (FP16 and BF16) may make the API too complicated and
> > not very easy to use. I understand that this may cause some confusion as
> > people may try to use target_dtype of int8 but I think it's still better
> > than causing user confusion with the API usage.
> >
> > Also, when we move quantize_model APIs outside contrib we can consider
> > adding them under AMP namespace. The challenge would then be to educate
> > users on difference between "quantize" and "convert".
> >
> > Anirudh
> >
> > On Mon, Apr 29, 2019 at 7:45 AM Lv, Tao A  wrote:
> >
> > > Thank you for the explanation. Sorry I didn't realize the proposal is
> for
> > > inference only.
> > >
> > > Then how do you think the amp_cast and amp_multicast in this proposal
> can
> > > work with the existing INT8 quantization workflow which I think should
> > also
> > > be considered as 'mixed precision'.
> > >
> > > -Original Message-
> > > From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> > > Sent: Monday, April 29, 2019 10:25 PM
> > > To: dev@mxnet.incubator.apache.org
> > > Subject: Re: Proposal for Conversion from FP32 to Mixed Precision
> Models
> > >
> > > Hi Tao,
> > >
> > > The APIs proposed: "convert_model" and "convert_block" are mainly for
> > > inference use cases, where customers bring a FP32 model to convert it
> to
> > a
> > > mixed precision model to get improved performance while not losing out
> on
> > > the accuracy.
> > > The PR: https://github.com/apache/incubator-mxnet/pull/14173 is
> supposed
> > > to handle the training use cases and this proposal doesn't cover the
> AMP
> > > feature added in the PR. I think ptrendx@ and canoerst@ are better
> > > equipped to answer questions 1 and 2.
> > >
> > > > - more generally, what will be saved when users want to serialize
> > > > their
> > > model to disk?
> > >
> > > Lets say users want to save converted mixed precision model used for
> > 

Re: Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Anirudh Subramanian
Hi Tao,

The proposed APIs, "convert_model" and "convert_block", are mainly for
inference use cases, where customers bring an FP32 model and convert it to a
mixed precision model to get improved performance without losing
accuracy.
The PR: https://github.com/apache/incubator-mxnet/pull/14173 is supposed to
handle the training use cases and this proposal doesn't cover the AMP
feature added in the PR. I think ptrendx@ and canoerst@ are better equipped
to answer questions 1 and 2.

> - more generally, what will be saved when users want to serialize their
model to disk?

Let's say users want to save a converted mixed precision model used for
inference to disk. It will save both the symbol, with the amp_cast and
amp_multicast operators, and the params (which are cast if necessary).
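To make the conversion concrete, here is a toy pure-Python sketch of the kind of graph rewrite convert_model performs for inference. The "amp_cast" operator name comes from the proposal; the whitelist, the graph representation, and the pass itself are illustrative assumptions, not MXNet's actual implementation.

```python
# Toy sketch of a mixed-precision conversion pass over a symbolic graph.
# "amp_cast" is the operator name from the proposal; everything else here
# (whitelist, data structures) is illustrative, not MXNet's real code.
FP16_SAFE = {"Convolution", "FullyConnected"}  # ops allowed to run in float16

def convert(graph):
    """graph: list of (name, (op, inputs)) in topological order.
    Returns a new list with amp_cast nodes inserted wherever an op's
    input dtype does not match the dtype the op should run in."""
    dtype = {}  # node name -> dtype produced after conversion
    out = []
    for name, (op, inputs) in graph:
        want = "float16" if op in FP16_SAFE else "float32"
        cast_inputs = []
        for i in inputs:
            if dtype.get(i, "float32") != want:
                cast = f"amp_cast_{i}_{want}"
                out.append((cast, ("amp_cast", [i])))
                dtype[cast] = want
                cast_inputs.append(cast)
            else:
                cast_inputs.append(i)
        out.append((name, (op, cast_inputs)))
        dtype[name] = want
    return out

g = [("conv0", ("Convolution", ["data"])),
     ("sm0", ("softmax", ["conv0"]))]
converted = convert(g)
print([n for n, _ in converted])
# -> ['amp_cast_data_float16', 'conv0', 'amp_cast_conv0_float32', 'sm0']
```

On the two-op toy graph this inserts a cast to float16 before the convolution and a cast back to float32 before the softmax, which is the shape a serialized mixed precision symbol would take.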

Anirudh


On Mon, Apr 29, 2019 at 6:55 AM Lv, Tao A  wrote:

> Thank you for sharing this, Anirudh.
>
> Curious to know:
> - what will be saved in a training checkpoint or snapshot? Can it be
> resumed on another platform which might not support the lower precision the
> previous one used?
> - what will be saved in the final symbol.json and params file when
> training is finished?
> - more generally, what will be saved when users want to serialize their
> model to disk?
>
> Thank you,
> -tao
>
> -Original Message-
> From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> Sent: Monday, April 29, 2019 7:00 PM
> To: dev@mxnet.incubator.apache.org
> Subject: Proposal for Conversion from FP32 to Mixed Precision Models
>
> Hi all,
>
> I have created a doc for conversion from FP32 to Mixed Precision Models:
>
> https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32+to+Mixed+Precision+Models
>
> I look forward to your feedback on the same.
>
> Thanks,
> Anirudh
>


Proposal for Conversion from FP32 to Mixed Precision Models

2019-04-29 Thread Anirudh Subramanian
Hi all,

I have created a doc for conversion from FP32 to Mixed Precision Models:
https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32+to+Mixed+Precision+Models

I look forward to your feedback on the same.

Thanks,
Anirudh


[Announcement] New Committer - Wang Jiajun

2019-04-16 Thread Anirudh Subramanian
Hi,

Please join me to welcome Wang Jiajun (https://github.com/arcadiaphy) as a
new committer of Apache (incubating) MXNet!

Wang has been solving some tough bugs with respect to memory leaks, process
fork handling, dependency engine issues and custom op exception handling.

Issue Involvement:
https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93=is%3Aissue+involves%3Aarcadiaphy

PRs authored:
https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+author%3Aarcadiaphy+

Anirudh


Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Anirudh Subramanian
Hi Marco,

The backend private APIs in engine, executor, storage, ndarray etc. can
still be changed.
I understand that it may introduce code duplication, but introducing
duplicate C APIs can still be better than the backend
developer having to worry about different frontends. Not to mention a
frontend that is not yet merged into the main repo but lives in its own
repo; such repos should also be considered consumers of the MXNet API.

Anirudh

On Thu, Apr 11, 2019 at 12:12 PM Marco de Abreu 
wrote:

> Good point about the adoption speed for the different frontends, Anirudh.
> While this is a quite valid argument, I'm afraid of the complexity it might
> introduce as well as a risk of further diverging frontend functionality.
>
> I'd rather propose that we introduce a guideline to follow when changes to
> C-APIs are being made. Part of that could be starting a thread like this
> one that lays down the changes that are being made to the C-API. We could
> then coordinate the changes to the different frontends and gather people
> from the community who feel comfortable to do the changes in the respective
> frontends. If nobody speaks up, the original proposer of that change could
> be responsible to do the necessary changes.
>
> An adjacent topic for this discussion could be test coverage: We currently
> have no tools to determine which frontend hits which C-API and where
> changes have to be made. This might be a topic we should spark up again
> separately.
>
> -Marco
>
> On Thu, Apr 11, 2019 at 8:55 PM Marco de Abreu 
> wrote:
>
> > My personal opinion towards that discussion is that we should keep the
> > C-API free from semantic versioning because otherwise we're introducing
> two
> > "fronts" that we have to maintain backwards compatibility for. By the
> way,
> > currently, we have no way to verify and guarantee the compatibility of
> the
> > C-API. The major issue I'd see with adding SemVer for the C-API is that
> > this would increase the complexity of changes that are (in my opinion)
> > entirely internal to MXNet by introducing another thing that developers
> > would have to look out for - possibly introducing code duplication as
> > described by Jun while not providing any clear benefits to me.
> >
> > If there is a use-case where people can not even use our C++ package,
> then
> > we could have discussions about introducing a user-facing C-API, but
> right
> > now this approach to interface with our C-API (although I know that
> people
> > use it) seems a bit like using undocumented Windows APIs: they work, but
> > it's at your own risk, they might break at any time and there's no guarantee.
> >
> > -Marco
> >
> > On Thu, Apr 11, 2019 at 8:52 PM Anirudh Subramanian <
> anirudh2...@gmail.com>
> > wrote:
> >
> >> Hi Jun,
> >>
> >> Till now from what I have observed this has been an undocumented
> guideline
> >> to not break C APIs (example:
> >>
> https://github.com/apache/incubator-mxnet/pull/11429#discussion_r199564999
> >> ).
> >> Although the C APIs are supposed to serve only as bridges for frontend
> >> language bindings (exception being C Predict API), I think there are 3rd
> >> party libraries like Horovod which are starting to
> >> depend on it (https://github.com/apache/incubator-mxnet/pull/14615) .
> >>
> >> Also, since MXNet has a lot of frontend bindings ensuring backward
> >> compatibility with semver can help frontend bindings adopt the new APIs
> at
> >> their own pace.
> >>
> >> Anirudh
> >>
> >>
> >> On Thu, Apr 11, 2019 at 10:58 AM Jun Wu  wrote:
> >>
> >> > I'm not sure about whether C APIs should fall under semver. This is
> the
> >> > discussion we would like to have with the community.
> >> >
> >> > My thinking on this:
> >> > 1. In most of the cases, C APIs only serve as bridges between frontend
> >> > language bindings and C++ backend. Most of users/developers do not
> >> interact
> >> > directly with C APIs.
> >> > 2. The cases I can think of where C APIs are directly adopted in
> >> > application development are model deployment in a C/C++ environment.
> In
> >> > those cases, developers only interact with C Predict APIs, which we
> >> didn't
> >> > touch.
> >> >
> >> > If the community feel that we are obliged to keep the semver for all C
> >> > APIs, we can try to make a copy of the C APIs we intend to modify in
> >> the PR
> >> > and keep the old signatures intact, t

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Anirudh Subramanian
Hi Jun,

Till now from what I have observed this has been an undocumented guideline
to not break C APIs (example:
https://github.com/apache/incubator-mxnet/pull/11429#discussion_r199564999).
Although the C APIs are supposed to serve only as bridges for frontend
language bindings (exception being C Predict API), I think there are 3rd
party libraries like Horovod which are starting to
depend on it (https://github.com/apache/incubator-mxnet/pull/14615) .

Also, since MXNet has a lot of frontend bindings ensuring backward
compatibility with semver can help frontend bindings adopt the new APIs at
their own pace.

Anirudh


On Thu, Apr 11, 2019 at 10:58 AM Jun Wu  wrote:

> I'm not sure about whether C APIs should fall under semver. This is the
> discussion we would like to have with the community.
>
> My thinking on this:
> 1. In most of the cases, C APIs only serve as bridges between frontend
> language bindings and C++ backend. Most of users/developers do not interact
> directly with C APIs.
> 2. The cases I can think of where C APIs are directly adopted in
> application development are model deployment in a C/C++ environment. In
> those cases, developers only interact with C Predict APIs, which we didn't
> touch.
>
> If the community feel that we are obliged to keep the semver for all C
> APIs, we can try to make a copy of the C APIs we intend to modify in the PR
> and keep the old signatures intact, this will introduce a lot of duplicate
> code though.
>
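The duplication Jun describes, keeping the old signature intact next to a modified copy, can be sketched as follows. The function names and return conventions are invented for illustration and are not MXNet's actual C API.

```python
# Hypothetical sketch of a versioned API pair: the new-style call uses -1
# to mean "unknown dim" (reserving 0 for zero-size tensors), while the
# old-style entry point keeps its original contract by refusing shapes it
# cannot represent, so existing callers keep working unchanged.
def get_shape_v2():
    """New-style: signed dims, -1 = unknown."""
    return (32, -1, 224)

def get_shape_v1():
    """Old-style wrapper: never returns a -1 dim."""
    shape = get_shape_v2()
    if any(d < 0 for d in shape):
        raise ValueError("shape has unknown dims; call get_shape_v2 instead")
    return shape
```

The cost is exactly the duplicate code Jun mentions: every modified entry point needs a compatibility twin until the next major release.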
> On Thu, Apr 11, 2019 at 8:50 AM Anirudh Subramanian  >
> wrote:
>
> > I was under the impression that C API does fall under semver. Has this
> been
> > discussed somewhere before ? Is this also the case for C Predict API ?
> >
> > On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu 
> > wrote:
> >
> > > In case only changes to the c-api are being made, it doesn't fall under
> > our
> > > semantic versioning since that's not a user facing API and thus I'd be
> in
> > > favour as doing it as part of a minor release. If there is any
> > behavioural
> > > change from a user perspective (a good indicator would be if tests have
> > to
> > > be changed as reaction to the Backend changes), then I'd prefer a major
> > > release.
> > >
> > > I'd slightly prefer a minor release since this change touches quite a
> few
> > > parts and could risk being outdated/diverged as the time until 2.0
> > > progresses.
> > >
> > > -Marco
> > >
> > > Aaron Markham  schrieb am Do., 11. Apr.
> 2019,
> > > 16:28:
> > >
> > > > Just curious about when this kind of change will land. Would it wait
> > for
> > > > 2.0 or would it be in 1.5 or another minor release?
> > > >
> > > > On Thu, Apr 11, 2019, 00:15 Junru Shao 
> > wrote:
> > > >
> > > > > Really nice improvement over MXNet's usability! I suggest that we
> > could
> > > > > make numpy-compatible behavior default in 2.0.
> > > > >
> > > > > On Wed, Apr 10, 2019 at 11:34 PM Jun Wu 
> wrote:
> > > > >
> > > > > > Dear Community,
> > > > > >
> > > > > > A while ago, we sent out an RFC
> > > > > > <https://github.com/apache/incubator-mxnet/issues/14253>
> > discussing
> > > > the
> > > > > > initiative introducing NumPy compatibility into MXNet. As the
> first
> > > > > outcome
> > > > > > of this initiative, we submitted the PR
> > > > > > <https://github.com/apache/incubator-mxnet/pull/14661> providing
> > the
> > > > > > infrastructure of supporting zero-dim (scalar) and zero-size
> > tensors,
> > > > > which
> > > > > > have been long-missing in MXNet.
> > > > > >
> > > > > > In our implementation, we have put the best efforts of keeping
> the
> > > > > promise
> > > > > > of backward compatibility in all the language bindings.
> > Nevertheless,
> > > > we
> > > > > > still would like to call out the changes explicitly that may
> impact
> > > > your
> > > > > > existing codebases developed on top of MXNet by calling C-APIs
> > > directly
> > > > > or
> > > > > > implementing operators in your own repos.
> > > > > >
> > > > > > 1. In you application, if you called any one of the following
> > > > > shape-related
> > > > > > C-APIs, 

Re: Implementing zero-dim and zero-size tensors in MXNet and its impact on your codebases

2019-04-11 Thread Anirudh Subramanian
I was under the impression that the C API does fall under semver. Has this
been discussed somewhere before? Is this also the case for the C Predict API?

On Thu, Apr 11, 2019, 8:08 AM Marco de Abreu 
wrote:

> In case only changes to the c-api are being made, it doesn't fall under our
> semantic versioning since that's not a user facing API and thus I'd be in
> favour as doing it as part of a minor release. If there is any behavioural
> change from a user perspective (a good indicator would be if tests have to
> be changed as reaction to the Backend changes), then I'd prefer a major
> release.
>
> I'd slightly prefer a minor release since this change touches quite a few
> parts and could risk being outdated/diverged as the time until 2.0
> progresses.
>
> -Marco
>
> Aaron Markham  schrieb am Do., 11. Apr. 2019,
> 16:28:
>
> > Just curious about when this kind of change will land. Would it wait for
> > 2.0 or would it be in 1.5 or another minor release?
> >
> > On Thu, Apr 11, 2019, 00:15 Junru Shao  wrote:
> >
> > > Really nice improvement over MXNet's usability! I suggest that we could
> > > make numpy-compatible behavior default in 2.0.
> > >
> > > On Wed, Apr 10, 2019 at 11:34 PM Jun Wu  wrote:
> > >
> > > > Dear Community,
> > > >
> > > > A while ago, we sent out an RFC
> > > >  discussing
> > the
> > > > initiative introducing NumPy compatibility into MXNet. As the first
> > > outcome
> > > > of this initiative, we submitted the PR
> > > >  providing the
> > > > infrastructure of supporting zero-dim (scalar) and zero-size tensors,
> > > which
> > > > have been long-missing in MXNet.
> > > >
> > > > In our implementation, we have put the best efforts of keeping the
> > > promise
> > > > of backward compatibility in all the language bindings. Nevertheless,
> > we
> > > > still would like to call out the changes explicitly that may impact
> > your
> > > > existing codebases developed on top of MXNet by calling C-APIs
> directly
> > > or
> > > > implementing operators in your own repos.
> > > >
> > > > 1. In your application, if you called any one of the following
> > > shape-related
> > > > C-APIs, you will need to change the data type of shape's ndim and
> > > dim_size
> > > > from *unsigned int* to signed *int*, because we have to use -1 to
> > > represent
> > > > unknown shape information, and reserve 0 for scalar and zero-size
> > > tensors.
> > > > One example of such changes can be seen in the cpp-package
> > > > <
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-c0e1fcfe1619faa4ff5f59d94e8bR183
> > > > >
> > > > calling MXSymbolInferShape.
> > > > - MXSymbolInferShape
> > > > - MXSymbolInferShapePartial
> > > > - MXExecutorSimpleBind
> > > > - MXExecutorReshape
> > > > - MXNDArrayGetShape
> > > > - MXNDArrayCreateFromSharedMem
> > > >
> > > > 2. If you have implemented operators in your own codebases, you will
> > > > probably need to change every operator's shape inference function to
> > use
> > > > the following util functions to check whether shape information is
> > known,
> > > > instead of checking against 0 directly. One example of such changes
> can
> > > be
> > > > seen in the shape inference function
> > > > <
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14661/files#diff-afa640c4653c59f00f43a84455f91ef9R35
> > > > >
> > > > of concat operator.
> > > > - shape_is_known (include/mxnet/tuple.h)
> > > > - ndim_is_known (include/mxnet/tuple.h)
> > > > - dim_size_is_known (include/mxnet/tuple.h)
> > > >
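A pure-Python mirror of the three util functions listed above shows why checking against 0 no longer works (the real functions are in include/mxnet/tuple.h; the signatures here are approximations for illustration):

```python
# With signed dims, -1 means "unknown", 0 is a legal dim size
# (zero-size tensor), and an empty shape is a legal rank (scalar),
# so "== 0" checks would misclassify valid tensors as unknown.
def ndim_is_known(ndim):
    return ndim != -1

def dim_size_is_known(dim_size):
    return dim_size != -1

def shape_is_known(shape, ndim=None):
    if ndim is None:
        ndim = len(shape)
    if not ndim_is_known(ndim):
        return False
    return all(dim_size_is_known(d) for d in shape)

print(shape_is_known(()))             # scalar tensor -> True
print(shape_is_known((32, -1, 224)))  # one unknown dim -> False
print(shape_is_known((0, 3)))         # zero-size tensor -> True
```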
> > > > If you are interested in knowing the value of scalar tensors, and
> hence
> > > > understanding our motivation further, this thread
> > > > <
> https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108
> > >
> > > of
> > > > discussion provides very good insights from the view of data science.
> > It
> > > > was actually related to an opportunity for MXNet becoming the backend
> > of
> > > > PyMC , but somehow it didn't go
> > > > through due to missing several key features
> > > > ,
> and
> > > > scalar tensors is one of them.
> > > >
> > > > Please leave comments in the PR
> > > >  if you have
> any
> > > > concerns or suggestions of our work.
> > > >
> > > > Thank you very much for your time and consideration.
> > > >
> > > > Best,
> > > > Jun
> > > >
> > > > *References*
> > > > [1] RFC of NumPy compatibility:
> > > > https://github.com/apache/incubator-mxnet/issues/14253
> > > > [2] Pull request of supporting scalar and zero-size tensors:
> > > > https://github.com/apache/incubator-mxnet/pull/14661
> > > > [3] The value of scalar tensors from the view of data science:
> > > >
> https://discuss.mxnet.io/t/rank-0-arrays-in-mxnet-aka-pi-is-wrong/108
> > > > 

Re: assimilation of mshadow into the MXNet codebase

2019-04-05 Thread Anirudh Acharya
Hi Pedro,

mshadow is mostly used for tensor arithmetic. There have been discussions
about including it within mxnet. I think it is a good idea.

As a more long-term solution, using libraries like Eigen to perform linear
algebra operations was also suggested by anirudh2290@. I think xtensor (
https://github.com/QuantStack/xtensor ) can also be a candidate here.

-
Anirudh


On Fri, Apr 5, 2019 at 7:03 PM Pedro Larroy 
wrote:

> Hi
>
> Some developers have noticed that working in mshadow is cumbersome as
> it's a 3rdparty subrepo.
>
> Since mshadow is a bunch of headers which don't have much of
> independent tests / library functionality, other developers and I
> believe that it would be good to assimilate this code into the
> repository for ease of contribution and changes without having to go
> through contortions to test PRs that modify mshadow.
>
> Would anybody oppose this change?
>
> Thanks and have a nice weekend.
>
> Pedro.
>


Re: Include R-package

2019-04-01 Thread Anirudh Acharya
There was a conversation on this some time back here -
https://lists.apache.org/list.html?d...@mxnet.apache.org:2018-12:Rcpp%20licensing%20in%20Apache%20MXNet


-
Anirudh


On Mon, Apr 1, 2019 at 12:19 PM Zach Kimberg 
wrote:

> As part of the current MXNet release process, the R-package is removed from
> the source release [1]. If we are advertising that MXNet has an R package
> as an Apache project, it really should be part of the official Apache
> release process. I know there were a few missing license headers within the
> package as it is currently excluded from the license check [2]. If someone
> fixes those, are there any other reasons why it can't or shouldn't be
> released?
>
> Zach
>
>
>
> [1] - https://cwiki.apache.org/confluence/display/MXNET/Release+Process
> [2] -
>
> https://github.com/apache/incubator-mxnet/blob/master/tests/nightly/apache_rat_license_check/rat-excludes#L9
>


[Announcement] New Committer - Alex Zai

2019-03-31 Thread Anirudh Subramanian
Hi all,

Please join me to welcome Alex Zai as a new committer of Apache
(incubating) MXNet!

Alex has been instrumental in bringing MKLDNN from experimental status to
being the default on MXNet master. This involved adding Python and C++ unit tests,
improving CI coverage for MKLDNN, testing MKLDNN on different platforms and
working on issues related to MKLDNN.

PRs:
https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+author%3Aazai91+

Issues:
https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93=is%3Aissue+involves%3Aazai91

Reviews:
https://github.com/apache/incubator-mxnet/pulls?page=1=is%3Apr+reviewed-by%3Aazai91=%E2%9C%93

Dev:
https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:azai91

Thanks,

Anirudh


Re: R help

2019-03-25 Thread Anirudh Acharya
Yes, that is the error; we need to dig deeper into why that URL is not working.


Thanks
Anirudh


On Mon, Mar 25, 2019 at 10:40 AM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> Is this the error?
> "test_model.R:129: error: Fine-tune
>
> cannot open URL
> 'http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params
> '
> 1: GetInception() at R-package/tests/testthat/test_model.R:129
> 2: download.file("
> http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params
> ",
>destfile = "model/Inception-BN-0126.params")"
>
> Looks like the
> http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params
> is failing for me as well.
>
>
> On Mon, Mar 25, 2019 at 10:37 AM Anirudh Acharya 
> wrote:
>
> > Hi Per da Silva,
> >
> > Let me know if I can help, we can chat offline.
> >
> > From first glance it would seem
> >
> >- R:MKLDNN CPU is passing whereas R:CPU is failing
> >- R:GPU might have failed due to this "cannot open URL '
> >
> >
> http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params
> >'"
> >
> >
> > Thanks
> > Anirudh
> >
> >
> > On Mon, Mar 25, 2019 at 7:34 AM Per da Silva 
> wrote:
> >
> > > Dear community,
> > >
> > > I'm working on a PR <
> > https://github.com/apache/incubator-mxnet/pull/14513>
> > > to update CI GPU jobs to be based on CUDA v10. However, for some
> reason,
> > > amongst other things, the R tests are failing
> > > <
> > >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-14513/4/pipeline
> > > >.
> > > I would really appreciate some help from the R experts to get it sorted
> > =D
> > >
> > > Thanks in advance,
> > >
> > > Per
> > >
> >
>


Re: R help

2019-03-25 Thread Anirudh Acharya
Hi Per da Silva,

Let me know if I can help, we can chat offline.

From first glance it would seem

   - R:MKLDNN CPU is passing whereas R:CPU is failing
   - R:GPU might have failed due to this "cannot open URL '
   http://data.dmlc.ml/models/imagenet/inception-bn/Inception-BN-0126.params
   '"


Thanks
Anirudh


On Mon, Mar 25, 2019 at 7:34 AM Per da Silva  wrote:

> Dear community,
>
> I'm working on a PR <https://github.com/apache/incubator-mxnet/pull/14513>
> to update CI GPU jobs to be based on CUDA v10. However, for some reason,
> amongst other things, the R tests are failing
> <
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-14513/4/pipeline
> >.
> I would really appreciate some help from the R experts to get it sorted =D
>
> Thanks in advance,
>
> Per
>


Re: [DISCUSS] Rebrand Gluon to MXNet imperative or something MXNet.

2019-03-22 Thread Anirudh Acharya
I have also faced this problem: when talking to someone external (at
meetups etc.), using two names like gluon and mxnet gets confusing, and
people usually have not heard of gluon.

I get around it by referring to gluon as "gluon-mxnet" while talking to
anyone outside the community.


-
Anirudh



On Fri, Mar 22, 2019 at 4:02 PM Pedro Larroy 
wrote:

> Hi dev@
>
> We heard feedback from users that the Gluon name is confusing. Some of
> them don't even know it's MXNet and it's unclear the relationship with
> MXNet
>
> Would it make sense to rebrand Gluon to just MXNet or MXNet
> imperative? Diluting brands and names is never a good idea.
>
> There's also gluonhq which is related to JavaFX which adds to the
> confusion, search engine friendliness is not high as well.
>
> Pedro.
>


[Announcement] New Committer - Patric Zhao

2019-03-14 Thread Anirudh Subramanian
Hi all,

Please join me to welcome Patric Zhao as a new committer of Apache
(incubating) MXNet!

Patric has put in great effort around MKLDNN integration into MXNet and has
been involved in features like quantization, graph fusion and fused RNN
operators for CPU.

Dev List activity:
https://lists.apache.org/list.html?d...@mxnet.apache.org:lte=3y:patric.zhao

Issues:
https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93=is%3Aissue+involves%3Apengzhao-intel+

PR Reviews:
https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+reviewed-by%3Apengzhao-intel

Proposals involved in:
https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
https://cwiki.apache.org/confluence/display/MXNET/Fused+RNN+Operators+for+CPU


Thanks,
Anirudh


HPO for MXNet models

2019-03-13 Thread Anirudh Acharya
Hi All,

I posted this earlier on the MXNet Slack channel; based on a suggestion
there, I am reposting it here for a wider audience -

I was searching for ways of performing HPO for models built with MXNet, and
I came across Sherpa, an open source distributed HPO library presented in
NeurIPS 2018 - https://openreview.net/pdf?id=HklSUMyJcQ.

I have been trying it out and it is very easy to use and extensible. It
already supports RandomSearch, Grid Search and BayesianOpt for performing
the search in the hyper-parameter space.
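For readers unfamiliar with these libraries, the loop an HPO tool drives can be reduced to a library-agnostic sketch. The log-uniform sampling range and the toy objective (a stand-in for a gluon training and validation run) are assumptions for illustration:

```python
import random

def objective(lr):
    # Stand-in for training a model and returning validation loss;
    # pretend 0.01 is the best learning rate.
    return (lr - 0.01) ** 2

random.seed(0)
trials = []
for _ in range(20):
    # Log-uniform sample in [1e-4, 1e-1], a common choice for
    # learning-rate search.
    lr = 10 ** random.uniform(-4, -1)
    trials.append((objective(lr), lr))

best_loss, best_lr = min(trials)
print(f"best lr {best_lr:.4g} with loss {best_loss:.3g}")
```

A random-search algorithm in an HPO library does essentially this, plus trial bookkeeping, distributed dispatch of the training jobs, and (in Sherpa's case) the dashboard.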

I have submitted a PR with an example gluon use case -
https://github.com/sherpa-ai/sherpa/pull/27 - but I have yet to try it with
large distributed training use cases. The library does support them; we
can run it in distributed mode for heavy workloads.

It also comes with a neat UI dashboard to monitor the jobs being run.

[image: Screen Shot 2019-03-13 at 8.08.48 AM.png]

I think we should explore this as an option for performing HPO with gluon.

What might integration entail -
1. I have not fully evaluated what changes might be necessary but I think
the integration can be fairly unobtrusive for both repositories. As
demonstrated above we can already use sherpa for performing HPO, but the
experience is a bit clunky. It can be made smooth by adding a few callback
functions that will track and log the metrics of the different experiment
runs (à la the Keras callback function defined here -
https://github.com/sherpa-ai/sherpa/blob/master/sherpa/core.py#L368 )

2. The library is developed and maintained by folks in academia and is
published under GPL license. I was given to understand that GPL license
might be a problem for Apache products, but since we are not explicitly
using it within mxnet as a sub-component, I am thinking we might have some
wiggle room there.

MXNet needs HPO functionality, and instead of building something from
scratch we could just use existing open source projects. I would like to hear
more from the community.

Thanks
Anirudh Acharya


Re: Call for Ideas and Approaches to Community Building

2019-03-06 Thread Anirudh Acharya
Having only non-organization PMC members nominate new committers could
un-level the playing field.
1. Many times contributions might not require a contributor to have direct
1:1 discussion with PMC members outside his org.
2. It would give inordinate power/responsibility to the few non-Amazon
active PMC members.

Just my 2 cents.


Thanks
Anirudh

On Wed, Mar 6, 2019 at 7:10 AM Isabel Drost-Fromm  wrote:

>
>
> On 2 March 2019 15:13:23 CET, Carin Meier  wrote:
> >I wanted to kickoff a discussion about community building. There was an
> >excellent blog post from the Apache Beam Community on this
> >https://blogs.apache.org/comdev/entry/an-approach-to-community-building
>
> Needless to say I really love that blog post.
>
> Other than that there are a couple of question you might ask yourself as a
> community:
>
> How easy is it to participate as an outsider, how much communication is
> happening outside of dev@?
>
> How explicit are you about how inclusive you want to be? See also
> https://youtu.be/LgB1s3buccI
>
> How explicit are you about where you need help (including and beyond
> coding)?
>
> How explicit are you with downstream users that some of the inner workings
> of Apache projects are build around a scratch your own itch casual
> contributions that ideally should be rewarded the same way as full-time
> contributions (
> https://blogs.apache.org/foundation/entry/success-at-apache-for-love )?
>
> I think you want to enable as many people as possible - typically only ten
> percent of your users turn into contributors, and of those only ten percent
> tend to be repeat contributors... at least in my experience.
>
> Just some ideas,
> Isabel
> --
> This message was sent from my Android device with K-9 Mail.
>


Re: [DISCUSS] Process to remove deprecated operators

2019-02-28 Thread Anirudh Acharya
Hi Lin,

This is a good idea. Here is an issue -
https://github.com/apache/incubator-mxnet/issues/9686 that is already
attempting to collate all the breaking changes that might be necessary for
v2.0. We could start by adding things to that issue.

I think eventually we will need a separate branch into which these breaking
changes get introduced, and this branch can later be merged into master
prior to v2.0 release.

Thanks
Anirudh


On Thu, Feb 28, 2019 at 1:35 PM Wen-Yang Chu  wrote:

> Hi,
>
> I have raised an issue:
>
> mx.nd.Crop does not support FP16 and is deprecated, but there is no direct
> alternative with central crop.
> I use this operator to implement Unet, and I found other people using it on
> the Internet too. It is very inconvenient to remove this specific operator
> without a clear alternative:
>
> https://github.com/apache/incubator-mxnet/issues/13750
>
> *Is it possible to review deprecated operators to make sure we have
> equivalent functionality?*
> Thanks
>
> Wen-Yang
>
> On Thu, Feb 28, 2019 at 2:07 PM Chaitanya Bapat 
> wrote:
>
> > This sounds good.
> > Going further, if we can maintain a list of deprecated operators - we can
> > create a "Good for first contribution" issue to improve log messaging of
> > Deprecated operators.
> > If it makes sense, I can go ahead and create that.
> >
> > Hope this helps.
> >
> > On Thu, 28 Feb 2019 at 01:54, Lin Yuan  wrote:
> >
> > > Agreed. When we deprecate an operator, we should add in the log message
> > > something like "This operator X is deprecated and will be removed in the
> > > next release. Please use operator Y instead."
> > >
> > > Lin
> > >
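The message pattern Lin suggests can be sketched with a Python decorator. This is purely illustrative - MXNet's operators are registered in the C++ backend, so the real change would live there - and the decorator name is made up:

```python
import warnings

def deprecated(replacement):
    """Hypothetical decorator emitting the message pattern discussed;
    not MXNet's actual deprecation mechanism."""
    def wrap(fn):
        def inner(*args, **kwargs):
            warnings.warn(
                f"Operator {fn.__name__} is deprecated and will be removed "
                f"in the next major release. Please use {replacement} instead.",
                DeprecationWarning, stacklevel=2)
            return fn(*args, **kwargs)
        return inner
    return wrap

@deprecated("Convolution")
def Convolution_v1(x):
    return x

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    Convolution_v1(1)
print(caught[0].message)
# -> Operator Convolution_v1 is deprecated and will be removed in the
#    next major release. Please use Convolution instead.
```

Naming the replacement operator in the warning is the key part: it gives users a migration path before the operator disappears in the next major release.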
> > > On Wed, Feb 27, 2019 at 10:23 PM Junru Shao 
> > > wrote:
> > >
> > > > Hi Lin,
> > > >
> > > > I would love to share some immature ideas about deprecating
> operators.
> > > Not
> > > > only should we adopt semantic versioning, but we should also provide
> > > > informative enough error messages for customers to understand how to replace
> > > > deprecated operators with new ones.
> > > >
> > > > Thanks,
> > > > Junru
> > > >
> > > > On Wed, Feb 27, 2019 at 9:30 PM Lin Yuan 
> wrote:
> > > >
> > > > > Sheng,
> > > > >
> > > > > Thanks for your quick response.
> > > > > If that's the case, we will wait till 2.0 release to remove the
> > > > deprecated
> > > > > operators from code.
> > > > >
> > > > > Best,
> > > > > Lin
> > > > >
> > > > > On Wed, Feb 27, 2019 at 9:06 PM Sheng Zha 
> > wrote:
> > > > >
> > > > > > MXNet follows semantic versioning so we will be able to delete
> them
> > > in
> > > > > the
> > > > > > next major release.
> > > > > >
> > > > > > -sz
> > > > > >
> > > > > > On Wed, Feb 27, 2019 at 8:53 PM Lin Yuan 
> > > wrote:
> > > > > >
> > > > > > > Dear Community,
> > > > > > >
> > > > > > > In MXNet there are many legacy operators such as this
> > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://mxnet.incubator.apache.org/versions/master/api/python/symbol/symbol.html?highlight=convolution_v1#mxnet.symbol.Convolution_v1
> > > > > > > >
> > > > > > > that has been marked DEPRECATED for several releases. However,
> > these
> > > > > > > operators still exist in our code. This caused a few problems:
> > > > > > >
> > > > > > > 1) Make the codebase bloated and reduce readability
> > > > > > > 2) Increase unnecessary maintenance effort
> > > > > > > 3) Bug-prone, as some people will use this legacy code as an
> > > example
> > > > > > > 4) Cause confusion to end users and make documentation page
> > lengthy
> > > > > > >
> > > > > > > I would like to propose the following process (if there is no
> > > > existing
> > > > > > one)
> > > > > > > to remove deprecate operators from our code base.
> > > > > > >
> > > > > > > 1. Document the deprecated operators/environme

Re: [VOTE] Release Apache MXNet (incubating) version 1.4.0.rc2

2019-02-04 Thread Anirudh Subramanian
-0

Thanks Steffen for your release efforts !

Build from source works with make but fails with cmake for me.

 cd build && cmake VERBOSE=1 -DUSE_CUDA=ON -DUSE_CUDNN=ON -DUSE_OPENMP=ON
-DCMAKE_BUILD_TYPE=Debug -DUSE_DIST_KVSTORE=0 -DUSE_OPENCV=1 -GNinja .. &&
ninja -v

FAILED: : && /usr/bin/c++   -Wall -Wno-unknown-pragmas -fPIC -g -O0 -msse2
-std=c++11 -fopenmp -g  -pthread
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_lockfree.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_param.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_parser.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_array_view.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_any.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_config.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_serializer.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_threaditer_exc_handling.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_inputsplit.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_json.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_optional.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_main.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_env.cc.o
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_thread_group.cc.o
-o 3rdparty/dmlc-core/test/unittest/dmlc_unit_tests  -rdynamic
lib/libgtestd.a 3rdparty/dmlc-core/libdmlc.a -lpthread && :
3rdparty/dmlc-core/test/unittest/CMakeFiles/dmlc_unit_tests.dir/unittest_logging.cc.o:
In function `Logging_basics_Test::TestBody()':
/home/ubuntu/experimentals/1.4_release/build/../3rdparty/dmlc-core/test/unittest/unittest_logging.cc:19:
undefined reference to `testing::internal::DeathTest::Create(char const*,
testing::internal::RE const*, char const*, int,
testing::internal::DeathTest**)'
collect2: error: ld returned 1 exit status


Anirudh

On Mon, Feb 4, 2019 at 3:09 PM Haibin Lin  wrote:

> +1 built from source on Linux and passed dist sync kvstore test.
>
> On Mon, Feb 4, 2019 at 9:54 AM Lin Yuan  wrote:
>
> > +1 build from source on MacOS 10.13.6 and tested mxnet-to-coreml
> converter.
> >
> > On Mon, Feb 4, 2019 at 9:03 AM Indhu  wrote:
> >
> > > +1
> > >
> > > Build from source and tested few examples from the examples folder.
> > >
> > > Thanks,
> > > Indu
> > >
> > >
> > >
> > > > On Fri, Feb 1, 2019 at 6:21 PM Steffen Rochel wrote:
> > >
> > > > Hi Sheng - thanks for the feedback.
> > > > TVM notice  file is missing as the 1.4.x branch/v1.4.0 release is
> using
> > > TVM
> > > > commit 0f053c8
> > > > <
> > > >
> > >
> >
> https://github.com/dmlc/tvm/commit/0f053c82a747b4dcdf49570ec87c17e0067b7439
> > > > >
> > > >  from Oct 8, 2018, which didn't have the NOTICE file. IMHO, MXNet
> > NOTICE
> > > > file is consistent with release content.
> > > > As the release started in 2018 I do think it is ok to move forward
> w/o
> > > > update to 2019 IMHO.
> > > >
> > > > All -
> > > > thanks to the committers/contributors (Tao, Aaron, Kellen, Aston,
> Yuxi)
> > > who
> > > > tested and provided feedback - we have five +1 votes.
> > > > As of today, Friday Feb 1st 2019 6pm PST we have two binding votes,
> one
> > > +1
> > > > (Carin), one +0 (Sheng). The vote continues to be open, waiting for
> > > > feedback from PMC members.
> > > > Hope you can spare some time over the weekend to provide feedback.
> > > >
> > > > Regards,
> > > > Steffen
> > > >
> > > > On Fri, Feb 1, 2019 at 12:44 AM Marco de Abreu <
> > marco.g.ab...@gmail.com>
> > > > wrote:
> > > >
> > > > > Considering the release process has been started last year and the
> > code
> > > > tag
> > > > > has also been based on last year, I'd say that it is not really a
> big
> > > > deal.
> > > > >
> > > > > -Marco
> > > > >
> > > > > On Fri, Feb 1, 2019, 09:33 Sheng Zha
> > > > >

Re: [Announcement] New Committer -- Lin Yuan

2019-02-03 Thread Anirudh Acharya
Congratulations Lin

On Sat, Feb 2, 2019, 3:27 PM Tianqi Chen wrote:
> Dear Community:
>
> Please join me to welcome Lin Yuan(@apeforest) as a new committer of
> Apache(incubating) MXNet!
>
> He has contributed to various improvements, including better compatibility
> of larger arrays across the codebase.
>
> Commits:
> https://github.com/apache/incubator-mxnet/commits?author=apeforest
>
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=is%3Apr+author%3Aapeforest
>
>
> Reviews:
> https://github.com/apache/incubator-mxnet/pulls?utf8=%
> E2%9C%93=reviewed-by%3Aapeforest
>
> dev@ activity
> https://lists.apache.org/list.html?*@mxnet.apache.org:lte=6M:Lin%20Yuan
>
> Tianqi
>


Re: [Question] UI change policy in MXNet

2018-12-20 Thread Anirudh Subramanian
On Thu, Dec 20, 2018, 1:56 PM Lin Yuan wrote:
> Hi Anirudh,
>
> Thanks a lot for your clarifications! I have some followup
> questions/comments:
>
> 1) Which guideline should we follow when updating the UI in MXNet
> operators?
> A) MXNet follows semantic versioning, so breaking changes to operator
> interfaces can be introduced only in major versions.
>
> (Lin:) My question is what style of UI guide we should follow, e.g. naming
> convention, usage mode, etc. Something like numpy's style or tensorflow's?
>
I don't think there is such a UI guide. If the operator already exists in
numpy/scipy or other frameworks, we generally tend to use similar
interfaces.

>
> 2) Who should approve the UI change?
> A) Contributors who may have worked on the operator and/or other
> contributors/committers.
>
> (Lin:) Is it too local to rely on contributors to one or a few operators to
> decide the UI? How can we ensure consistency of the UI across all
> operators in MXNet?
>
agreed. Feel free to propose a better way.

>
> 3) In case of backward compatibility, should we favor breaking the backward
> compatibility and update the release notes or adding a newer version of the
> operator like ***_v2?
> A) If the operator interfaces are not compatible, it's fine to create an
> operator with the name "_v2". In the next major version release, you can
> add an alias for the newer implementation and deprecate the older one.
>
> (Lin) What if there is already "_v2", do we add "_v3", "_v4" as the project
> evolves?
>
This needs to be dealt with on a case-by-case basis. I haven't seen many ops
that would require three backward-incompatible revisions between two major
releases.

>
> 4) Which operator should go to contrib and which be implemented as regular?
> A) I think this discussion may help:
> https://github.com/apache/incubator-mxnet/pull/5499 . To summarize:
> contrib
> was created for ops for which we provide limited guarantees with respect to
> backward compatibility, interface changes, testing etc.
>
> (Lin) This is definitely an informative discussion. It would be better if
> we can put this in a more noticeable place for developers.
>
>
> On Thu, Dec 20, 2018 at 1:39 PM Anirudh Subramanian  >
> wrote:
>
> > 1) Which guideline should we follow when updating the UI in MXNet
> > operators?
> > A) MXNet follows semantic versioning, so breaking changes to operator
> > interfaces can be introduced only in major versions.
> >
> > 2) Who should approve the UI change?
> > A) Contributors who may have worked on the operator and/or other
> > contributors/committers.
> >
> > 3) In case of backward compatibility, should we favor breaking the
> backward
> > compatibility and update the release notes or adding a newer version of
> the
> > operator like ***_v2?
> > A) If the operator interfaces are not compatible, it's fine to create an
> > operator with the name "_v2". In the next major version release, you can
> > add an alias for the newer implementation and deprecate the older one.
> >
> > 4) Which operator should go to contrib and which be implemented as
> regular?
> > A) I think this discussion may help:
> > https://github.com/apache/incubator-mxnet/pull/5499 . To summarize:
> > contrib
> > was created for ops for which we provide limited guarantees with respect
> to
> > backward compatibility, interface changes, testing etc.
> >
> > Anirudh
> >
> > On Thu, Dec 20, 2018 at 1:00 PM Lin Yuan  wrote:
> >
> > > Dear Community,
> > >
> > > As a contributor, I would like to know the current policy for updating
> UI
> > > of an operator. I understand UI change should be introduced in major
> > > release not minor release. However, it is still not quite clear to me
> > > regarding the UI change process:
> > >
> > > 1) Which guideline should we follow when updating the UI in MXNet
> > > operators?
> > > 2) Who should approve the UI change?
> > > 3) In case of backward compatibility, should we favor breaking the
> > backward
> > > compatibility and update the release notes or adding a newer version of
> > the
> > > operator like ***_v2?
> > > 4) Which operator should go to contrib and which be implemented as
> > regular?
> > >
> > > Any clarification is appreciated and it is helpful to guide PR
> reviewers
> > as
> > > well.
> > >
> > > Merry Christmas to ya'all!
> > >
> > > Lin
> > >
> >
>


Re: [Question] UI change policy in MXNet

2018-12-20 Thread Anirudh Subramanian
1) Which guideline should we follow when updating the UI in MXNet operators?
A) MXNet follows semantic versioning, so breaking changes to operator
interfaces can be introduced only in major versions.

2) Who should approve the UI change?
A) Contributors who may have worked on the operator and/or other
contributors/committers.

3) In case of backward compatibility, should we favor breaking the backward
compatibility and update the release notes or adding a newer version of the
operator like ***_v2?
A) If the operator interfaces are not compatible, it's fine to create an
operator with the name "_v2". In the next major version release, you can
add an alias for the newer implementation and deprecate the older one.

4) Which operator should go to contrib and which be implemented as regular?
A) I think this discussion may help:
https://github.com/apache/incubator-mxnet/pull/5499 . To summarize: contrib
was created for ops for which we provide limited guarantees with respect to
backward compatibility, interface changes, testing etc.

Anirudh

On Thu, Dec 20, 2018 at 1:00 PM Lin Yuan  wrote:

> Dear Community,
>
> As a contributor, I would like to know the current policy for updating UI
> of an operator. I understand UI change should be introduced in major
> release not minor release. However, it is still not quite clear to me
> regarding the UI change process:
>
> 1) Which guideline should we follow when updating the UI in MXNet
> operators?
> 2) Who should approve the UI change?
> 3) In case of backward compatibility, should we favor breaking the backward
> compatibility and update the release notes or adding a newer version of the
> operator like ***_v2?
> 4) Which operator should go to contrib and which be implemented as regular?
>
> Any clarification is appreciated and it is helpful to guide PR reviewers as
> well.
>
> Merry Christmas to ya'all!
>
> Lin
>
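The semantic-versioning rule from answer 1 (breaking operator-interface changes only in major versions) can be sketched as a tiny check; the helper names are illustrative, not an MXNet API:

```python
# Minimal sketch of the semantic-versioning rule described above: a breaking
# operator-interface change is only permitted when the MAJOR version bumps.
# Function names are illustrative; MXNet has no such helper.
def parse(version):
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def breaking_change_allowed(current, target):
    """True iff moving from `current` to `target` bumps the major version."""
    return parse(target)[0] > parse(current)[0]

print(breaking_change_allowed("1.4.0", "2.0.0"))  # True: major bump
print(breaking_change_allowed("1.4.0", "1.5.0"))  # False: minor release only
```

Under this rule, a "_v2" operator introduced in a minor release coexists with the old one; the old name can only be removed or re-aliased once the major version changes.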


Re: [DISCUSS] About the PR merging policy

2018-12-14 Thread Anirudh Acharya
Thanks Qing for bringing this up. I think the cwiki can contain pointers to
apache guidelines -

   - https://www.apache.org/dev/committers.html
   - https://www.apache.org/dev/new-committers-guide.html
   - And a few rules of thumb (not hard-and-fast rules; we should trust the
   committers to act in good faith in most cases) on how many approvals to
   wait for before merging would be good.

And having these in one place in the cwiki would be convenient.


Thanks
Anirudh



On Fri, Dec 14, 2018 at 11:59 AM Carin Meier  wrote:

> Thanks Steffen,
>
> I had remembered reading that but couldn't find it again :)
>
> So yes - maybe we can duplicate that section and/or provide a link to a new
> committers guide.
>
> I'm thinking it should go on the community page here
> https://cwiki.apache.org/confluence/display/MXNET/Community
>
> Eventually, some of the information collected there could migrate out to the
> webpage as well.
>
> - Carin
>
> On Thu, Dec 13, 2018 at 7:30 AM Steffen Rochel 
> wrote:
>
> > We do have already a guide which covers the issue:
> >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Development+Process#DevelopmentProcess-GuidelinesforReviewers/Committers
> > <
> >
> https://cwiki.apache.org/confluence/display/MXNET/Development+Process#DevelopmentProcess-GuidelinesforReviewers/Committers
> > >,
> > but it probably needs to become more prominent. Any suggestion for a good
> > place?
> > Steffen
> >
> > On Wed, Dec 12, 2018 at 5:23 PM Carin Meier 
> wrote:
> >
> > > Qing - thanks for bringing this up.
> > >
> > > I think it would be a good thing to have a document on the wiki to help
> > > with these sorts of questions.
> > >
> > > In fact, since the project is growing with more new committers, maybe
> we
> > > could use a "New Committer Guide" with the process of how to get going
> > and
> > > any FAQ like this one ...
> > >
> > > Would you be interested in getting a rough draft going of your recent
> > > experience? Then others can help collaborate on it.
> > >
> > > It would be nice to make the path smoother for other new committers to
> > the
> > > project.
> > >
> > > Best,
> > > Carin
> > >
> > > On Tue, Dec 11, 2018 at 7:18 PM Qing Lan  wrote:
> > >
> > > > Hi all,
> > > >
> > > > Recently I self-merged my PR (https://github.com/apache/incubator-mxnet/pull/13617)
> > > > without getting approvals from other committers, with only contributors'
> > > > approval. I apologize to the community and thank Marco for pointing out
> > > > the problem. I took a lesson that we should have at least one committer's
> > > > approval to merge the code. However, I just found this section is missing
> > > > in the CWiki
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Become+an+Apache+MXNet+%28incubating%29+Committer+and+PPMC+Member
> > > .
> > > > So I would like to discuss in here:
> > > >
> > > > How to conduct the PR reviewing/merging: how many approvals (from
> > > > committers and contributors) should we get in order to merge?
> > > >
> > > > How to deal with disagreement in the discussion (e.g a
> > > > contributor/committer request a change)?
> > > >
> > > > Please don’t hesitate to share your thoughts!
> > > >
> > > > Thanks,
> > > > Qing
> > > >
> > >
> >
>


Re: v1.4.0 status 11/29

2018-12-03 Thread Anirudh Subramanian
Hi Steffen,

I have created a PR to cherry pick the change to v1.4.x branch:
https://github.com/apache/incubator-mxnet/pull/13517

Anirudh

On Mon, Dec 3, 2018 at 11:29 AM Steffen Rochel 
wrote:

> Thanks Haibin. Anirudh - please add PR for v1.4.x for
> https://github.com/apache/incubator-mxnet/pull/13501
> Steffen
>
> On Mon, Dec 3, 2018 at 10:55 AM Haibin Lin 
> wrote:
>
> > It would also be great to include the PR that reverts a commit causing
> cpu
> > performance degradation
> > https://github.com/apache/incubator-mxnet/pull/13501,
> > where num_omp_threads decrease to 1 when multiple GPUs are used, as
> Anirudh
> > reported in
> >
> >
> https://github.com/apache/incubator-mxnet/issues/13449#issuecomment-443388522
> > <
> >
> https://github.com/apache/incubator-mxnet/issues/13449#issuecomment-443388522
> > >
> >
> > Best,
> > Haibin
> >
> > On Mon, Dec 3, 2018 at 10:50 AM Afrooze, Sina 
> wrote:
> >
> > > I would also like this PR which is already merged with master (
> > > https://github.com/apache/incubator-mxnet/pull/13426) to be included
> in
> > > 1.4.0 to avoid any potential ONNX export issues in cases where the API
> is
> > > not used strictly correctly. - Sina
> > >
> > >
> > >
> > > On 11/30/18, 2:17 PM, "Alex Zai"  wrote:
> > >
> > > PR is here https://github.com/apache/incubator-mxnet/pull/13497.
> > >
> > > On Thu, Nov 29, 2018 at 8:56 PM Lv, Tao A 
> > wrote:
> > >
> > > > Credit belongs to Alex.
> > > >
> > > > Hi Alex, would you mind porting your fix to the v1.4.x branch?
> > > >
> > > > Thanks,
> > > > -Tao
> > > >
> > > > -Original Message-
> > > > From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > > > Sent: Friday, November 30, 2018 12:48 PM
> > > > To: dev@mxnet.incubator.apache.org
> > > > Subject: Re: v1.4.0 status 11/29
> > > >
> > > > Hi Tao - thanks for fixing the crash. Please create PR on v1.4.x
> > > branch
> > > > with [v1.4.x] in title and add me to the PR.
> > > > Steffen
> > > >
> > > > On Thu, Nov 29, 2018 at 8:44 PM Lv, Tao A 
> > > wrote:
> > > >
> > > > > Hi Steffen, I would like to have
> > > > > https://github.com/apache/incubator-mxnet/pull/13433  into the
> > > coming
> > > > > 1.4.0 release. It fixed a crash of deconvolution with certain
> > input
> > > > > size for MKL-DNN backend. This PR is well reviewed and already
> > > merged
> > > > > into the master branch. New test case is also included there.
> > > > >
> > > > > Please find the corresponding issue here:
> > > > > https://github.com/apache/incubator-mxnet/issues/13421 .
> > > > >
> > > > > Thanks,
> > > > > -Tao
> > > > >
> > > > > -Original Message-
> > > > > From: Steffen Rochel [mailto:steffenroc...@gmail.com]
> > > > > Sent: Friday, November 30, 2018 12:05 PM
> > > > > To: dev@mxnet.incubator.apache.org
> > > > > Subject: v1.4.0 status 11/29
> > > > >
> > > > > Dear MXNet community -
> > > > > I would like to provide update on v1.4.0 status, details will
> be
> > > > > tracked here <
> > > > >
> > > https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incu
> > > > > bating%29+1.4.0+Release+Plan+and+Status
> > > > > >
> > > > > .
> > > > >
> > > > > 1. Sergey created v1.4.x branch
> > > > > 2. As expected, additional requests have been made for
> inclusion
> > in
> > > > > v1.4.0 release. Critical PR are tracked here <
> > > > >
> > > https://cwiki.apache.org/confluence/display/MXNET/Apache+MXNet+%28incu
> > > > >
> > > bating%29+1.4.0+Release+Plan+and+Status#ApacheMXNet(incubating)1.4.0Re
> > > > > leasePlanandStatus-OpenPRstotrack
> > > > > >
> > > > > .
> > > > > 3. PR to update README.md is blocked by flaky test failures,
> > > > > retriggered check.
> > > > > 4. PR to upgrade version on master to v1.5.0 has been
> submitted.
> > > > > 5. CI is setup and first run passed.
> > > > >
> > > > > Note: if you want to add selected fixes or enhancements, please
> > > reply
> > > > > to this email. Please provide justification, add me as approver
> > to
> > > the
> > > > > v1.4.x PR and make sure your changes have tests included in PR
> > and
> > > get
> > > > > properly reviewed.
> > > > >
> > > > > Regards,
> > > > > Steffen
> > > > >
> > > >
> > >
> > >
> > >
> > >
> >
>


Re: Rcpp licensing in Apache MXNet

2018-12-02 Thread Anirudh Acharya
Hi Steffen,

We had a similar discussion with a legal team in Amazon, and we made this
PR - https://github.com/apache/incubator-mxnet/pull/12559 to fix the
licensing issues in the R-package.

I think the R-package should be good to include in the release, but we
should try to get a confirmation once again before we do include it.


Thanks
Anirudh



On Sun, Dec 2, 2018 at 3:22 PM Steffen Rochel 
wrote:

> Hi KK - I'm going through the release checklist
> <
> https://cwiki.apache.org/confluence/display/MXNET/Release+Process#ReleaseProcess-Step1.10.Createartefactsforthereleaseandpushtothedistfolder
> >
> for upcoming v1.4.x release and found the note to remove R-package before
> creating release artifacts. Did we ever get resolution from legal and can
> now include the R-package in the release?
> Appreciate your advice.
>
> Regards,
> Steffen
>
> On Tue, Jul 11, 2017 at 11:49 PM Qiang Kou  wrote:
>
> > Hi, Naveen,
> >
> > I am totally fine if we skip the R pkg for release.
> >
> > Thanks,
> >
> > KK
> >
> > On Tue, Jul 11, 2017 at 8:21 PM, Naveen Swamy 
> wrote:
> >
> > > Ly,
> > >  Can we skip R pkg for the proposed release as KK mentioned and add
> > > it/alter based on the advice we get from ASF legal?
> > >
> > > ---KK Says---
> > > As I understand, if we skip the R pkg when releasing a new version of
> > > MXNet, everything is OK. This can be done by adding a .gitattributes file.
> > > ---
> > >
> > > others,
> > >  thoughts/concerns?
> > >
> > > Thanks, Naveen
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Jul 11, 2017 at 3:56 PM, Ly Nguyen 
> wrote:
> > >
> > > > Hey KK,
> > > >
> > > > I know we're planning a release end of this week/beginning of next
> > week.
> > > It
> > > > may be critical to get this cleared if it is an issue. Eager to hear
> > > back.
> > > > :)
> > > >
> > > > On Tue, Jul 11, 2017 at 3:35 PM, Qiang Kou 
> wrote:
> > > >
> > > > > Hi, Ly,
> > > > >
> > > > > I will let you know when I have the answer.
> > > > >
> > > > > Best,
> > > > >
> > > > > KK
> > > > >
> > > > > On Tue, Jul 11, 2017 at 10:50 AM, Ly Nguyen 
> > > wrote:
> > > > >
> > > > > > Hi @KK, any updates from legal on whether excluding the R pkg is
> a
> > > > > solution
> > > > > > for our next release?
> > > > > >
> > > > > > On Mon, Jul 10, 2017 at 10:49 AM, Qiang Kou 
> > > wrote:
> > > > > >
> > > > > > > Thank you for the info.
> > > > > > >
> > > > > > > As I understand, if we skip the R pkg when releasing a new version of
> > > > > > > MXNet, everything is OK. This can be done by adding a .gitattributes file.
> > > > > > >
> > > > > > > I will ask on legal-discuss@ for more info and confirmation.
> > > > > > >
> > > > > > > Really thank you for all the info! It is super helpful.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > KK
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Jul 7, 2017 at 8:47 AM, Felix Cheung <
> > > > > felixcheun...@hotmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I was only referring to string_hash_code.c - it's not being
> > built
> > > > and
> > > > > > > it's
> > > > > > > > not part of the binaries release.
> > > > > > > >
> > > > > > > > There are two parts to it.
> > > > > > > >
> > > > > > > > For Spark binaries release, R package is built and the output
> > is
> > > > > > packaged
> > > > > > > > along with the rest of all jars  and python stuff.
> > > > > > > >
> > > > > > > > There is also a source-only R package that we want to publish
> > to
> > > > > CRAN.
> > > > > > &

Re: Adding AMD CPU to CI

2018-11-29 Thread Anirudh Subramanian
Support for instruction-set extensions like AVX2, AVX512, etc. can vary
between AMD and Intel, and there can also be a time lag between when Intel
supports one and when AMD does.
Also, in the future this setup may be useful in case MXNet supports AMD
GPUs and AWS also happens to have support for it.

Anirudh


On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu
 wrote:

> I think it's worth a discussion to do a sanity check. While generally these
> instructions are standardized, our experience with ARM taught us that
> theory and reality sometimes don't match. Thus, it's always good to
> check.
>
> In the next months we are going to refactor our slave creation processes.
> Chance Bair has been working on rewriting Windows slaves from scratch (we
> used images that haven't really been updated for 2 years - we still don't
> know what was done on them) and they're ready soon. In the following
> months, we will also port our Ubuntu slaves to the new method (don't have a
> timeline yet). Ideally, the integration of AMD instances will only be a
> matter of running the same pipeline on a different instance type. In that
> Case, it should not be a big deal.
>
> If there are big differences, that's already a yellow flag for
> compatibility, but that's unlikely. But in that case, we would have to make
> a more thorough time analysis and whether it's worth the effort. Maybe,
> somebody else could also lend us a hand and help us with adding AMD
> support.
>
> -Marco
>
> On Fri, Nov 30, 2018, 01:22 Hao Jin wrote:
>
> > f16c is also an instruction set supported by both brands' recent CPUs
> just
> > like x86, AVX, SSE etc., and any difference in behaviors (quite
> impossible
> > to happen or it will be a major defect) would most likely be caused by
> the
> > underlying hardware implementation, so still, adding AMD instances is not
> > adding much value here.
> > Hao
> >
> > On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > Just looked at the mf16c work and wanted to mention Rahul clearly _was_
> > > thinking about AMD users in that PR.
> > >
> > > On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <
> > > kellen.sunderl...@gmail.com> wrote:
> > >
> > > > From my perspective we're developing a few features like mf16c and
> > MKLDNN
> > > > integration specifically for Intel CPUs.  It wouldn't hurt to make
> sure
> > > > those changes also run properly on AMD cpus.
> > > >
> > > > On Thu, Nov 29, 2018, 3:38 PM Hao Jin wrote:
> > > >> I'm a bit confused about why we need extra functionality tests just
> > for
> > > >> AMD
> > > >> CPUs, aren't AMD CPUs supporting roughly the same instruction sets
> as
> > > the
> > > >> Intel ones? In the very impossible case that something working on
> > Intel
> > > >> CPUs not functioning on AMD CPUs (or vice versa), it would most
> > > >> likely be related to the underlying hardware implementation of the
> > same
> > > >> ISA, to which we definitely do not have a good solution. So I don't
> > > think
> > > >> performing extra tests on functional aspect of the system on AMD
> CPUs
> > is
> > > >> adding any values.
> > > >> Hao
> > > >>
> > > >> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu wrote:
> > > >>
> > > >> > +1
> > > >> >
> > > >> > On 11/29/18, 2:39 PM, "Alex Zai"  wrote:
> > > >> >
> > > >> > What are people's thoughts on having AMD machines tested on
> the
> > > CI?
> > > >> AMD
> > > >> > machines are now available on AWS.
> > > >> >
> > > >> > Best,
> > > >> > Alex
> > > >> >
> > > >> >
> > > >> >
> > > >>
> > > >
> > >
> >
>
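The portability concern in this thread (whether code paths guarded by AVX2/AVX512/f16c checks behave the same on AMD and Intel hosts) can be probed with a small Linux-only diagnostic sketch that reads the flags the kernel reports; this is an illustrative tool, not part of MXNet:

```python
# Diagnostic sketch (Linux-only): list which x86 ISA extensions the kernel
# reports, e.g. to sanity-check that an AVX2/f16c code path is safe on a
# given AMD or Intel host. Not part of MXNet.
import os

def cpu_flags(path="/proc/cpuinfo"):
    """Return the set of CPU feature flags from a cpuinfo-style file."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
for ext in ("sse2", "avx2", "avx512f", "f16c"):
    print(ext, "reported" if ext in flags else "not reported")
```

Running the same check on an Intel and an AMD CI instance would show exactly which extension sets differ, which is the kind of sanity check discussed above.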


Re: Adding AMD CPU to CI

2018-11-29 Thread Anirudh Subramanian
+1

On Thu, Nov 29, 2018 at 2:38 PM Alex Zai  wrote:

> What are people's thoughts on having AMD machines tested on the CI? AMD
> machines are now available on AWS.
>
> Best,
> Alex
>


Re: Include MKLDNN into default mxnet pip package

2018-11-27 Thread Anirudh Subramanian
Hi Tao,

I was suggesting we can start using a release tag from MKL-DNN for major and
minor releases of MXNet, starting with 1.4.0. But this would require a
versioning mechanism similar to semver for MKL-DNN, and for MKL-DNN to do
patch releases to backport bug fixes/regressions. I don't know if this is
going to happen anytime soon (it would be nice if you can obtain a timeline
from the MKL-DNN team on this). As long as pip still has two different
packages, with MKL and without, my vote is +1 for adding it as a default.

Anirudh


On Tue, Nov 27, 2018 at 5:04 AM Lv, Tao A  wrote:

> Hi Anirudh,
>
> Just to confirm, you're focusing on the 1.4.0 release of MXNet and want to
> have a release version of MKL-DNN there, right? Or do you mean all the
> development in the future should base on the release version of MKL-DNN?
> For the former one, I think 0.17 release of MKL-DNN is a good choice. But
> it will not have fix for the LSTM regression mentioned in previous email.
>
> I'm talking about the versioning mechanism with MKL-DNN maintainers and
> will be back to you if I get any response. But from the releasing history
> of MKL-DNN, I cannot find any evidence about patch release.
>
> -tao
>
> -Original Message-
> From: Anirudh Subramanian [mailto:anirudh2...@gmail.com]
> Sent: Tuesday, November 27, 2018 6:16 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: Include MKLDNN into default mxnet pip package
>
> Hi Tao,
>
> I agree with Steffen that we can start with a stable release for MKLDNN
> for 1.4.0. For your suggestion on using 0.17, can you provide info on what
> versioning mechanism MKLDNN uses. Once a MKLDNN release is out and there
> are some regressions found like the LSTM regression, would it be possible
> to do a patch release for it or maintain a release branch for it ?
>
> Anirudh
>
> On Sun, Nov 25, 2018 at 5:03 PM Lv, Tao A  wrote:
>
> > Hi Steffen,
> >
> > I think all the commits on MKL-DNN master branch are well tested for
> > MKL-DNN development team. If we really want to have a release commit
> > in the coming 1.4 mxnet release, my suggestion is 0.17 MKL-DNN release.
> >
> > Thank you,
> > Tao
> >
> > Sent from my iPhone
> >
> > > On Nov 26, 2018, at 8:09 AM, Steffen Rochel
> > > 
> > wrote:
> > >
> > > +1 to make MKL-DNN default.
> > > I'm tracking  https://github.com/apache/incubator-mxnet/issues/13369
> > > as open issue to be addressed for 1.4.0 I do agree that we should
> > > move to a model to include released
> > dependencies
> > > instead of just taking bleeding edge snapshots.
> > > However, speed of development is important as well.
> > > As a compromise for 1.4.0 release with MKL-DNN: can the MKL-DNN
> > development
> > > team provide us with a well tested tag/commit id to include in 1.4.0
> > > release?
> > > Steffen
> > >
> > >> On Wed, Nov 21, 2018 at 11:42 PM Lv, Tao A 
> wrote:
> > >>
> > >> Thanks for the information, Kellen and Naveen.
> > >>
> > >> Better than onnx-tensorrt, MKL-DNN has already provided versioning
> > >> and release tags. My concern is that as MKL-DNN is still under
> > >> intensive development, if it has a new feature or bug fix on its
> > >> master branch,
> > do we
> > >> really want to wait for next release to get it supported in MXNet?
> > >>
> > >> Take the LSTM regression as an example, probably MKL-DNN will give
> > >> a fix or improvement on its master branch soon, do we need to wait
> > >> for 0.18 release to get it fixed for mxnet user? AFAIK, tensorflow
> > >> is also using normal commit id, not release, as the dependency for
> MKL-DNN.
> > >>
> > >> Regarding the LSTM regression, we are using internal JIRA tickets
> > >> rather than github issues to track the defects of MKL-DNN. But I
> > >> agree with
> > you,
> > >> we need update the progress of it in Alex's issue.
> > >>
> > >> Thanks,
> > >> -tao
> > >>
> > >> -Original Message-
> > >> From: kellen sunderland [mailto:kellen.sunderl...@gmail.com]
> > >> Sent: Thursday, November 22, 2018 10:55 AM
> > >> To: dev@mxnet.incubator.apache.org
> > >> Subject: Re: Include MKLDNN into default mxnet pip package
> > >>
> > >> Agree with your point about other repos also not being based on
> > versioning
> > >> Tao.  I would point out that I've given some that I've worked with
> > similar
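The tag-pinning approach discussed in this thread (depending on a released MKL-DNN tag such as v0.17 instead of a tip-of-master commit) can be demonstrated with a self-contained scratch repository; the submodule path 3rdparty/mkldnn and the tag name are illustrative assumptions:

```python
# Self-contained demo, in a scratch repository, of pinning a dependency to a
# released tag instead of a tip-of-master commit. The same pattern applies to
# a submodule checkout such as 3rdparty/mkldnn with a real release tag like
# v0.17 (path and tag are illustrative assumptions).
import subprocess
import tempfile

def git(repo, *args):
    out = subprocess.run(
        ["git", "-C", repo, "-c", "user.email=demo@example.com",
         "-c", "user.name=demo"] + list(args),
        check=True, capture_output=True, text=True)
    return out.stdout.strip()

repo = tempfile.mkdtemp()
git(repo, "init", "-q")
git(repo, "commit", "-q", "--allow-empty", "-m", "released state")
git(repo, "tag", "v0.17")                    # the versioned release point
git(repo, "commit", "-q", "--allow-empty", "-m", "tip of master")
git(repo, "checkout", "-q", "v0.17")         # pin to the release, not the tip
print(git(repo, "describe", "--tags"))       # prints: v0.17
```

Pinning to a tag means a later submodule update reproduces exactly the released state, and a regression fix would arrive as a new tag (e.g. a patch release) rather than an arbitrary commit, which is the versioning guarantee being asked of the MKL-DNN team above.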

Re: Include MKLDNN into default mxnet pip package

2018-11-26 Thread Anirudh Subramanian
Hi Tao,

I agree with Steffen that we can start with a stable release for MKLDNN for
1.4.0. For your suggestion on using 0.17, can you provide info on what
versioning mechanism MKLDNN uses. Once a MKLDNN release is out and there
are some regressions found like the LSTM regression, would it be possible
to do a patch release for it or maintain a release branch for it ?

Anirudh

On Sun, Nov 25, 2018 at 5:03 PM Lv, Tao A  wrote:

> Hi Steffen,
>
> I think all the commits on MKL-DNN master branch are well tested for
> MKL-DNN development team. If we really want to have a release commit in the
> coming 1.4 mxnet release, my suggestion is 0.17 MKL-DNN release.
>
> Thank you,
> Tao
>
> Sent from my iPhone
>
> > On Nov 26, 2018, at 8:09 AM, Steffen Rochel 
> wrote:
> >
> > +1 to make MKL-DNN default.
> > I'm tracking  https://github.com/apache/incubator-mxnet/issues/13369 as
> > open issue to be addressed for 1.4.0
> > I do agree that we should move to a model to include released
> dependencies
> > instead of just taking bleeding edge snapshots.
> > However, speed of development is important as well.
> > As a compromise for 1.4.0 release with MKL-DNN: can the MKL-DNN
> development
> > team provide us with a well tested tag/commit id to include in 1.4.0
> > release?
> > Steffen
> >
> >> On Wed, Nov 21, 2018 at 11:42 PM Lv, Tao A  wrote:
> >>
> >> Thanks for the information, Kellen and Naveen.
> >>
> >> Better than onnx-tensorrt, MKL-DNN has already provided versioning and
> >> release tags. My concern is that as MKL-DNN is still under intensive
> >> development, if it has a new feature or bug fix on its master branch,
> do we
> >> really want to wait for next release to get it supported in MXNet?
> >>
> >> Take the LSTM regression as an example, probably MKL-DNN will give a fix
> >> or improvement on its master branch soon, do we need to wait for 0.18
> >> release to get it fixed for mxnet user? AFAIK, tensorflow is also using
> >> normal commit id, not release, as the dependency for MKL-DNN.
> >>
> >> Regarding the LSTM regression, we are using internal JIRA tickets rather
> >> than github issues to track the defects of MKL-DNN. But I agree with
> you,
> >> we need update the progress of it in Alex's issue.
> >>
> >> Thanks,
> >> -tao
> >>
> >> -Original Message-
> >> From: kellen sunderland [mailto:kellen.sunderl...@gmail.com]
> >> Sent: Thursday, November 22, 2018 10:55 AM
> >> To: dev@mxnet.incubator.apache.org
> >> Subject: Re: Include MKLDNN into default mxnet pip package
> >>
> >> Agree with your point about other repos also not being based on
> versioning
> >> Tao.  I would point out that I've given some that I've worked with
> similar
> >> feedback: https://github.com/onnx/onnx-tensorrt/issues/68
> >>
> >>> On Wed, Nov 21, 2018 at 6:48 PM Naveen Swamy 
> wrote:
> >>>
> >>> Tao,
> >>>
> >>> You are right there are many submodules in 3rd party. We have to start
> >>> somewhere and I believe this one is a good candidate to start with.
> >>> This is not to cater to release of MXNet or to tie them with the
> >>> releases of the submodules but instead to pick only stable releases
> >>> and not to pick up bleeding edge commits from the tip of the master,
> >>> this gives us confidence in the submodule that MXNet users are
> >>> depending on that especially if we make MKLDNN the default.
> >>>
> >>> Good to know it is known already as a regression.Alex has created this
> >>> issue https://github.com/apache/incubator-mxnet/issues/13369, please
> >>> add details and link the corresponding issue in MKLDNN(I couldn't
> find).
> >>>
> >>> -Naveen
> >>>
> >>>> On Wed, Nov 21, 2018 at 6:04 PM Lv, Tao A  wrote:
> >>>>
> >>>> Here are my answers for the questions from Kellen and Naveen about
> >>>> MKL-DNN. It doesn't mean that I'm supportive of making MKL-DNN
> >>>> default here.
> >>>>
> >>>> @Kellen,
> >>>>
> >>>> FYI, here is a list for those platforms which are officially
> >>>> supported by MKL-DNN.
> >>>> https://github.com/intel/mkl-dnn#system-requirements
> >>>>
> >>>> Most of computation intensive kernels in MKL-DNN are JITed. So they
> >>>> are supposed to generate code accor

Re: Splitting Jenkins pipelines - stop changes to Jenkinsfiles!

2018-11-21 Thread Anirudh
Hi Marco,

Can you point out specifically which checks we have to make sure pass
before merging PRs? Currently, apart from the required one, there are six
added steps. Also, is the CI down currently:
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-13324/17/pipeline


Anirudh

On Wed, Nov 21, 2018 at 9:31 AM Marco de Abreu
 wrote:

> Please notice that the "continuous-integration/jenkins/pr-merge" currently
> is overlapping with the new pipelines. Please make sure all checks pass
> (also the non-required ones) before merging the PRs. I will work on a fix
> for this overlap.
>
> -Marco
>
> On Wed, Nov 21, 2018 at 5:42 PM Anton Chernov  wrote:
>
> > The ability to retrigger the pipelines separately is an amazing step
> > forward. Great job Marco!
> >
> > ср, 21 нояб. 2018 г. в 15:03, Marco de Abreu
> > :
> >
> > > Hello,
> > >
> > > the PR has been merged and I've created the new pipelines at [1]. You
> can
> > > see the new reports if you have a look at this example PR at [2].
> > >
> > > The new status messages will be the ones starting with
> > > "ci/jenkins/mxnet-validation/".
> > >
> > > This now allows you to retrigger specific pipelines if they fail. For
> > > example, if you're interested in the website pipeline, you can now go
> to
> > > [3] and just retrigger that instead of running the entire suite.
> Whenever
> > > there's a new commit, all pipelines will still be scheduled as before
> > (the
> > > overall behaviour or coverage of our pipeline did not change, I just
> > > decoupled them and increased the usability).
> > >
> > > The next step will be the deprecation of the main Jenkinsfile (the one
> > > which reports the status as "continuous-integration/jenkins/pr-merge")
> > and
> > > requesting these new statuses to be marked as required (protected
> master
> > > branch). Since we have to change some reporting tools to point to the
> new
> > > jobs and I'd like to observe the stability for some time, this will
> take
> > > some times.
> > >
> > > You can now resume changes in the Jenkinsfiles. But please do not
> modify
> > > the Jenkinsfile in the root directory but instead the ones at [4]. The
> > > nightly Jenkinsfiles (or basically all Jenkinsfiles that are not part
> of
> > > the main pipeline) have not been migrated yet and I will do that at a
> > later
> > > point in time.
> > >
> > > Best regards,
> > > Marco
> > >
> > > [1]: http://jenkins.mxnet-ci.amazon-ml.com/job/mxnet-validation/
> > > [2]: https://github.com/apache/incubator-mxnet/pull/13352
> > > [3]:
> > >
> > >
> >
> http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fwebsite/detail/PR-13352/1/pipeline
> > > [4]: https://github.com/apache/incubator-mxnet/tree/master/ci/jenkins
> > >
> > > On Tue, Nov 20, 2018 at 9:33 PM Marco de Abreu <
> > > marco.g.ab...@googlemail.com>
> > > wrote:
> > >
> > > > I have just submitted my PR at
> > > > https://github.com/apache/incubator-mxnet/pull/13344. Test jobs are
> > > > available at
> > > > http://jenkins.mxnet-ci-dev.amazon-ml.com/view/test-marco-mxnet/.
> > > >
> > > > As soon as I'm done with my tests, I will mark it as ready for
> review.
> > > >
> > > > Best regards,
> > > > Marco
> > > >
> > > > On Tue, Nov 20, 2018 at 9:09 PM Marco de Abreu <
> > > > marco.g.ab...@googlemail.com> wrote:
> > > >
> > > >> Thanks, Pedro!
> > > >>
> > > >> I have also been looking into that issue, but it seems like this
> would
> > > >> require changes in the groovy interpreter of Jenkins. From what I
> can
> > > tell,
> > > >> a refactor will give us multiple benefits (clarity and speed) aside
> > from
> > > >> resolving this issue.
> > > >>
> > > >> Best regards,
> > > >> Marco
> > > >>
> > > >> Am Di., 20. Nov. 2018, 19:54 hat Pedro Larroy <
> > > >> pedro.larroy.li...@gmail.com> geschrieben:
> > > >>
> > > >>> I think this is a big problem, which has blocked us before. I want
> to
> > > >>> point out that you are doing a great thing by avoiding everyone
> > > >>> ge

Re: A New API for creating .rec files

2018-11-21 Thread Anirudh Acharya
Hi All,

Sorry for the delay, but here is the design spec for the API -
https://cwiki.apache.org/confluence/display/MXNET/Image+Transforms+and+RecordIO+file+Creation

Look forward to feedback from the community.


Regards
Anirudh


On Tue, Sep 25, 2018 at 2:15 PM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> This makes a lot of sense to me Anirudh.
>
> On Tue, Sep 25, 2018 at 11:38 AM Anirudh Acharya 
> wrote:
>
> > Hi,
> >
> > During some recent MXNet user surveys, one of the user requests was to
> have
> > an im2rec API with functionality similar to the im2rec tool(
> > https://mxnet.incubator.apache.org/faq/recordio.html?highlight=im2rec).
> > The
> > advantage with the API would be that the user can access this
> functionality
> > from the PyPi package itself, instead of cloning the repo.
> >
> > I was thinking of converting the tool into an API call under the mx.io
> > package. I will send the API design shortly. I wanted to know what the
> > community thinks of this change.
> >
> >
> > Thanks
> > Anirudh Acharya
> >
>


Re: [Question] Difference between "Feature" and "Feature request" labels in Github

2018-11-13 Thread Acharya, Anirudh
Thanks for doing this.

-
Anirudh

On Nov 13, 2018 5:25 PM, Sheng Zha  wrote:
Oh, I see. I was moving the other 80 or so, so it was probably a
race-condition.
Anyway, thanks for being eager to help.

-sz

On Tue, Nov 13, 2018 at 5:24 PM Naveen Swamy  wrote:

> done now, removed the feature label, there were 4 issues with that label
> but also had Feature Request.
>
> On Tue, Nov 13, 2018 at 5:05 PM Anirudh Acharya 
> wrote:
>
> > This issue was raised before here -
> >
> >
> https://lists.apache.org/thread.html/3e988e6bd82cb2d69ba20c21bf763952ed22a5732e61f6fba1f89ac8@%3Cdev.mxnet.apache.org%3E
> >
> > We need someone with committer privileges to fix it.
> >
> >
> > Thanks
> > Anirudh
> >
> >
> >
> > On Tue, Nov 13, 2018 at 4:36 PM Lin Yuan  wrote:
> >
> > > Dear Community,
> > >
> > > I often see there are "Feature" and "Feature request" labels in Github
> > > issues. May I know the difference? If they are meant to be the same
> > thing,
> > > can we only keep one of them?
> > >
> > > Thanks,
> > >
> > > Lin
> > >
> >
>


Re: [Question] Difference between "Feature" and "Feature request" labels in Github

2018-11-13 Thread Anirudh Acharya
This issue was raised before here -
https://lists.apache.org/thread.html/3e988e6bd82cb2d69ba20c21bf763952ed22a5732e61f6fba1f89ac8@%3Cdev.mxnet.apache.org%3E

We need someone with committer privileges to fix it.


Thanks
Anirudh



On Tue, Nov 13, 2018 at 4:36 PM Lin Yuan  wrote:

> Dear Community,
>
> I often see there are "Feature" and "Feature request" labels in Github
> issues. May I know the difference? If they are meant to be the same thing,
> can we only keep one of them?
>
> Thanks,
>
> Lin
>


Re: Nightly/Weekly tests for examples

2018-11-12 Thread Anirudh Acharya
Hi Ankit,

I have a few concerns about testing examples. Before writing tests for
examples,

   - You will first need to decide what constitutes a test for an example.
   Examples are not API calls with return statements, where a test can just
   call the API and assert on certain values. Merely checking that an example
   is a compilable Python script will not add much value in my opinion.
   - Testing for example outputs and results will require a rewrite of many
   of the examples, because many of them currently just have print statements
   as output and do not return any values. I am not sure it is worth the
   dev-effort.
   - The current set of examples in the mxnet repo is very diverse - some are
   written as Python notebooks, some are just Python scripts with paper
   implementations, and some are just illustrations of certain MXNet features.
   I am curious to know how you will write tests for these things.


Looking forward to seeing the design of this test bed/framework.
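For illustration, a minimal smoke-test harness along the lines being discussed might look like the sketch below. It only checks that an example script runs to completion (exit code 0), sidestepping the "what constitutes a test" question for output validation; the `MXNET_TEST_EPOCHS` environment-variable convention for shortening long training runs is purely hypothetical, not an actual MXNet mechanism.

```python
import os
import subprocess
import sys
import tempfile

def smoke_test_example(script_path, timeout=300, env_overrides=None):
    """Run an example script as a subprocess and report pass/fail.

    This only checks that the script finishes with exit code 0; it does
    not validate model accuracy or printed output.
    """
    env = dict(os.environ)
    # Hypothetical convention: examples could read MXNET_TEST_EPOCHS to
    # shorten long training loops when run under the smoke-test harness.
    env.update(env_overrides or {})
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True, text=True, timeout=timeout, env=env,
    )
    return result.returncode == 0, result.stderr

# Minimal usage demo with a stand-in "example" script.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("print('training for', 1, 'epoch')\n")
    path = f.name
ok, err = smoke_test_example(path, env_overrides={"MXNET_TEST_EPOCHS": "1"})
os.unlink(path)
print("PASS" if ok else "FAIL")  # -> PASS
```

A real harness would additionally need per-example timeouts and a skip-list for notebooks and GPU-only scripts.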


Thanks
Anirudh Acharya

On Fri, Nov 9, 2018 at 2:39 PM Marco de Abreu
 wrote:

> Hello Ankit,
>
> that's a great idea! Using the tutorial tests as reference is a great
> starting point. If you are interested, please don't hesitate to attend the
> Berlin user group in case you would like to discuss your first thoughts
> in-person before drafting a design.
>
> -Marco
>
>
> Am Fr., 9. Nov. 2018, 23:23 hat khedia.an...@gmail.com <
> khedia.an...@gmail.com> geschrieben:
>
> > Hi MXNet community,
> >
> > Recently, I and a few other contributors focussed on fixing examples in
> > our repository which were not working out of the box as expected.
> > https://github.com/apache/incubator-mxnet/issues/12800
> > https://github.com/apache/incubator-mxnet/issues/11895
> > https://github.com/apache/incubator-mxnet/pull/13196
> >
> > Some of the examples failed after API changes and remained uncaught until
> > a user reported the issue. While the community is actively working on
> > fixing it, it might re-occur after few days if we don’t have a proper
> > mechanism to catch regressions.
> >
> > So, I would like to propose to enable nightly/weekly tests for the
> > examples similar to what we have for tutorials to catch any such
> > regressions. The test could check only basic functionalities/working of
> the
> > examples. It can run small examples completely whereas it can run long
> > training examples for only few epochs.
> >
> > Any thoughts from the community? Any other suggestions for fixing the
> same?
> >
> > Regards,
> > Ankit Khedia
> >
>


Re: Run Sphinx checks on MXNet CI

2018-11-11 Thread Anirudh Acharya
Thanks for the reply, Aaron. Once the existing Sphinx errors are fixed and
the codebase is cleaned up, let's definitely revisit this and try to add a
Sphinx build to the CI pipeline so that we can prevent MXNet documentation
from breaking.

Thanks
Anirudh

On Thu, Nov 8, 2018 at 5:16 PM Aaron Markham 
wrote:

> Hi Anirudh,
>
> Once the existing errors in docs building are cleaned up, I'm all for
> having CI bubble up a build break when docs are broken by a PR. That
> way we're keeping things up to date and not letting a minor bug turn
> into a serious issue for the entire API documentation. One break
> causes a ripple effect that can go unnoticed and then trying to find
> it is like a needle in a haystack when there are already thousands of
> warnings or errors that are being ignored.
>
> We now have several docs troubleshooting tips in a documentation guide
> thanks to the contributions of @frankfliu, @Roshrini, @vandanavk,
> @vdantu, @vrakesh, and @zachgk.
>
> This documentation guide is published on the developer cwiki:
> https://cwiki.apache.org/confluence/display/MXNET/Documentation+Guide
>
> I plan to continue to add to this guide as more PRs come in that
> exhibit new ways of handling errors or warnings. That way, creating
> docs and troubleshooting any build issues will be much easier.
>
> Please let me know if you have any questions or feedback. You can
> always add that directly to the wiki too.
>
> Cheers,
> Aaron
>
>
> On Thu, Nov 8, 2018 at 11:21 AM Anirudh Acharya 
> wrote:
> >
> > Hi,
> >
> > Recently there was a barrage of issues related to documentation that was
> > raised here -
> > https://github.com/apache/incubator-mxnet/issues/created_by/aaronmarkham
> > All the issues are related to Sphinx errors and warnings. These errors
> > often lead to broken documentation. Ideally such errors should be caught
> > before a PR gets merged, on the CI.
> >
> > Since we use Sphinx to generate the documentation for MXNet, can we have
> > the CI run Sphinx tests on every PR so that we can preempt the problem of
> > broken documentation.
> >
> > Any thoughts from the community? What might be involved to make this
> change
> > to the CI?
> >
> >
> > Regards
> > Anirudh Acharya
>


Re: Map OpenCV assertions to mxnet::Error

2018-11-08 Thread Anirudh
Hi Lieven,

Thanks a lot for this proposal, and welcome to the community! Apologies for
the delay in the reply.
I think it is a nice proposal, and OpenCV exceptions are a good point to
start from.
Would you be able to add the proposal to a new cwiki page, or to the
existing cwiki page that you linked?
I suggest you add your new error struct to io.h. Also, I don't think you
would need to make any frontend changes for this proposal.

Would you also be willing to add a phase 2 for the proposal which addresses
the following:

1. How will these errors be propagated to the frontend ? We need to have a
mapping of error codes from backend to frontend to communicate what kind of
exception it is.
2. Handling of std::exception.

Anirudh



On Sun, Nov 4, 2018 at 2:54 AM Lieven Govaerts  wrote:

> Hi MXNet devs,
>
>
> I'd like some feedback on the following proposal before I start
> implementing it.
>
> Context:
> I am working on migrating a classification product currently using Caffe to
> MXNet. Along the way I'm encountering some issues loading and augmenting
> the images dataset.
>
> Basically it seems my dataset contains some technically invalid images.
> When loading them using mx.io.ImageRecordIter (from a Python script), they
> get passed eventually to the OpenCV library which will throw a C++
> exception. MXNet currently doesn't capture those, resulting in my script
> aborting with a not very clear error message:
>
> "
> terminate called after throwing an instance of 'cv::Exception'
>
>   what():  OpenCV(3.4.3)
> /home/lgo/dev/opencv-3.4.3/modules/imgproc/src/resize.cpp:4044: error:
> (-215:Assertion failed) !ssize.empty() in function 'resize'
>
> Aborted (core dumped)
> "
>
> These type of issues have been reported before and I see a high level
> action plan has been documented in the wiki:
>
> https://cwiki.apache.org/confluence/display/MXNET/Improved+Exception+Handling+in+MXNet+-+Phase+2
>
> See also my previous pull request, which prevents OpenCV assertions by
> re-implementing the same checks in MXNet code:
> https://github.com/apache/incubator-mxnet/pull/12999
>
>
> As I'm focused now on data loading and OpenCV, I would like to propose the
> following implementation steps:
> 1. Catch cv:exception in all calls to OpenCV functions that can raise one
> (cv::resize, cv::imdecode, cv::addWeighted, cv::mean, cv::copyMakeBorder,
> cv::warpAffine ..)
> => a new macro CHECK_CV_NO_ASSERT
>
> 2. Create a new mxnet::Error class for OpenCV exceptions. Map the
> cv::exception fields to this new Error class: code, err, file, func, line,
> msg, what.
> Make the CHECK_CV_NO_ASSERT macro throw this new mxnet::Error.
> => struct OpenCVError: public dmlc::Error
>
> 3. Add unit tests where possible.
>
> Scope: There are many calls to OpenCV function in different parts of the
> MXNet code. I plan to focus on:
> - src/io/image_*
> - src/ndarray/ndarray.cc
> - plugin/opencv/cv_api.cc
>
> The other modules (R-package, cpp-package, example, julia, tools,
> plugin/sframe) are related to programming languages I don't use. The sframe
> plugin is not documented at all so it's not clear what it does (or why
> you'd keep it in the repo).
>
> Is include/mxnet/base.h a good place to define the new macro and Error
> struct? I'm not sure which include file is visible in all places where
> OpenCV calls are currently used.
>
> Some assumptions:
> - The public API may contain references to 3rd party library OpenCV
> - There is some value in knowing if an Error is the result of a call to the
> OpenCV library. If not, I might as well wrap std::Exception in a more
> generic way.
>
> If I just make these changes the main process will still abort, but now at
> least with a clear error message + stack trace(*). Updating all processing
> codes to handle OpenCVError's correctly is a next step, outside the scope
> of this proposal.
>
> regards,
>
> Lieven
>
>
> (*) Example stack trace:
>
> [23:31:30] src/io/iter_image_recordio_2.cc:172: ImageRecordIOParser2:
> ./train.txt.rec, use 1 threads for decoding..
>
> [23:31:34] src/io/iter_image_recordio_2.cc:172: ImageRecordIOParser2:
> ./val.txt.rec, use 1 threads for decoding..
>
> Traceback (most recent call last):
>
>   File "./test_train_carmodel_resnet.py", line 126, in 
>
> for i, batch in enumerate(train_data):
>
>   File "/home/lgo/dev/incubator-mxnet/python/mxnet/io/io.py", line 228, in
> __next__
>
> return self.next()
>
>   File "/home/lgo/dev/incubator-mxnet/python/mxnet/io/io.py", line 856, in
> next
>
> check_call(_LIB.MXDataIterNext(self.handle, ctypes.byref(next_res)))
>
>   File "/home/lgo/dev/incubator-

Run Sphinx checks on MXNet CI

2018-11-08 Thread Anirudh Acharya
Hi,

Recently there was a barrage of issues related to documentation that was
raised here -
https://github.com/apache/incubator-mxnet/issues/created_by/aaronmarkham
All the issues are related to Sphinx errors and warnings. These errors
often lead to broken documentation. Ideally such errors should be caught
before a PR gets merged, on the CI.

Since we use Sphinx to generate the documentation for MXNet, can we have
the CI run Sphinx tests on every PR so that we can preempt the problem of
broken documentation.

Any thoughts from the community? What might be involved to make this change
to the CI?
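As a rough illustration of what such a CI gate could check, here is a sketch that scans a Sphinx build log and collects warnings and errors. The log format assumed here is Sphinx's default `path:line: WARNING: message` console style; an actual CI step would more likely just run `sphinx-build -W` to turn warnings into build failures.

```python
import re

# Sphinx's default console format, e.g.
#   docs/api/ndarray.md:12: WARNING: undefined label: 'ndarray-api'
_PROBLEM = re.compile(
    r"^(?P<loc>[^:\n]+:\d+): (?P<level>WARNING|ERROR): (?P<msg>.*)$"
)

def sphinx_problems(build_log):
    """Return a list of (location, level, message) tuples found in the log."""
    problems = []
    for line in build_log.splitlines():
        m = _PROBLEM.match(line.strip())
        if m:
            problems.append((m.group("loc"), m.group("level"), m.group("msg")))
    return problems

log = """\
Running Sphinx v1.8
docs/api/ndarray.md:12: WARNING: undefined label: 'ndarray-api'
build succeeded, 1 warning.
"""
found = sphinx_problems(log)
print(found)  # one WARNING entry from docs/api/ndarray.md:12
```

A CI job could fail the build whenever `sphinx_problems` returns a non-empty list, once the existing backlog of warnings is cleared.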


Regards
Anirudh Acharya


Re: MXNet - Label Bot functionality

2018-11-01 Thread Anirudh Acharya
Hi Harsh,

Thanks for working on this. This will be very helpful for people who triage
issues and review PRs regularly.

Few concerns from this design document -
https://cwiki.apache.org/confluence/display/MXNET/Machine+Learning+Based+GitHub+Bot
and
the conversation in the comment section

   1. As the scope of the label bot increases, the need for a safety checks
   on who the label bot listens to becomes important. Currently the bot just
   adds labels. You have proposed that the bot also be allowed to remove and
   update labels. And I think allowing the bot to close issues( with a command
   like @mxnet-label-bot close) is in discussion in the comments section. This
   opens up a serious security flaw - anyone who wishes to abuse this system
   can randomly start closing issues or removing labels. You need to come up
   with a solution so that the bot does not listen to random strangers on the
   internet.
   2. Also you seem to have linked a document that goes to an internal
   amazon website, please remove that.
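A safety check along the lines of point 1 could be as simple as the following sketch: ignore commands from anyone not on an allow-list before even parsing them. The command grammar matches the `@mxnet-label-bot, add ['label1', 'label2']` syntax discussed in this thread, but the usernames and the parsing details are hypothetical, not the bot's actual implementation.

```python
import re

# Hypothetical allow-list; in practice this could be the committer list
# fetched from the repository rather than hard-coded.
AUTHORIZED_USERS = {"committer-a", "committer-b"}

_CMD = re.compile(
    r"@mxnet-label-bot,?\s+(?P<action>add|remove|update)\s+\[(?P<labels>[^\]]*)\]"
)

def parse_command(comment_author, comment_body):
    """Return (action, labels) for a valid, authorized label-bot command,
    else None."""
    if comment_author not in AUTHORIZED_USERS:
        return None  # ignore strangers entirely
    m = _CMD.search(comment_body)
    if not m:
        return None
    labels = [
        part.strip().strip("'\"")
        for part in m.group("labels").split(",")
        if part.strip()
    ]
    return m.group("action"), labels

print(parse_command("committer-a", "@mxnet-label-bot, add ['Bug', 'Python']"))
# -> ('add', ['Bug', 'Python'])
print(parse_command("stranger", "@mxnet-label-bot, remove ['Bug']"))
# -> None
```

Checking authorization before parsing keeps the attack surface small: unauthorized comments never reach the command grammar at all.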


Thanks
Anirudh


On Thu, Oct 18, 2018 at 1:51 PM Harsh Patel 
wrote:

> Hey,
> After having my PR vetted and reviewed by some contributors, I would like
> to move forward into the stage of putting this into production. I am asking
> for MXNet committers to take a look at my PR regarding the Label Bot.
> https://github.com/MXNetEdge/mxnet-infrastructure/pull/15. This will also
> require access for a webhook - let's set this into motion. Thanks.
>
> Best,
> -Harsh
>
> On Mon, Oct 15, 2018 at 4:05 PM Piyush Ghai  wrote:
>
> > Hi Harsh,
> >
> > Good job! This is super cool! Especially bringing down the response time
> > to under 20 seconds.
> >
> > Thanks,
> > Piyush
> >
> >
> > > On Oct 15, 2018, at 3:49 PM, Qing Lan  wrote:
> > >
> > > Hi Harsh,
> > >
> > > This new label bot design looks great! I would like to encourage people
> > to review it and move forward to benefit the MXNet community.
> > > Since this new design needs webhook support from Apache, let's go
> > through the following steps to get this done:
> > >
> > > 1. Demo and contributors review stage: all contributors are encouraged
> > to review the PR here:
> > > https://github.com/MXNetEdge/mxnet-infrastructure/pull/15 and leave
> > your thoughts so Harsh can apply them in his design.
> > > 2. Committers review stage: Once all contributors think the design is
> > good to go, let's get committers involved to get a review.
> > > 3. Committers send request to Apache Infra to get the webhook setup.
> > > 4. Harsh finally deploy the model and all of us can use it in
> > incubator-mxnet repo!
> > >
> > > Some fun fact I would like to share:
> > > 1. This new bot can recommend labels and reply to people who file it!
> > > 2. It response time from 5mins -> less than 20 seconds
> > >
> > > Thanks,
> > > Qing
> > >
> > > On 10/15/18, 11:11 AM, "Harsh Patel" 
> > wrote:
> > >
> > >Hey,
> > >I have a demo available that users and developers can play around
> > with --
> > >this is in regards to the post I had made regarding the updated
> label
> > bot
> > >functionality. This is available on my fork (
> > >https://github.com/harshp8l/incubator-mxnet) if the developers
> would
> > be
> > >able to provide feedback that would be great.
> > >The updated usage of this label bot:
> > >To add labels: @mxnet-label-bot, add ['label1', 'label2']
> > >To remove labels: @mxnet-label-bot, remove ['label1', 'label2']
> > >To update labels: @mxnet-label-bot, update ['label3', 'label4']
> > >(warning: with update - this will remove all other labels for a
> > specific
> > >issue and update only with the labels the user specifies). Thanks.
> > >
> > >My PR for reference:
> > >https://github.com/MXNetEdge/mxnet-infrastructure/pull/15
> > >
> > >My Design:
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Machine+Learning+Based+GitHub+Bot
> > >
> > >Best,
> > >-Harsh
> > >
> > >On Mon, Oct 15, 2018 at 12:54 AM Hagay Lupesko 
> > wrote:
> > >
> > >> +1
> > >> Thanks for the contribution!
> > >>
> > >> On Fri, Oct 12, 2018 at 1:41 AM kellen sunderland <
> > >> kellen.sunderl...@gmail.com> wrote:
> > >>
> > >>> Awesome work!  Many thanks.

Re: [Discussion] Recognise Reviewers, Besides Committers and PMC

2018-10-23 Thread Anirudh
 However, it is very hard to get contributors to do code reviews
> unless
> >> we
> >> >> solicit them. It is definitely harder than getting code
> >> contributions.  The
> >> >> Reviewer mechanism could provide a way to do so. We can recognize
> >> >> contributors, bring them as reviewers and encourage them to do the
> code
> >> >> reviews by explicitly soliciting. The reviewers can learn from the
> >> >> committer reviews,
> >> >> which serves as a role model for what is being expected. Naturally,
> >> this
> >> >> likely helps us find more good reviewers and bought them committer.
> >> >>
> >> >> Cheers
> >> >> Tianqi
> >> >>
> >> >> On Mon, Oct 22, 2018 at 1:09 PM Anirudh 
> wrote:
> >> >>
> >> >>> -1. I dont see the need for additional level of hierarchy. I totally
> >> am
> >> >> for
> >> >>> recognizing good code reviewers. We can recognize this by making
> them
> >> >>> committers. Being a good reviewer should be sufficient to become a
> >> >>> committer in my opinion. (Assuming that there is a seperation
> between
> >> >> PPMC
> >> >>> and committers).
> >> >>>
> >> >>> Anirudh
> >> >>>
> >> >>> On Mon, Oct 22, 2018 at 8:28 AM Qing Lan 
> wrote:
> >> >>>
> >> >>>> +1
> >> >>>> Let's have a reviewer list somewhere with a certain format: such as
> >> >> C++,
> >> >>>> Gluon, Scala/Java based on language or some other category. etc. In
> >> the
> >> >>>> future, label bot would automatically assign reviewers based on
> this
> >> >> kind
> >> >>>> of documentation.
> >> >>>>
> >> >>>> Thanks,
> >> >>>> Qing
> >> >>>>
> >> >>>> On 10/21/18, 11:44 PM, "YiZhi Liu"  wrote:
> >> >>>>
> >> >>>> +1
> >> >>>> I also suggest add reviewer list link to the PR template, so
> that
> >> >>>> developers can easily request review from those reviewers.
> >> >>>> On Sun, Oct 21, 2018 at 8:30 PM Tianqi Chen  >
> >> >>> wrote:
> >> >>>> >
> >> >>>> > I was suggesting something more concrete:
> >> >>>> >
> >> >>>> > - Add a Reviewers section to
> >> >>>> >
> >> >>>>
> >> https://github.com/apache/incubator-mxnet/blob/master/CONTRIBUTORS.md
> >> >> to
> >> >>>> > list a list of Reviewers.
> >> >>>> > - This is a "pesudo role", but holds weight as committers
> >> >>> should
> >> >>>> highly
> >> >>>> > value their reviews during the PR process.
> >> >>>> > - The committers/PMC could actively look for good
> contributors
> >> >> and
> >> >>>> nominate
> >> >>>> > them as Reviewer.
> >> >>>> > - Contributors are encouraged to seek reviews from the list
> of
> >> >>>> reviewers.
> >> >>>> > - The committers should actively solicit code reviews from
> the
> >> >>>> reviewers
> >> >>>> > when reviewing PRs and take their reviews into serious
> >> >>> consideration.
> >> >>>> >
> >> >>>> > - PMCs should actively look for new committers in the current
> >> >>>> Reviewers
> >> >>>> >- Notably, the history reviews plus contribution likely
> will
> >> >>>> provide a
> >> >>>> > good indication on whether the person can uphold the quality
> >> >>>> standard of
> >> >>>> > the codebase, and provide helpful feedbacks(which is the
> trait
> >> >> that
> >> >>>> needed
> >> >>>> > from committer to merge code)
> >> >>>> >
> >> >>>> > Tianqi
> >> >>>> >
> >> >>

Re: [Discussion] Recognise Reviewers, Besides Committers and PMC

2018-10-22 Thread Anirudh
-1. I don't see the need for an additional level of hierarchy. I am totally
for recognizing good code reviewers. We can recognize this by making them
committers. Being a good reviewer should be sufficient to become a
committer in my opinion. (Assuming that there is a separation between PPMC
and committers.)

Anirudh

On Mon, Oct 22, 2018 at 8:28 AM Qing Lan  wrote:

> +1
> Let's have a reviewer list somewhere with a certain format: such as C++,
> Gluon, Scala/Java based on language or some other category. etc. In the
> future, label bot would automatically assign reviewers based on this kind
> of documentation.
>
> Thanks,
> Qing
>
> On 10/21/18, 11:44 PM, "YiZhi Liu"  wrote:
>
> +1
> I also suggest adding a reviewer list link to the PR template, so that
> developers can easily request reviews from those reviewers.
> On Sun, Oct 21, 2018 at 8:30 PM Tianqi Chen  wrote:
> >
> > I was suggesting something more concrete:
> >
> > - Add a Reviewers section to
> >
> https://github.com/apache/incubator-mxnet/blob/master/CONTRIBUTORS.md to
> > list a list of Reviewers.
> > - This is a "pesudo role", but holds weight as committers should
> highly
> > value their reviews during the PR process.
> > - The committers/PMC could actively look for good contributors and
> nominate
> > them as Reviewer.
> > - Contributors are encouraged to seek reviews from the list of
> reviewers.
> > - The committers should actively solicit code reviews from the
> reviewers
> > when reviewing PRs and take their reviews into serious consideration.
> >
> > - PMCs should actively look for new committers in the current
> Reviewers
> >- Notably, the history reviews plus contribution likely will
> provide a
> > good indication on whether the person can uphold the quality
> standard of
> > the codebase, and provide helpful feedbacks(which is the trait that
> needed
> > from committer to merge code)
> >
> > Tianqi
> >
> >
> > On Sun, Oct 21, 2018 at 5:13 PM Steffen Rochel <
> steffenroc...@gmail.com>
> > wrote:
> >
> > > +1
> > > With the release announcement for MXNet 1.3 all contributors incl.
> code
> > > reviewers have been recognized. I suggest all future release
> announcements
> > > should include such recognition. Are you suggesting to highlight
> most
> > > active reviewers in release announcement or regularly (e.g.
> monthly),
> > > specifically from non-committers?
> > >
> > > On Sun, Oct 21, 2018 at 10:11 AM Tianqi Chen 
> wrote:
> > >
> > > > Also re another email-thread(I sent out one with my
> institutional email
> > > > which get blocked initially, so this one was a bit duplication
> of that).
> > > I
> > > > think it should really be the job of committers to recognize
> potential
> > > > reviewers, github also makes it easier to do so, e.g.
> > > >
> > > >
> > >
> https://github.com/apache/incubator-mxnet/pulls?utf8=%E2%9C%93=reviewed-by%3Apiiswrong
> > > >
> > > > Tianqi
> > > >
> > > > On Fri, Oct 19, 2018 at 12:05 PM Carin Meier <
> carinme...@gmail.com>
> > > wrote:
> > > >
> > > > > +1 Great idea. Adding a name to the contributor list is a good
> idea.
> > > > Also,
> > > > > I've found that thanking the person for the review on the PR
> is another
> > > > way
> > > > > to express gratitude for their time and effort.
> > > > >
> > > > > On Fri, Oct 19, 2018 at 2:51 PM Tianqi Chen 
> wrote:
> > > > >
> > > > > > Dear MXNet Community:
> > > > > >
> > > > > > There is a great discussion going on in terms of lowering
> the barrier
> > > > of
> > > > > > entries and encourage more contribution to the project.  One
> of the
> > > > > general
> > > > > > goals is to encourage a broader pool of contributions. I
> want to make
> > > > the
> > > > > > following proposal:
> > > > > >
> > > > > > Besides Committers and PMC, let us also recognize Reviewers
> in the
> > > > > > community.

A New API for creating .rec files

2018-09-25 Thread Anirudh Acharya
Hi,

During some recent MXNet user surveys, one of the user requests was to have
an im2rec API with functionality similar to the im2rec tool(
https://mxnet.incubator.apache.org/faq/recordio.html?highlight=im2rec). The
advantage with the API would be that the user can access this functionality
from the PyPi package itself, instead of cloning the repo.

I was thinking of converting the tool into an API call under the mx.io
package. I will send the API design shortly. I wanted to know what the
community thinks of this change.
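To make the tool-to-API idea concrete, here is a toy sketch of the core operation such an API wraps: packing (label, image-bytes) pairs into a single length-prefixed record file and reading them back. This is only a conceptual stand-in; MXNet's real RecordIO format and the eventual im2rec API signature would be defined by the design doc, not by this sketch.

```python
import io
import struct

def pack_records(items):
    """Pack (label, payload_bytes) pairs into one length-prefixed blob.

    Each record is: 4-byte little-endian label, 4-byte little-endian
    payload length, then the payload. (Toy format, not MXNet RecordIO.)
    """
    buf = io.BytesIO()
    for label, payload in items:
        buf.write(struct.pack("<ii", label, len(payload)))
        buf.write(payload)
    return buf.getvalue()

def unpack_records(blob):
    """Inverse of pack_records."""
    out, offset = [], 0
    while offset < len(blob):
        label, length = struct.unpack_from("<ii", blob, offset)
        offset += 8
        out.append((label, blob[offset:offset + length]))
        offset += length
    return out

records = [(0, b"\x89PNG-fake"), (1, b"JPEG-fake-bytes")]
blob = pack_records(records)
print(unpack_records(blob) == records)  # -> True
```

Exposing this as a library function instead of a command-line tool is exactly the kind of change proposed here: the same logic, callable from the PyPI package without cloning the repo.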


Thanks
Anirudh Acharya


Re: [VOTE] Release MXNet version 1.3.0.RC0

2018-09-06 Thread Anirudh
-1. Considering that using fp16 with Gluon is much easier than the
alternative, where you need access to the model code, this fix is really
useful. I understand the pain of doing an MXNet release and appreciate
Roshani's and Sheng's efforts, but this seems like something we should fix.

On Thu, Sep 6, 2018, 4:57 PM Haibin Lin  wrote:

> +1 built from source and passes dist_sync_kvstore test on Ubuntu.
>
> Best,
> Haibin
>
> On Thu, Sep 6, 2018 at 1:32 PM Indhu  wrote:
>
> > +1
> >
> > The release candidate looks good. I'm able to build and run basic models.
> >
> > One the FP16 issue:
> >
> > Like others have pointed out, releases are expensive in terms of time and
> > effort. There needs to be a high and objective bar on what qualifies
> > as a release blocker, to make sure we are not setting a precedent for a
> > lot of release blockers in the future.
> >
> > I think a release blocker is justified only if there is a serious bug
> > discovered in one of the features included in the release or if there is
> a
> > regression. Given that FP16 support is not a new feature claimed in this
> > release and this is not a regression in this release candidate, I'm
> > inclined to release this candidate and include the FP16 fix in a
> subsequent
> > release.
> >
> > Thanks,
> > Indu
> >
> > On Wed, Sep 5, 2018 at 10:21 AM Aaron Markham  >
> > wrote:
> >
> > > 0 (non-binding) If we have a problem that blocks users, and a solution
> in
> > > hand... then we should fix it, but not at the expense of starting the
> > > release cycle again just for one fix. Users can cherry pick or build
> from
> > > master if they want the fix right away, right? I'd change my mind to -1
> > if
> > > this wasn't the case, with good reason, and if the user impact was
> > critical
> > > to adoption or risks abandonment.
> > >
> > >
> > > On Wed, Sep 5, 2018 at 9:57 AM Roshani Nagmote <
> > roshaninagmo...@gmail.com>
> > > wrote:
> > >
> > > > I believe everyone here is working hard to make MXNet a better
> > framework
> > > > for users. It's completely okay to have different opinions, we can
> > decide
> > > > together if this issue is a blocker or not after voting time is over.
> > > >
> > > > As I mentioned before, voting will end at 7 pm today. So there is
> still
> > > > time to test the release. If there are any other issues anyone
> finds, I
> > > > will be happy to start the process again and work on RC1. For now, I
> > want
> > > > to encourage everyone to utilize this time and vote. :)
> > > >
> > > > Thanks,
> > > > Roshani
> > > >
> > > > On Tue, Sep 4, 2018 at 10:35 PM sandeep krishnamurthy <
> > > > sandeep.krishn...@gmail.com> wrote:
> > > >
> > > > >1. As an Apache MXNet community member, I raised the concern of
> > > broken
> > > > >functionality for the user. I explained and provided the data
> > points
> > > > on
> > > > > the
> > > > >issue, workaround and why I think it is important. If after all
> > > this,
> > > > > you
> > > > >think my vote is biased toward my employer just because a user I
> > quoted
> > > is
> > > > > from
> > > > >Amazon, that is even more concerning to me regarding my voting abilities.
> > > > >2. My -1 in no way undermines the huge amount of effort that goes
> > > > behind
> > > > >the scene for a release to happen. Great respect and recognition
> > for
> > > > >everyone involved in all the releases of MXNet in the past and
> > > this. I
> > > > >voted on my judgement of what may be good for the users of
> MXNet.
> > > > >3. As pointed out by Naveen & Chris, -1s are NOT vetoes. Feel free to
> > > decide
> > > > >and progress on the release as we already have >3 +1 in this
> > thread.
> > > > >
> > > > >
> > > > > Best,
> > > > >
> > > > > Sandeep
> > > > >
> > > > > On Tue, Sep 4, 2018 at 8:29 PM Chris Olivier <
> cjolivie...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > btw, there are no vetoes on package releases:
> > > > > >
> > > > > > VOTES ON PACKAGE RELEASES
> > > > > > 
> > > > > >
> > > > > > Votes on whether a package is ready to be released use majority
> > > > approval
> > > > > > <
> https://www.apache.org/foundation/glossary.html#MajorityApproval>
> > > --
> > > > > i.e.
> > > > > > at least three PMC members must vote affirmatively for release,
> and
> > > > there
> > > > > > must be more positive than negative votes. Releases may not be
> > vetoed.
> > > > > > Generally
> > > > > > the community will cancel the release vote if anyone identifies
> > > serious
> > > > > > problems, but in most cases the ultimate decision lies with the
> > > > > individual
> > > > > > serving as release manager. The specifics of the process may vary
> > > from
> > > > > > project to project, but the 'minimum quorum of three +1 votes'
> rule
> > > is
> > > > > > universal.
> > > > > >
> > > > > > On Tue, Sep 4, 2018 at 7:12 PM Sheng Zha 
> > wrote:
> > > > > >
> > > > > > > Thanks for sharing your opinions, Thomas. Your recognition 

Re: Propose to discontinue supporting Apache MXNet on Windows 7

2018-08-28 Thread Anirudh Acharya
+1 for discontinuing.

On Tue, Aug 28, 2018 at 4:11 PM Naveen Swamy  wrote:

> +1 to stop supporting Win7
>
> On Tue, Aug 28, 2018 at 3:54 PM Lin Yuan  wrote:
>
> > Dear Community,
> >
> >
> >
> > Currently, our MXNet installation guide for Windows does not work for
> > Windows 7; e.g., Microsoft Visual Studio 2015 is not supported on Windows
> 7
> > <
> >
> https://visualstudio.microsoft.com/vs/support/vs2015/received-error-specified-program-requires-newer-version-windows/
> > >.
> > In addition, MSFT ended “Mainstream” support for Windows 7 in 2015 (
> >
> https://support.microsoft.com/en-us/help/13853/windows-lifecycle-fact-sheet
> > ).
> > Therefore, it is not possible for developers to build MXNet and verify
> the
> > fix on Windows 7 platform. Given that there have been several issues
> about
> > MXNet errors on Windows 7 (issue #9271
> > , issue #8921
> > , issue #11163
> > ), it will even
> > add
> > more burden on developers in the future if we were to continue supporting
> > Windows 7.
> >
> >
> >
> > I therefore would like to propose that we discontinue the support of
> MXNet
> > on Windows 7 in the next release.
> >
> >
> > Specifically, this means the following required actions:
> >
> > 1) state the discontinuation of Windows 7 support in the release note
> >
> > 2) update the MXNet webpage if Windows version is mentioned.
> >
> > 3) update the open Github issues related to Windows 7
> >
> >
> > Please share your thoughts about this proposal and/or suggest if there is
> > any other missing action item from the above.
> >
> >
> > Best Regards,
> >
> >
> > Lin
> >
>


Re: Duplication of Operators for sampling from random distributions

2018-07-24 Thread Anirudh Acharya
Thanks for the reply and the clarification, Haibin.

On Tue, Jul 24, 2018 at 2:31 PM Haibin Lin  wrote:

> Hi Anirudh,
>
> Thanks for asking this on dev@. I looked at the doc for sample_uniform and
> random_uniform, and found that the API is different. For sample_uniform,
> the type of arguments `low` and `high` is NDArray, while that of
> random_uniform's is float. I don't think they're going to be deprecated.
>
> The recommended API to generate a random number is via the ndarray.random.*
> or symbol.random.*, which accept both float and NDArray, and under the hood
> invoke either sample_xxx or random_xxx correspondingly.
>
> Best,
> Haibin
>
> On Mon, Jul 23, 2018 at 1:42 PM, Anirudh Acharya 
> wrote:
>
> > Hi All,
> >
> > I had earlier filed an issue with functionality-duplication/code-refactor
> > here - https://github.com/apache/incubator-mxnet/issues/11811
> >
> > As per the suggestion in the github issue I would like to bring it to the
> > attention of the wider community -
> >
> > The operators defined in sample_op.cc and multisample_op.cc are seemingly
> > performing the same tasks. Both these files define the following
> operators
> > respectively
> >
> > sample_op.cc
> > ---
> > random_uniform
> > random_normal
> > random_gamma
> > random_exponential
> > random_poisson
> > random_negative_binomial
> > random_generalized_negative_binomial
> >
> > multisample_op.cc
> > --
> > sample_uniform
> > sample_normal
> > sample_gamma
> > sample_exponential
> > sample_poisson
> > sample_negative_binomial
> > sample_generalized_negative_binomial
> >
> > The only difference that I can glean from the documentation is that
> > operators in multisample_op.cc perform concurrent sampling from multiple
> > distributions, but the behavior of the operators is not different.
> >
> > Is sample_op.cc being retained for legacy reasons or backward
> > compatibility? Can it be deprecated or EOLed? Correct me if I am wrong
> > here.
> >
> >
> > Thanks
> >
> > Anirudh
> >
>


Re: Publish MXNet images to DockerHub

2018-07-24 Thread Anirudh Acharya
Yes, that would be good. Also, I just noticed that on the installation
instructions page only Python has docker image installation instructions
here -
http://mxnet.incubator.apache.org/install/index.html?platform=Linux&language=Python&processor=CPU
Similar instructions need to be there for the other bindings too.

On Tue, Jul 24, 2018 at 4:04 PM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> I was actually interested in pushing a version of MXNet with TensorRT
> enabled some time in the next few weeks just so that people can experiment
> with the feature without worrying about installing the right protoc and
> onnx versions.  If people here think it's a good idea I can open a PR with
> a runtime-docker folder with the intent that this work could be a template
> for others who want to contribute runtime Dockerfiles?  If a few
> contributors do put together an Dockerfile with TensorRT enabled, would it
> be possible to get that image pushed to the MXNet Dockerhub repo by a
> committer?
>
> On Sun, Jul 22, 2018 at 3:57 PM Anirudh Acharya 
> wrote:
>
> > @Naveen No, I meant in general, for all bindings. Irrespective of whether
> > we use a package management repository, being able to pull an image from
> > docker hub would be convenient for anyone wanting to get started on MXNet
> > or run services( as Kellen said).
> >
> >
> > On Sun, Jul 22, 2018 at 11:20 AM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > I think it's a good idea Anirudh.  It should help users easily get
> MXNet
> > up
> > > and running whether they're running services, following tutorials, etc.
> > >
> > > On Sun, Jul 22, 2018 at 8:10 AM Naveen Swamy 
> wrote:
> > >
> > > > I don't think we need for JVM languages, they have a good dependency
> > > > management through Maven Central. We weren't publishing regularly to
> > > Maven,
> > > > now we do.
> > > >
> > > > Anirudh, I am guessing you are interested in docker for the R language. If
> > the R
> > > > packages were published to CRAN do you still see a need for docker ?
> > > Could
> > > > you elaborate how this would be helpful and easy if they were to use
> > > other
> > > > packages in CRAN?
> > > >
> > > > On Sat, Jul 21, 2018 at 10:51 PM, Anirudh Acharya <
> > anirudhk...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Yes, correct, cu90 is indeed there; thanks for pointing it out.
> > > > >
> > > > > So the question is: should we publish to Docker Hub as part of
> the
> > > > > release process, so that bindings other than python are also
> published,
> > > and
> > > > > should there be a policy on what cuda versions we publish?
> > > > >
> > > > >
> > > > > Thanks
> > > > > Anirudh
> > > > >
> > > > > On Sat, Jul 21, 2018 at 9:56 PM Mu Li  wrote:
> > > > >
> > > > > > cu90 and cu90mkl are also available, see
> > > > > > https://hub.docker.com/r/mxnet/python/tags/
> > > > > >
> > > > > > On Sat, Jul 21, 2018 at 9:51 PM, Anirudh Acharya <
> > > > anirudhk...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > The python binding that is actively maintained is
> > > > > > >
> > > > > > > mxnet-mkl  1.2.1
> > > > > > >
> > > > > > >
> > > > > > > Other versions that use CUDA like mxnet-cu and
> > mxnet-cumkl
> > > > are
> > > > > > not
> > > > > > > actively maintained.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > -
> > > > > > >
> > > > > > > Anirudh
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sat, Jul 21, 2018 at 9:09 PM Mu Li 
> > wrote:
> > > > > > >
> > > > > > > > Surprisingly only the python binding is actively maintained.
> I
> > > > > remember
> > > > > > > we
> > > > > > > > can easily push all bindings into docker hub through the
> script
> > > in
> > > > > > > > https://github.com/apache/incubator-mxnet/tree/master/docker
> .
> > > > > > > >
> > > > > > > > On Sat, Jul 21, 2018 at 5:03 PM, Anirudh Acharya <
> > > > > > anirudhk...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > Docker Hub( https://hub.docker.com/u/mxnet/ ) currently
> > hosts
> > > > > images
> > > > > > > of
> > > > > > > > > MXNet and its various bindings but it is not actively
> > > maintained.
> > > > > > > Should
> > > > > > > > we
> > > > > > > > > publish MXNet images to Docker Hub as part of the release
> > > process
> > > > > and
> > > > > > > > > actively maintain it?
> > > > > > > > >
> > > > > > > > > The pros of publishing docker images would be ease of use
> and
> > > > > access
> > > > > > to
> > > > > > > > our
> > > > > > > > > users. Is this something that should be included as part of
> > the
> > > > > > release
> > > > > > > > > process? What does the community think?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > Anirudh Acharya
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Duplication of Operators for sampling from random distributions

2018-07-23 Thread Anirudh Acharya
Hi All,

I had earlier filed an issue with functionality-duplication/code-refactor
here - https://github.com/apache/incubator-mxnet/issues/11811

As per the suggestion in the github issue I would like to bring it to the
attention of the wider community -

The operators defined in sample_op.cc and multisample_op.cc are seemingly
performing the same tasks. Both these files define the following operators
respectively

sample_op.cc
---
random_uniform
random_normal
random_gamma
random_exponential
random_poisson
random_negative_binomial
random_generalized_negative_binomial

multisample_op.cc
--
sample_uniform
sample_normal
sample_gamma
sample_exponential
sample_poisson
sample_negative_binomial
sample_generalized_negative_binomial

The only difference that I can glean from the documentation is that
operators in multisample_op.cc perform concurrent sampling from multiple
distributions, but the behavior of the operators is not different.

Is sample_op.cc being retained for legacy reasons or backward
compatibility? Can it be deprecated or EOLed? Correct me if I am wrong here.


Thanks

Anirudh


Re: Publish MXNet images to DockerHub

2018-07-22 Thread Anirudh Acharya
@Naveen No, I meant in general, for all bindings. Irrespective of whether
we use a package management repository, being able to pull an image from
docker hub would be convenient for anyone wanting to get started on MXNet
or run services( as Kellen said).


On Sun, Jul 22, 2018 at 11:20 AM kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> I think it's a good idea Anirudh.  It should help users easily get MXNet up
> and running whether they're running services, following tutorials, etc.
>
> On Sun, Jul 22, 2018 at 8:10 AM Naveen Swamy  wrote:
>
> > I don't think we need for JVM languages, they have a good dependency
> > management through Maven Central. We weren't publishing regularly to
> Maven,
> > now we do.
> >
> > Anirudh, I am guessing you are interested in docker for the R language. If the R
> > packages were published to CRAN do you still see a need for docker ?
> Could
> > you elaborate how this would be helpful and easy if they were to use
> other
> > packages in CRAN?
> >
> > On Sat, Jul 21, 2018 at 10:51 PM, Anirudh Acharya  >
> > wrote:
> >
> > > Yes, correct, cu90 is indeed there; thanks for pointing it out.
> > >
> > > So the question is: should we publish to Docker Hub as part of the
> > > release process, so that bindings other than Python are also published,
> and
> > > should there be a policy on what CUDA versions we publish?
> > >
> > >
> > > Thanks
> > > Anirudh
> > >
> > > On Sat, Jul 21, 2018 at 9:56 PM Mu Li  wrote:
> > >
> > > > cu90 and cu90mkl are also available, see
> > > > https://hub.docker.com/r/mxnet/python/tags/
> > > >
> > > > On Sat, Jul 21, 2018 at 9:51 PM, Anirudh Acharya <
> > anirudhk...@gmail.com>
> > > > wrote:
> > > >
> > > > > The python binding that is actively maintained is
> > > > >
> > > > > mxnet-mkl  1.2.1
> > > > >
> > > > >
> > > > > Other versions that use CUDA like mxnet-cu and mxnet-cumkl
> > are
> > > > not
> > > > > actively maintained.
> > > > >
> > > > >
> > > > >
> > > > > -
> > > > >
> > > > > Anirudh
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Jul 21, 2018 at 9:09 PM Mu Li  wrote:
> > > > >
> > > > > > Surprisingly only the python binding is actively maintained. I
> > > remember
> > > > > we
> > > > > > can easily push all bindings into docker hub through the script
> in
> > > > > > https://github.com/apache/incubator-mxnet/tree/master/docker.
> > > > > >
> > > > > > On Sat, Jul 21, 2018 at 5:03 PM, Anirudh Acharya <
> > > > anirudhk...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Docker Hub( https://hub.docker.com/u/mxnet/ ) currently hosts
> > > images
> > > > > of
> > > > > > > MXNet and its various bindings but it is not actively
> maintained.
> > > > > Should
> > > > > > we
> > > > > > > publish MXNet images to Docker Hub as part of the release
> process
> > > and
> > > > > > > actively maintain it?
> > > > > > >
> > > > > > > The pros of publishing docker images would be ease of use and
> > > access
> > > > to
> > > > > > our
> > > > > > > users. Is this something that should be included as part of the
> > > > release
> > > > > > > process? What does the community think?
> > > > > > >
> > > > > > > Thanks
> > > > > > > Anirudh Acharya
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Publish MXNet images to DockerHub

2018-07-21 Thread Anirudh Acharya
Yes, correct, cu90 is indeed there; thanks for pointing it out.

So the question is: should we publish to Docker Hub as part of the release
process, so that bindings other than Python are also published, and should
there be a policy on what CUDA versions we publish?


Thanks
Anirudh

On Sat, Jul 21, 2018 at 9:56 PM Mu Li  wrote:

> cu90 and cu90mkl are also available, see
> https://hub.docker.com/r/mxnet/python/tags/
>
> On Sat, Jul 21, 2018 at 9:51 PM, Anirudh Acharya 
> wrote:
>
> > The python binding that is actively maintained is
> >
> > mxnet-mkl  1.2.1
> >
> >
> > Other versions that use CUDA like mxnet-cu and mxnet-cumkl are
> not
> > actively maintained.
> >
> >
> >
> > -
> >
> > Anirudh
> >
> >
> >
> >
> >
> > On Sat, Jul 21, 2018 at 9:09 PM Mu Li  wrote:
> >
> > > Surprisingly only the python binding is actively maintained. I remember
> > we
> > > can easily push all bindings into docker hub through the script in
> > > https://github.com/apache/incubator-mxnet/tree/master/docker.
> > >
> > > On Sat, Jul 21, 2018 at 5:03 PM, Anirudh Acharya <
> anirudhk...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Docker Hub( https://hub.docker.com/u/mxnet/ ) currently hosts images
> > of
> > > > MXNet and its various bindings but it is not actively maintained.
> > Should
> > > we
> > > > publish MXNet images to Docker Hub as part of the release process and
> > > > actively maintain it?
> > > >
> > > > The pros of publishing docker images would be ease of use and access
> to
> > > our
> > > > users. Is this something that should be included as part of the
> release
> > > > process? What does the community think?
> > > >
> > > > Thanks
> > > > Anirudh Acharya
> > > >
> > >
> >
>


Re: Publish MXNet images to DockerHub

2018-07-21 Thread Anirudh Acharya
The python binding that is actively maintained is

mxnet-mkl  1.2.1


Other versions that use CUDA like mxnet-cu and mxnet-cumkl are not
actively maintained.



-

Anirudh





On Sat, Jul 21, 2018 at 9:09 PM Mu Li  wrote:

> Surprisingly only the python binding is actively maintained. I remember we
> can easily push all bindings into docker hub through the script in
> https://github.com/apache/incubator-mxnet/tree/master/docker.
>
> On Sat, Jul 21, 2018 at 5:03 PM, Anirudh Acharya 
> wrote:
>
> > Hi,
> >
> > Docker Hub( https://hub.docker.com/u/mxnet/ ) currently hosts images of
> > MXNet and its various bindings but it is not actively maintained. Should
> we
> > publish MXNet images to Docker Hub as part of the release process and
> > actively maintain it?
> >
> > The pros of publishing docker images would be ease of use and access to
> our
> > users. Is this something that should be included as part of the release
> > process? What does the community think?
> >
> > Thanks
> > Anirudh Acharya
> >
>


Publish MXNet images to DockerHub

2018-07-21 Thread Anirudh Acharya
Hi,

Docker Hub( https://hub.docker.com/u/mxnet/ ) currently hosts images of
MXNet and its various bindings but it is not actively maintained. Should we
publish MXNet images to Docker Hub as part of the release process and
actively maintain it?

The pros of publishing docker images would be ease of use and access to our
users. Is this something that should be included as part of the release
process? What does the community think?

Thanks
Anirudh Acharya


[ANNOUNCE] Apache MXNet (incubating) 1.2.1 Release

2018-07-20 Thread Anirudh Subramanian
Hello all,

The Apache MXNet (incubating) Community announces the availability of
Apache MXNet (incubating) 1.2.1!


Apache MXNet (incubating) is a deep learning framework designed for
both efficiency and flexibility. It allows you to mix symbolic and
imperative programming to maximize efficiency and productivity.

This release contains bug fixes and performance improvements.

A full list of the changes in this release can be found in the release
notes:
https://github.com/apache/incubator-mxnet/releases/tag/1.2.1


A Link to the Download is here:
https://www.apache.org/dyn/closer.cgi/incubator/mxnet/1.2.1


If you prefer to build from source and experiment with various
compile-time configuration options, use this link to get the
instructions:
http://mxnet.incubator.apache.org/install/index.html

Or You can download and play with MXNet easily using one of the options
below:
   1. The Pip package can be found here: https://pypi.python.org/pypi/mxnet
   2. The Docker Images can be found here:
https://hub.docker.com/r/mxnet/python/

Links to published scala packages in Maven:
https://mvnrepository.com/search?q=org.apache.mxnet
https://repository.apache.org/content/repositories/releases/org/apache/mxnet/


The release tag used for the 1.2.1 release is:
https://github.com/apache/incubator-mxnet/tree/1.2.1

Some more MXNet Resources:
   1. Issues: https://github.com/apache/incubator-mxnet/issues
   2. Wiki: https://cwiki.apache.org/confluence/display/MXNET
   3. Twitter: @ApacheMXNet

   4. YouTube: Apachemxnet channel

   5. Medium: https://medium.com/apache-mxnet

   6. Reddit: /r/mxnet




If you want to learn more about MXNet visit
http://mxnet.incubator.apache.org/

Finally, you are welcome to join and also invite your friends to the
dynamic and growing MXNet community by subscribing to
dev@mxnet.incubator.apache.org

Thanks!

Apache MXNet (incubating) Team
___


DISCLAIMER:

Apache MXNet (incubating) is an effort undergoing incubation at The
Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC.
Incubation is required of all newly accepted projects until a further
review indicates that the infrastructure, communications, and decision
making process have stabilized in a manner consistent with other
successful ASF projects. While incubation status is not necessarily a
reflection of the completeness or stability of the code, it does
indicate that the project has yet to be fully endorsed by the ASF.


Re: Release plan - MXNET 1.3

2018-07-19 Thread Anirudh Acharya
@sandeep krishnamurthy  the bug fixes in the
R-package are something we have just begun; there will not be anything
significant to announce before the v1.3 code freeze.

On Wed, Jul 18, 2018 at 10:08 PM Steffen Rochel 
wrote:

> To make it easier to find the discussions related to project proposals I
> added a column with a link to the thread on dev@ for most projects.
> I would appreciate it if the project owners could fill in the blanks and
> check that I got the right threads.
>
> Regards,
> Steffen
>
> On Wed, Jul 18, 2018 at 7:11 PM Roshani Nagmote  >
> wrote:
>
> > Hi Kellen,
> >
> > Sure. I will update the notes with the information.
> >
> > Thanks,
> > Roshani
> >
> > On Wed, Jul 18, 2018 at 3:01 PM kellen sunderland <
> > kellen.sunderl...@gmail.com> wrote:
> >
> > > Hey Roshani,
> > >
> > > Would you be able to add 'TensorRT Runtime Integration' to the list of
> > > upcoming features?  We'll target getting the release in and polished by
> > the
> > > 23rd.  Design proposal is here:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Runtime+Integration+with+TensorRT
> > > and the lead contributor is Marek Kolodziej.
> > >
> > > -Kellen
> > >
> > > On Wed, Jul 18, 2018 at 8:32 PM Roshani Nagmote <
> > roshaninagmo...@gmail.com
> > > >
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > I am starting the process to prepare for Apache MXNet (incubating)
> 1.3
> > > > Release. Please find project proposal draft for this release here:
> > > >
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release
> > > >
> > > >
> > > > Target feature freeze date is July 23rd. A release candidate will be
> > cut
> > > > around Monday, August 6th and voting will commence from then until
> > > > Thursday, August 9th. If you have any additional features in progress
> > and
> > > > would like to include it in this release, please make sure to comment
> > so
> > > I
> > > > can update the release notes.
> > > >
> > > > Feel free to add any other comments/suggestions.
> > > >
> > > > Thanks,
> > > > Roshani
> > > >
> > >
> >
>


Re: [VOTE] Subscribe dev@ to Github Activities

2018-07-17 Thread Anirudh Acharya
+1

On Tue, Jul 17, 2018 at 9:58 AM Anirudh  wrote:

> It's not foregoing transparency, since people can easily subscribe to the
> github activities individually. dev@ has been used till now for design
> discussions, other project discussions,
> votes, etc. After we subscribe dev@ to all activities, I am afraid dev@
> will
> be reduced to a forwarding mailbox, redundant for most purposes.
>
> Anirudh
>
> On Tue, Jul 17, 2018 at 9:26 AM, Sheng Zha  wrote:
>
> > Hi Anirudh,
> >
> > 1. You need exactly one filter to filter out all the github notifications
> > on PRs and issues: "from:notificati...@github.com", and you'd get your
> S/N
> > ratio back.
> > 2. Having the option to do design discussion on an issue or PR is
> actually
> > a good thing as many discussions are quite small and better accompanied
> by
> > code. If for some reason a merged design needs revisiting, there's still
> > the option of sending an email to dev@ and discuss about it.
> > 3. About votes, commit vote (and veto) can already happen on PR per past
> > agreement. The discussion for procedural vote IMO should be allowed to
> > happen on Github if it's development related. Procedural votes themselves
> > should and can still happen on dev@.
> >
> > About "you don't really have to do anything explicitly on the dev@
> list",
> > besides the above arguments, we don't send emails to dev@ just for the
> > purpose of sending it. On the other hand, since "whatever didn't happen
> on
> > dev list didn't happen", we'd need better arguments on why we'd choose to
> > forego the transparency.
> >
> > -sz
> >
> > On Tue, Jul 17, 2018 at 8:47 AM, Anirudh  wrote:
> >
> > > -1
> > >
> > > The low signal to noise ratio would mean that we may miss important
> > emails.
> > > Even with the different filters that we may setup for dev@, the emails
> > > would be too many to not miss the important ones. We would see more and
> > > more people starting a design discussion on an issue or PR. Because of
> > the
> > > low signal to noise ratio on the dev@ list, many may miss these
> > > discussions.
> > >
> > > Slowly, this would erode the purpose of the dev@ list as this means
> that
> > > you don't really have to do anything explicitly on the dev@ list.
> > > You can start a design discussion on a github issue. You can start a
> > > vote/discussion on a github issue.
> > >
> > > Anirudh
> > >
> > > On Mon, Jul 16, 2018 at 4:35 AM, Timur Shenkao 
> > wrote:
> > >
> > > > +1 if my vote can be taken into account
> > > >
> > > > On Mon, Jul 16, 2018 at 4:32 AM, Sheng Zha 
> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm starting a vote on subscribing dev@ to Github activities. See
> > > > previous
> > > > > discussion thread here
> > > > > <
> https://lists.apache.org/thread.html/3d883f6a3cbc8e81e810962e0c0fe7
> > > > > bfd01f0b78d3cb44034f566442@%3Cdev.mxnet.apache.org%3E>
> > > > > .
> > > > >
> > > > > The vote lasts for three days and ends on 7/18/2018 at 9pm pst.
> > > > >
> > > > > -sz
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Subscribe dev@ to Github Activities

2018-07-17 Thread Anirudh
It's not foregoing transparency, since people can easily subscribe to the
github activities individually. dev@ has been used till now for design
discussions, other project discussions,
votes, etc. After we subscribe dev@ to all activities, I am afraid dev@ will
be reduced to a forwarding mailbox, redundant for most purposes.

Anirudh

On Tue, Jul 17, 2018 at 9:26 AM, Sheng Zha  wrote:

> Hi Anirudh,
>
> 1. You need exactly one filter to filter out all the github notifications
> on PRs and issues: "from:notificati...@github.com", and you'd get your S/N
> ratio back.
> 2. Having the option to do design discussion on an issue or PR is actually
> a good thing as many discussions are quite small and better accompanied by
> code. If for some reason a merged design needs revisiting, there's still
> the option of sending an email to dev@ and discuss about it.
> 3. About votes, commit vote (and veto) can already happen on PR per past
> agreement. The discussion for procedural vote IMO should be allowed to
> happen on Github if it's development related. Procedural votes themselves
> should and can still happen on dev@.
>
> About "you don't really have to do anything explicitly on the dev@ list",
> besides the above arguments, we don't send emails to dev@ just for the
> purpose of sending it. On the other hand, since "whatever didn't happen on
> dev list didn't happen", we'd need better arguments on why we'd choose to
> forego the transparency.
>
> -sz
>
> On Tue, Jul 17, 2018 at 8:47 AM, Anirudh  wrote:
>
> > -1
> >
> > The low signal to noise ratio would mean that we may miss important
> emails.
> > Even with the different filters that we may setup for dev@, the emails
> > would be too many to not miss the important ones. We would see more and
> > more people starting a design discussion on an issue or PR. Because of
> the
> > low signal to noise ratio on the dev@ list, many may miss these
> > discussions.
> >
> > Slowly, this would erode the purpose of the dev@ list as this means that
> > you don't really have to do anything explicitly on the dev@ list.
> > You can start a design discussion on a github issue. You can start a
> > vote/discussion on a github issue.
> >
> > Anirudh
> >
> > On Mon, Jul 16, 2018 at 4:35 AM, Timur Shenkao 
> wrote:
> >
> > > +1 if my vote can be taken into account
> > >
> > > On Mon, Jul 16, 2018 at 4:32 AM, Sheng Zha  wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm starting a vote on subscribing dev@ to Github activities. See
> > > previous
> > > > discussion thread here
> > > > <https://lists.apache.org/thread.html/3d883f6a3cbc8e81e810962e0c0fe7
> > > > bfd01f0b78d3cb44034f566442@%3Cdev.mxnet.apache.org%3E>
> > > > .
> > > >
> > > > The vote lasts for three days and ends on 7/18/2018 at 9pm pst.
> > > >
> > > > -sz
> > > >
> > >
> >
>


Re: [DISCUSS] Subscribe dev@ to Github Activities?

2018-07-12 Thread Anirudh Acharya
Regarding concerns about signal and noise, I think we can get around that by
setting up the right kind of filters in the mail client.
What counts as signal versus noise can differ from person to person.


Regards,
Anirudh

On Thu, Jul 12, 2018 at 3:34 PM Haibin Lin  wrote:

> Agree. +1 for more transparency
>
> On Thu, Jul 12, 2018 at 3:27 PM, Zha, Sheng 
> wrote:
>
> > My intention is really just to bridge the gap between so much happening
> on
> > github v.s. "whatever didn't happen on dev list didn't happen".
> >
> > Also, since dev@ is intended to be an asynchronous way for community to
> > follow technical conversations, there wasn't really a requirement for
> > anyone to read all of them in the first place.
> >
> > Best regards,
> > -sz
> >
> > On 7/12/18, 3:20 PM, "Timur Shenkao"  wrote:
> >
> > Flink - yes
> > Spark - it was previously but not now
> >
> > Yeah, amount of messages would be tripled at least: Jira + Github
> > issue + PR
> >
> > On Thu, Jul 12, 2018 at 11:13 PM, Haibin Lin <
> haibin.lin@gmail.com
> > >
> > wrote:
> >
> > > I'm a bit concerned with the amount of emails flooding in. In the
> > past week
> > > there're 32 new issues and 35 new pull requests. This means on avg
> > 10 email
> > > per day and I doubt I'll read all of them.. Does the Spark
> community
> > > subscribe dev@ to github?
> > >
> > > Best,
> > > Haibin
> > >
> > > On Thu, Jul 12, 2018 at 3:08 PM, Pedro Larroy <
> > > pedro.larroy.li...@gmail.com>
> > > wrote:
> > >
> > > > -1   It's a lot of traffic, whomever wants to subscribe can do it
> > in
> > > > github. I'm afraid it will decrease signal to noise ratio in the
> > list.
> > > >
> > > > On Thu, Jul 12, 2018 at 11:32 PM Lin Yuan 
> > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > On Thu, Jul 12, 2018 at 12:26 PM Anirudh Acharya <
> > > anirudhk...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > On Thu, Jul 12, 2018 at 11:51 AM Piyush Ghai <
> > ghai.piy...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > > > On Jul 12, 2018, at 11:50 AM, Tianqi Chen <
> > > > tqc...@cs.washington.edu>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > On Thu, Jul 12, 2018 at 11:10 AM, Sheng Zha <
> > szha@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > >> Hi all,
> > > > > > > >>
> > > > > > > >> Should we subscribe dev list to github updates on mxnet
> > repo?
> > > Both
> > > > > > > github
> > > > > > > >> issues/PRs and the dev list are intended for technical
> > > discussions
> > > > > and
> > > > > > > in
> > > > > > > >> that aspect largely share the same goal. Since MXNet has
> > most
> > > > > activity
> > > > > > > >> github, this could help dev@ to become more active.
> Some
> > pros
> > > and
> > > > > > cons:
> > > > > > > >>
> > > > > > > >> Pros:
> > > > > > > >> - There have been many high quality discussions that
> > happen on
> > > > > github
> > > > > > to
> > > > > > > >> which the dev list can benefit.
> > > > > > > >> - Replies on update emails are reflected on the specific
> > > issue/PR.
> > > > > > > >> - Users can also choose to click on the link and go to
> > github to
> > > > > > > >> participate in discussion.
> > > > > > > >> - We still have the ability to carry out dev@ only
> > > conversation.
> > > > > > > >>
> > > > > > > >> Cons:
> > > > > > > >> - Higher volume on dev list.
> > > > > > > >> - Some discussions might not be suitable for dev@.
> > (though I
> > > > can't
> > > > > > > think
> > > > > > > >> of
> > > > > > > >> why such conversation should happen on github either)
> > > > > > > >>
> > > > > > > >> -sz
> > > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
>


Re: [DISCUSS] Subscribe dev@ to Github Activities?

2018-07-12 Thread Anirudh Acharya
+1

On Thu, Jul 12, 2018 at 11:51 AM Piyush Ghai  wrote:

> +1
> > On Jul 12, 2018, at 11:50 AM, Tianqi Chen 
> wrote:
> >
> > +1
> >
> > On Thu, Jul 12, 2018 at 11:10 AM, Sheng Zha  wrote:
> >
> >> Hi all,
> >>
> >> Should we subscribe dev list to github updates on mxnet repo? Both
> github
> >> issues/PRs and the dev list are intended for technical discussions and
> in
> >> that aspect largely share the same goal. Since MXNet has most activity
> >> github, this could help dev@ to become more active. Some pros and cons:
> >>
> >> Pros:
> >> - There have been many high quality discussions that happen on github to
> >> which the dev list can benefit.
> >> - Replies on update emails are reflected on the specific issue/PR.
> >> - Users can also choose to click on the link and go to github to
> >> participate in discussion.
> >> - We still have the ability to carry out dev@ only conversation.
> >>
> >> Cons:
> >> - Higher volume on dev list.
> >> - Some discussions might not be suitable for dev@. (though I can't
> think
> >> of
> >> why such conversation should happen on github either)
> >>
> >> -sz
> >>
>
>


Re: C++ api issue labeling

2018-07-10 Thread Anirudh Acharya
There is another instance of label duplication: we have the labels "Feature" (
https://github.com/apache/incubator-mxnet/labels/Feature ) and "Feature
Request" (
https://github.com/apache/incubator-mxnet/labels/Feature%20request ). I
don't think there is much difference between these two labels.

It would make sense to merge the "Feature" label into "Feature Request".


Thanks
Anirudh


On Wed, Jun 27, 2018 at 3:50 PM Hagay Lupesko  wrote:

> Thank you everyone for your suggestions.
> I will work with a committer to get this updated ASAP.
>
> On Mon, Jun 25, 2018 at 8:55 AM Marco de Abreu
>  wrote:
>
> > +1 to renaming to Backend
> >
> > On Mon, Jun 25, 2018 at 10:13 AM Hagay Lupesko 
> wrote:
> >
> > > Thanks Lin for your feedback.
> > > Bumping again to get more feedback before concluding.
> > >
> > > On Fri, Jun 22, 2018 at 8:53 AM Lin Yuan  wrote:
> > >
> > > > I agree with Hagay. Using "Backend" as label makes it much easier to
> > > track.
> > > >  "C++" label only describes the language used in implementation,
> > > "Backend"
> > > > better describes the nature of the work (let's assume we change the
> > > backend
> > > > implementation from C++ to other languages in the future).
> > > >
> > > > Lin
> > > >
> > > > On Fri, Jun 22, 2018 at 1:09 AM Hagay Lupesko 
> > wrote:
> > > >
> > > > > Thanks everyone for chiming in and clarifying.
> > > > > It seems that the "C++" label name is confusing for our community
> > since
> > > > it
> > > > > can be interpreted as both the CPP API and the backend...
> > > > > As an anecdote, this issue [1
> > > > > <https://github.com/apache/incubator-mxnet/issues/10937>] is
> labeled
> > > as
> > > > > "C++" but is about the CPP API, not the backend.
> > > > >
> > > > > Should we just rename "C++" to "Backend" to avoid confusion?
> > > > >
> > > > > [1] https://github.com/apache/incubator-mxnet/issues/10937
> > > > >
> > > > > On Thu, Jun 21, 2018 at 12:39 PM Pedro Larroy <
> > > > > pedro.larroy.li...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Agree with Anirudh, they are different things. Maybe change the
> > "C++"
> > > > > label
> > > > > > to "backend" would be more informative?
> > > > > >
> > > > > > On Thu, Jun 21, 2018 at 12:11 PM Anirudh 
> > > > wrote:
> > > > > >
> > > > > > > Hi Hagay,
> > > > > > >
> > > > > > > I think we should keep these two labels seperate since they
> mean
> > > > > > different
> > > > > > > things.
> > > > > > > The C++ label refers to the issue for MXNet backend and the CPP
> > > > package
> > > > > > > refers to the CPP language binding for mxnet.
> > > > > > > We can still make C++ API great again irrespective by filtering
> > out
> > > > CPP
> > > > > > > package issues :).
> > > > > > >
> > > > > > > Anirudh
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jun 21, 2018 at 11:56 AM, Hagay Lupesko <
> > lupe...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hey community,
> > > > > > > >
> > > > > > > > I was going over the open GitHub issues for MXNet, and
> noticed
> > > that
> > > > > we
> > > > > > > have
> > > > > > > > two labels for the CPP API: "CPP package", "C++"
> > > > > > > >
> > > > > > > > Wanted to suggest we remove "CPP package" and just stick to
> > "C++"
> > > > > > > > This will make it easier for the community to classify issues
> > and
> > > > > focus
> > > > > > > on
> > > > > > > > making the C++ API great again ;)
> > > > > > > >
> > > > > > > > Let me know if someone has any concerns, otherwise I will
> find
> > a
> > > > > > > committer
> > > > > > > > that I can work with to make this change.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > > Hagay
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Regarding 1.2.1 Release

2018-07-03 Thread Anirudh
Hi Sergio and MXNet community,


I wanted to summarize the reasons for requesting a new RC of 1.2.1:


The warning message “save_params is deprecated, please use save_parameters
instead” was confusing because the new API ‘save_parameters’, which
deprecated ‘save_params’, could be used in two contexts, and one of them
would break when used.

1. the new API save_parameters was intended only for imperative gluon
blocks.

2. When save_parameters is used with SymbolBlocks, it would break user code
that later loads the model and parameters.

This was raised on the dev@ list:

https://lists.apache.org/thread.html/8c6fa3ef5a7fa5a90b46293f22da955df3455e80046895dc67dbc8c6@%3Cdev.mxnet.apache.org%3E
and discussed offline; we realized the impact would lead to a bad user
experience and frustration.

This PR fixes the warning message:
https://github.com/apache/incubator-mxnet/pull/11532
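
As a rough illustration (this is not the code from the PR above; the function
bodies and wording are hypothetical stand-ins), the pattern under discussion
is a deprecation shim whose warning names both migration paths instead of a
bare "use save_parameters instead":

```python
import warnings

def save_parameters(filename):
    """Stand-in for the new API: saves parameters only (hypothetical body)."""
    print(f"parameters saved to {filename}")

def save_params(filename):
    """Deprecated shim that forwards to the new API."""
    # A bare "save_params is deprecated" message proved confusing; the
    # improved warning spells out which replacement fits which use case.
    warnings.warn(
        "save_params is deprecated. Use save_parameters if you load the "
        "parameters back into a Gluon block, or export() if you intend to "
        "load into a SymbolBlock. See the 1.2.1 release notes for details.",
        DeprecationWarning,
        stacklevel=2,
    )
    save_parameters(filename)

save_params("model.params")  # still saves, but emits the detailed warning
```

The design point is that the deprecated name keeps working while the warning
text carries enough context for a user to pick the right replacement without
reading the release notes first.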


What could have been done better?


Voting and initial discussions on the issues in 1.2.1 rc0 happened on the
dev@ list. However, further discussion and the decision to cut a new RC of
the release happened in person with the committers who raised concerns. We
should have either continued the discussion on dev@ or summarized it and
sought more opinions from the community on dev@.


I ask mentors and community members to suggest areas of improvement we can
incorporate in such situations to minimize the time spent by the community.


Thanks,


Anirudh



On Mon, Jul 2, 2018 at 8:21 PM, Sergio Fernández  wrote:

> Besides that I can't agree with the arguments about the warning at
> 1.2.1-RC1-incubating, but I guess I haven't much to say. Remember that "if
> it didn't happen on a mailing list, it didn't happen".
>
> On Mon, Jul 2, 2018, 17:37 Anirudh  wrote:
>
> > Hi all,
> >
> > After an offline discussion, the current decision is to block the 1.2.1
> > release, improve the warning message for save_params usage here:
> > https://github.com/apache/incubator-mxnet/pull/11532 ,
> > cut a new RC and then restart the voting process.
> >
> > Anirudh
> >
>


Re: Release process for R

2018-07-03 Thread Anirudh Acharya
Hi Marco,

A release process for the R package would probably involve publishing the
package to CRAN. Currently this is not done, and the R package would need
considerable changes and improvements before it could be published to CRAN.

So at present, accessing the package from the S3 bucket is the surest way to
use the R API.


Thanks
Anirudh


On Tue, Jul 3, 2018 at 2:48 AM Marco de Abreu
 wrote:

> Hello,
>
> do we have a release process for our R frontend? I noticed the issue at [1]
> and it seems like we're only publishing to an S3 bucket which is not under
> Apache. Is there another channel for our users to retrieve that package or
> is this our only supported official way?
>
> Best regards,
> Marco
>
> [1] https://github.com/apache/incubator-mxnet/issues/10791
>


Regarding 1.2.1 Release

2018-07-02 Thread Anirudh
Hi all,

After an offline discussion, the current decision is to block the 1.2.1
release, improve the warning message for save_params usage here:
https://github.com/apache/incubator-mxnet/pull/11532 ,
cut a new RC and then restart the voting process.

Anirudh


Re: [VOTE] Release MXNet version 1.2.1.RC0 (Patch Release)

2018-06-28 Thread Anirudh
Hi,

I have opened a PR for adding new content to 1.2.1 release notes:
https://github.com/apache/incubator-mxnet/pull/11478
Please help review. Once this PR is approved I will be cutting the release.

Thanks,

Anirudh

On Tue, Jun 26, 2018 at 7:10 PM, Chris Olivier 
wrote:

> +1
>
> On Tue, Jun 26, 2018 at 5:50 PM Anirudh  wrote:
>
> > Hi all,
> >
> > The current warning message for save_params: "save_params is deprecated,
> > use save_parameters instead" is misleading
> > for users who may use the API to load into SymbolBlock. To make it
> clearer
> > for all users, a better warning message is to include alternative to
> > save_params and reference to detailed documentation.
> > The message improvement is important, but not blocking to move forward to
> > complete the voting process for MXNet v1.2.1 patch release.
> > We plan to follow up with a 1.2.2 patch release to improve the message
> and
> > potentially include other bug fixes.
> > Please let me know if you have any thoughts, questions or suggestions.
> >
> >
> > Anirudh
> >
> > On Mon, Jun 25, 2018 at 10:44 PM, Anirudh  wrote:
> >
> > >
> > > Hi Mu,
> > >
> > > Thanks for bringing this up and hopefully this should answer Sheng's
> > > question.
> > > Thomas pointed out something similar in the PR here for the warning
> > > message which I didn't notice back then:
> > > https://github.com/apache/incubator-mxnet/pull/11127
> > >
> > > Not sure about the reasoning to not add it and if there was an offline
> > > discussion about this between Thomas and Erik.
> > > It would be nice if you guys could pitch in if there were any strong
> > > reasons.
> > >
> > > I understand that a more informed warning when using save_params would
> > > really avoid some customer frustration.
> > > Having said that, I am a little worried about the timeline though since
> > > some customers are eagerly waiting for the release of 1.2.1.
> > > Another RC would delay it by at-least one and a half weeks.
> > >
> > > Anirudh
> > >
> > > On Mon, Jun 25, 2018 at 9:54 PM, Mu Li  wrote:
> > >
> > >> Detailed documents should help, but the current warning message that
> > >> "save_params is deprecated, use save_parameters instead" is not
> > sufficient
> > >> enough.
> > >>
> > >> Some details about the API changes:
> > >>
> > >> 1. v1.2 changed the implementation of "save_params", which is
> explained
> > in
> > >> the release note. The main benefit for this change is that we don't
> need
> > >> to
> > >> create layers within a name scope. [1]
> > >> 2. we found this change breaks a gluon-to-symbol usage, even though we
> > >> recommended users to use "export" for this usage. [2]
> > >> 3. for some good reasons we made a decision to revert save_params in
> > >> v1.2.1, and introduced a new API called save_parameters for this new
> > >> behavior. [3]
> > >>
> > >> Since calling save_params each time will generate a warning message,
> > it's
> > >> a
> > >> major API change. The recommended for users to update their codes are:
> > >>
> > >> 1. If you save parameters to load back into a SymbolBlock, you can use
> > >> export instead, though keeping it will not break your codes except
> for a
> > >> warning message. (But it will break in v1.2)
> > >> 2. If you create gluon layers without a name scope, you must replace
> > >> save_params with save_parameters. Otherwise, your model cannot be
> loaded
> > >> back in v1.2.1 (though it works in v1.2)
> > >> 3. For the rest case, such as models are created within a name scope,
> > and
> > >> the models are loaded into gluon (not symbolblock) later, recommend
> > >> replacing save_params with save_parameteres. If you don't do it,
> nothing
> > >> will break in v1.2 and v1.2.1, but v1.2.1 will give you a warning
> > message.
> > >>
> > >> This API changes in v1.2 and v1.2.1 are pretty tricky. Anirudh did a
> > great
> > >> job in capturing them in release notes. But I feel it's hard for users
> > to
> > >> understand the impacts. I suggest to improve the warning message to
> "use
> > >> export if you want to load into SymbolBlock, otherwise use
> > >> save_par

Re: [VOTE] Release MXNet version 1.2.1.RC0 (Patch Release)

2018-06-26 Thread Anirudh
Hi all,

The current warning message for save_params, "save_params is deprecated,
use save_parameters instead", is misleading for users who use the API to
load into SymbolBlock. To make it clearer for all users, a better warning
message would include the alternative to save_params and a reference to
detailed documentation.
The message improvement is important, but it should not block completing the
voting process for the MXNet v1.2.1 patch release.
We plan to follow up with a 1.2.2 patch release to improve the message and
potentially include other bug fixes.
Please let me know if you have any thoughts, questions, or suggestions.


Anirudh

On Mon, Jun 25, 2018 at 10:44 PM, Anirudh  wrote:

>
> Hi Mu,
>
> Thanks for bringing this up and hopefully this should answer Sheng's
> question.
> Thomas pointed out something similar in the PR here for the warning
> message which I didn't notice back then:
> https://github.com/apache/incubator-mxnet/pull/11127
>
> Not sure about the reasoning to not add it and if there was an offline
> discussion about this between Thomas and Erik.
> It would be nice if you guys could pitch in if there were any strong
> reasons.
>
> I understand that a more informed warning when using save_params would
> really avoid some customer frustration.
> Having said that, I am a little worried about the timeline though since
> some customers are eagerly waiting for the release of 1.2.1.
> Another RC would delay it by at-least one and a half weeks.
>
> Anirudh
>
> On Mon, Jun 25, 2018 at 9:54 PM, Mu Li  wrote:
>
>> Detailed documents should help, but the current warning message that
>> "save_params is deprecated, use save_parameters instead" is not sufficient
>> enough.
>>
>> Some details about the API changes:
>>
>> 1. v1.2 changed the implementation of "save_params", which is explained in
>> the release note. The main benefit for this change is that we don't need
>> to
>> create layers within a name scope. [1]
>> 2. we found this change breaks a gluon-to-symbol usage, even though we
>> recommended users to use "export" for this usage. [2]
>> 3. for some good reasons we made a decision to revert save_params in
>> v1.2.1, and introduced a new API called save_parameters for this new
>> behavior. [3]
>>
>> Since calling save_params each time will generate a warning message, it's
>> a
>> major API change. The recommended for users to update their codes are:
>>
>> 1. If you save parameters to load back into a SymbolBlock, you can use
>> export instead, though keeping it will not break your codes except for a
>> warning message. (But it will break in v1.2)
>> 2. If you create gluon layers without a name scope, you must replace
>> save_params with save_parameters. Otherwise, your model cannot be loaded
>> back in v1.2.1 (though it works in v1.2)
>> 3. For the rest case, such as models are created within a name scope, and
>> the models are loaded into gluon (not symbolblock) later, recommend
>> replacing save_params with save_parameteres. If you don't do it, nothing
>> will break in v1.2 and v1.2.1, but v1.2.1 will give you a warning message.
>>
>> This API changes in v1.2 and v1.2.1 are pretty tricky. Anirudh did a great
>> job in capturing them in release notes. But I feel it's hard for users to
>> understand the impacts. I suggest to improve the warning message to "use
>> export if you want to load into SymbolBlock, otherwise use
>> save_parameters.
>> For more details, refer to this URL".
>>
>> [1] https://github.com/apache/incubator-mxnet/releases/tag/1.2.0
>> [2] https://github.com/apache/incubator-mxnet/issues/11091
>> [3] https://github.com/apache/incubator-mxnet/pull/11127
>>
>> On Mon, Jun 25, 2018 at 9:23 PM, Sheng Zha  wrote:
>>
>> > Wouldn’t this break users who are on 1.2.0 and used our API correctly?
>> Why
>> > do we have to revert load_params, given that it’s backward compatible?
>> >
>> > -sz
>> >
>> > > On Jun 25, 2018, at 6:30 PM, Anirudh  wrote:
>> > >
>> > > Hi,
>> > >
>> > > 1.2.1 (load_params) is backward compatible with 1.1.0 not with 1.2.0.
>> > > It does not adhere exactly with semver but it had to be made, to
>> quickly
>> > > help our customers who were using the APIs incorrectly.
>> > >
>> > > Anirudh
>> > >
>> > >> On Mon, Jun 25, 2018 at 5:42 PM, Sheng Zha 
>> wrote:
>> > >>
>> > >> save_parameters didn't exist in 1.2.0 so its addition usually isn't
>

Re: [VOTE] Release MXNet version 1.2.1.RC0 (Patch Release)

2018-06-25 Thread Anirudh
Hi Mu,

The warning currently printed is "save_params is deprecated. Please use
save_parameters."
Isn't this similar to what you are suggesting?

Anirudh

On Mon, Jun 25, 2018 at 3:47 PM, Mu Li  wrote:

> v1.2.1 will print a deprecating warning message when calling
> save_params. We should tell users clearly to replace "save_params" with
> "save_parameters" or something else.
>
> On Mon, Jun 18, 2018 at 6:52 PM, Anirudh  wrote:
>
> > Hi,
> >
> > This is the vote to release Apache MXNet (incubating) version 1.2.1.
> Voting
> > will start now and close Thursday June 21st 7:00 PM PDT.
> >
> > Link to release candidate 1.2.1.rc0:
> >
> > https://github.com/apache/incubator-mxnet/releases/tag/1.2.1.rc0
> >
> > View this page for installation instructions:
> >
> > https://mxnet.incubator.apache.org/install/index.html
> >
> > (Note: The README.md points to the 1.2.1 tag and does not work at the
> > moment).
> >
> > Please remember to test first before voting accordingly.
> >
> > +1 = approve
> > +0 = no opinion
> > -1 = disapprove (provide reason)
> >
> > Anirudh
> >
>


Re: [RELEASE][VOTE] Release MXNet version 1.2.1.RC0

2018-06-22 Thread Anirudh
Does "PMC" on this page mean IPMC:
https://www.apache.org/foundation/voting.html#ReleaseVotes ?
Also, does this mean we need three IPMC votes to pass this release on the
dev list?

Anirudh

On Fri, Jun 22, 2018 at 9:15 PM, Sergio Fernández  wrote:

> Just wanted to refresh what
> https://incubator.apache.org/guides/ppmc.html#ppmc_and_binding_votes says:
> "The only time when a PPMC member’s vote is binding is for the addition of
> new PPMC members and committers. Release votes are only binding to IPMC
> members.".
>
> So it's incorrect to mark as binding those votes at the RESULT email.
>
>
> On Fri, Jun 22, 2018, 17:38 Chris Olivier  wrote:
>
> > what do you mean? just curious.
> >
> > On Fri, Jun 22, 2018 at 4:44 PM Sergio Fernández 
> > wrote:
> >
> > > Please, notice PPMC votes are not binding.
> > >
> > > On Fri, Jun 22, 2018, 09:35 Anirudh  wrote:
> > >
> > > > Hi all,
> > > >
> > > > Apologies for replying instead of sending out a new email.
> > > >
> > > > This vote has passed with 6 +1s:
> > > >
> > > > Binding:
> > > > Sandeep
> > > > Haibin
> > > > Indhu
> > > >
> > > > Non Binding:
> > > > Carin
> > > > Pedro
> > > > Lai
> > > >
> > > > I will proceed with the vote on general@.
> > > >
> > > > Thanks,
> > > > Anirudh
> > > >
> > >
> >
>


[RELEASE][VOTE] Release MXNet version 1.2.1.RC0

2018-06-22 Thread Anirudh
Hi all,

Apologies for replying instead of sending out a new email.

This vote has passed with 6 +1s:

Binding:
Sandeep
Haibin
Indhu

Non Binding:
Carin
Pedro
Lai

I will proceed with the vote on general@.

Thanks,
Anirudh


Re: [VOTE] Release MXNet version 1.2.1.RC0 (Patch Release)

2018-06-22 Thread Anirudh
Hi all,

Thanks a lot for checking the release. This vote has passed with:

6 +1s

Binding:
Sandeep
Haibin
Indhu

Non Binding:
Carin
Pedro
Lai

Anirudh


On Fri, Jun 22, 2018 at 8:00 AM, sandeep krishnamurthy <
sandeep.krishn...@gmail.com> wrote:

> +1
>
> Lai (https://github.com/roywei) and myself tested with Keras-MXNet for CNN
> and RNN standard use cases, things are working as expected on CPU and GPU.
>
> Best,
> Sandeep
>
> On Thu, Jun 21, 2018 at 11:06 PM Anirudh  wrote:
>
> > Hi all,
> >
> > Thanks for checking the release. We need one more binding +1. I would
> > request committers to help out here so that we can get the vote started
> on
> > general@.
> >
> > Anirudh
> >
> > On Thu, Jun 21, 2018 at 3:06 PM, Pedro Larroy <
> > pedro.larroy.li...@gmail.com>
> > wrote:
> >
> > > I already changed my vote to +1   I don't think this a big issue, just
> a
> > > pixel difference of 1 when loading an image.
> > >
> > >
> > >
> > > On Thu, Jun 21, 2018 at 2:20 PM Marco de Abreu
> > >  wrote:
> > >
> > > > We had 868 successful runs and no failures for that test so far,
> Pedro.
> > > >
> > > > On Thu, Jun 21, 2018 at 11:17 PM Pedro Larroy <
> > > > pedro.larroy.li...@gmail.com>
> > > > wrote:
> > > >
> > > > > Observed just one failure in OSX in test_imdecode:
> > > > >
> > > > >
> > > > >
> > ==
> > > > > FAIL: test_imdecode (test_image.TestImage)
> > > > >
> > --
> > > > > Traceback (most recent call last):
> > > > >   File
> > > > >
> > > > >
> > > > "/Users/pllarroy/devel/mxnet/mxnet_release/tests/python/
> > > unittest/test_image.py",
> > > > > line 94, in test_imdecode
> > > > > assert_almost_equal(image.asnumpy(), cv_image)
> > > > >   File
> > > > > "/Users/pllarroy/devel/mxnet/mxnet_release/python/mxnet/
> > > test_utils.py",
> > > > > line 493, in assert_almost_equal
> > > > > raise AssertionError(msg)
> > > > > AssertionError:
> > > > > Items are not equal:
> > > > > Error 980.392157 exceeds tolerance rtol=0.10, atol=0.00.
> > > > Location
> > > > > of maximum error:(100, 457, 1), a=103.00, b=102.00
> > > > >  a: array([[[  0,  10,  20],
> > > > > [  0,  11,  19],
> > > > > [  2,  12,  19],...
> > > > >  b: array([[[  0,  10,  20],
> > > > > [  0,  11,  19],
> > > > > [  2,  12,  19],...
> > > > >
> > > > >
> > --
> > > > > Ran 505 tests in 2018.430s
> > > > >
> > > > >
> > > > > Is this one known?
> > > > >
> > > > > The other unit tests passed.
> > > > >
> > > > > On Thu, Jun 21, 2018 at 2:09 PM Haibin Lin <
> haibin.lin@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Built from source with CUDA on Ubuntu.
> > > > > >
> > > > > > Ran example/gluon/word_language_model/train.py
> > > > > >
> > > > > > Best,
> > > > > > Haibin
> > > > > >
> > > > > >
> > > > > > On Thu, Jun 21, 2018 at 11:08 AM, Anirudh  >
> > > > wrote:
> > > > > >
> > > > > > > Hi Pedro,
> > > > > > >
> > > > > > > I think you raised this issue in 1.2.0 release here:
> > > > > > > https://lists.apache.org/thread.html/
> > > ddc088a21aac179144350ea97353a7
> > > > > > > ea885b2765ccb98db08a03ba2d@%3Cdev.mxnet.apache.org%3E
> > > > > > > .
> > > > > > > I actually forgot about this issue during this release. Having
> > said
> > > > > > that, I
> > > > > > > think since this works with make and the customers using cmake
> > with
> > > > > > > USE_OPENM

Re: [VOTE] Release MXNet version 1.2.1.RC0 (Patch Release)

2018-06-22 Thread Anirudh
Hi all,

Thanks for checking the release. We need one more binding +1. I would
request committers to help out here so that we can get the vote started on
general@.

Anirudh

On Thu, Jun 21, 2018 at 3:06 PM, Pedro Larroy 
wrote:

> I already changed my vote to +1   I don't think this a big issue, just a
> pixel difference of 1 when loading an image.
>
>
>
> On Thu, Jun 21, 2018 at 2:20 PM Marco de Abreu
>  wrote:
>
> > We had 868 successful runs and no failures for that test so far, Pedro.
> >
> > On Thu, Jun 21, 2018 at 11:17 PM Pedro Larroy <
> > pedro.larroy.li...@gmail.com>
> > wrote:
> >
> > > Observed just one failure in OSX in test_imdecode:
> > >
> > >
> > > ==
> > > FAIL: test_imdecode (test_image.TestImage)
> > > --
> > > Traceback (most recent call last):
> > >   File
> > >
> > >
> > "/Users/pllarroy/devel/mxnet/mxnet_release/tests/python/
> unittest/test_image.py",
> > > line 94, in test_imdecode
> > > assert_almost_equal(image.asnumpy(), cv_image)
> > >   File
> > > "/Users/pllarroy/devel/mxnet/mxnet_release/python/mxnet/
> test_utils.py",
> > > line 493, in assert_almost_equal
> > > raise AssertionError(msg)
> > > AssertionError:
> > > Items are not equal:
> > > Error 980.392157 exceeds tolerance rtol=0.10, atol=0.00.
> > Location
> > > of maximum error:(100, 457, 1), a=103.00, b=102.00
> > >  a: array([[[  0,  10,  20],
> > > [  0,  11,  19],
> > > [  2,  12,  19],...
> > >  b: array([[[  0,  10,  20],
> > > [  0,  11,  19],
> > > [  2,  12,  19],...
> > >
> > > --
> > > Ran 505 tests in 2018.430s
> > >
> > >
> > > Is this one known?
> > >
> > > The other unit tests passed.
> > >
> > > On Thu, Jun 21, 2018 at 2:09 PM Haibin Lin 
> > > wrote:
> > >
> > > > +1
> > > >
> > > > Built from source with CUDA on Ubuntu.
> > > >
> > > > Ran example/gluon/word_language_model/train.py
> > > >
> > > > Best,
> > > > Haibin
> > > >
> > > >
> > > > On Thu, Jun 21, 2018 at 11:08 AM, Anirudh 
> > wrote:
> > > >
> > > > > Hi Pedro,
> > > > >
> > > > > I think you raised this issue in 1.2.0 release here:
> > > > > https://lists.apache.org/thread.html/
> ddc088a21aac179144350ea97353a7
> > > > > ea885b2765ccb98db08a03ba2d@%3Cdev.mxnet.apache.org%3E
> > > > > .
> > > > > I actually forgot about this issue during this release. Having said
> > > > that, I
> > > > > think since this works with make and the customers using cmake with
> > > > > USE_OPENMP=OFF should be considerably small we should not block the
> > > > release
> > > > > for this.
> > > > > The main reason we are doing this release is for this issue
> > > > > <https://github.com/apache/incubator-mxnet/issues/11091> . Now
> > pulling
> > > > > this
> > > > > change for the cmake fix would be also mean we need to pull 8 more
> > > > commits
> > > > > from dmlc-core and its considerable risk to introduce for the patch
> > > > > release.
> > > > > This would also mean cutting another rc. I think in the interest of
> > our
> > > > > customers who are eagerly waiting for the patch release to fix the
> > main
> > > > > issue, we should move ahead here.
> > > > > I missed reviewing all the known issue of 1.2.0 and add it to 1.2.1
> > > > release
> > > > > notes. I will do that now.
> > > > >
> > > > > Anirudh
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jun 21, 2018 at 10:42 AM, Pedro Larroy <
> > > > > pedro.larroy.li...@gmail.com
> > > > > > wrote:
> > > > >
> > > > > > I think I have fixed this before, I will check if the patch
> didn't
> > > make
> > > > > it
> > > > > > to the branch.
> > > &

Re: [VOTE] Release MXNet version 1.2.1.RC0 (Patch Release)

2018-06-21 Thread Anirudh
That looks like a flaky test caused by atol being too small. Please force
explicit rtol and atol values and see if you are still able to reproduce it.
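
As an illustration of forcing explicit tolerances (using numpy.testing
directly, not MXNet's assert_almost_equal helper, and with made-up data
mimicking the off-by-one pixel in the report above):

```python
import numpy as np

# Mimic the test_imdecode report: two decoded images differing by exactly 1
# at one pixel (a=103, b=102) and identical elsewhere, including zeros.
a = np.array([[0.0, 10.0, 20.0], [103.0, 11.0, 19.0]])
b = np.array([[0.0, 10.0, 20.0], [102.0, 11.0, 19.0]])

# assert_allclose checks elementwise: |actual - desired| <= atol + rtol * |desired|.
# An explicit atol of 1 absorbs an off-by-one decode difference regardless of
# pixel magnitude, including near-zero pixels where rtol alone does nothing.
np.testing.assert_allclose(a, b, rtol=0, atol=1.0)  # passes

# A tighter absolute tolerance reproduces the failure deterministically.
try:
    np.testing.assert_allclose(a, b, rtol=0, atol=0.5)
except AssertionError:
    print("off-by-one pixel exceeds atol=0.5")
```

Pinning rtol and atol this way turns a tolerance-dependent flake into a
deterministic pass or fail, which is what the suggestion above is after.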

On Thu, Jun 21, 2018 at 2:16 PM, Pedro Larroy 
wrote:

> Observed just one failure in OSX in test_imdecode:
>
>
> ==
> FAIL: test_imdecode (test_image.TestImage)
> --
> Traceback (most recent call last):
>   File
> "/Users/pllarroy/devel/mxnet/mxnet_release/tests/python/
> unittest/test_image.py",
> line 94, in test_imdecode
> assert_almost_equal(image.asnumpy(), cv_image)
>   File
> "/Users/pllarroy/devel/mxnet/mxnet_release/python/mxnet/test_utils.py",
> line 493, in assert_almost_equal
> raise AssertionError(msg)
> AssertionError:
> Items are not equal:
> Error 980.392157 exceeds tolerance rtol=0.10, atol=0.00.  Location
> of maximum error:(100, 457, 1), a=103.00, b=102.00
>  a: array([[[  0,  10,  20],
> [  0,  11,  19],
> [  2,  12,  19],...
>  b: array([[[  0,  10,  20],
> [  0,  11,  19],
> [  2,  12,  19],...
>
> --
> Ran 505 tests in 2018.430s
>
>
> Is this one known?
>
> The other unit tests passed.
>
> On Thu, Jun 21, 2018 at 2:09 PM Haibin Lin 
> wrote:
>
> > +1
> >
> > Built from source with CUDA on Ubuntu.
> >
> > Ran example/gluon/word_language_model/train.py
> >
> > Best,
> > Haibin
> >
> >
> > On Thu, Jun 21, 2018 at 11:08 AM, Anirudh  wrote:
> >
> > > Hi Pedro,
> > >
> > > I think you raised this issue in 1.2.0 release here:
> > > https://lists.apache.org/thread.html/ddc088a21aac179144350ea97353a7
> > > ea885b2765ccb98db08a03ba2d@%3Cdev.mxnet.apache.org%3E
> > > .
> > > I actually forgot about this issue during this release. Having said
> > > that, I think since this works with make, and the number of customers
> > > using cmake with USE_OPENMP=OFF should be considerably small, we should
> > > not block the release for this.
> > > The main reason we are doing this release is for this issue
> > > <https://github.com/apache/incubator-mxnet/issues/11091> . Now pulling
> > > in this change for the cmake fix would also mean we need to pull 8 more
> > > commits from dmlc-core, and that is considerable risk to introduce for
> > > the patch release.
> > > This would also mean cutting another rc. I think in the interest of our
> > > customers who are eagerly waiting for the patch release to fix the main
> > > issue, we should move ahead here.
> > > I missed reviewing all the known issues of 1.2.0 and adding them to the
> > > 1.2.1 release notes. I will do that now.
> > >
> > > Anirudh
> > >
> > >
> > >
> > > On Thu, Jun 21, 2018 at 10:42 AM, Pedro Larroy <
> > > pedro.larroy.li...@gmail.com
> > > > wrote:
> > >
> > > > I think I have fixed this before, I will check if the patch didn't
> make
> > > it
> > > > to the branch.
> > > >
> > > > On Thu, Jun 21, 2018 at 10:24 AM Pedro Larroy <
> > > > pedro.larroy.li...@gmail.com>
> > > > wrote:
> > > >
> > > > > -1   I can't compile:
> > > > >
> > > > > 3rdparty/dmlc-core/libdmlc.a(io.cc.o): In function
> > > > > `std::thread::thread<dmlc::ThreadedIter<dmlc::io::InputSplitBase::Chunk>::Init(std::function<bool (dmlc::io::InputSplitBase::Chunk**)>, std::function<void ()>)::{lambda()#1}&>(dmlc::ThreadedIter<dmlc::io::InputSplitBase::Chunk>::Init(std::function<bool (dmlc::io::InputSplitBase::Chunk**)>, std::function<void ()>)::{lambda()#1}&)':
> > > > > /usr/include/c++/5/thread:137: undefined reference to
> > `pthread_create'
> > > > > collect2: error: ld returned 1 exit status
> > > > > ninja: build stopped: subcommand failed.
> > > > >
> > > > >
> > > > > No LSB modules are available.
> > > > > Distributor ID: Ubuntu
> > > > > Description:Ubuntu 16.04.4 LTS
> > > > > Release:16.04
> > > > > Codename:   xenial
> > > > >
> > > > >
> > > > > My build script:
> > > > >
> > > > >
> >

Re: [VOTE] Release MXNet version 1.2.1.RC0 (Patch Release)

2018-06-21 Thread Anirudh
Hi Pedro,

I have seen this with the DEBUG flag on, not without it.
I opened an issue here some time back:
https://github.com/apache/incubator-mxnet/issues/10856

Anirudh

On Thu, Jun 21, 2018 at 12:16 PM, Pedro Larroy  wrote:

> Got it. Sorry to bring this up, and for the déjà vu :-). Makes sense not
> to consider this issue again, as you suggested. Thanks.
>
>
> I compiled on OSX and ran the unit tests. I also ran the SSD detector and
> it is working fine.
> I compiled on Ubuntu / CPU / debug with OpenMP and ran the unit tests. I
> have the following failure
> in test_gluon_data:test_multi_worker_forked_data_loader:
>
>
> test_gluon_data.test_image_folder_dataset ... ok
> Test should successfully run its course of multi-process/forked data loader
> without errors ... Assertion failure at kmp_runtime.cpp(6479):
> __kmp_thread_pool == __null.
> OMP: Error #13: Assertion failure at kmp_runtime.cpp(6479).
> OMP: Hint: Please submit a bug report with this message, compile and run
> commands used, and machine configuration info including native compiler and
> operating system versions. Faster response will be obtained by including
> all program sources. For information on submitting this issue, please see
> https://bugs.llvm.org/.
> Assertion failure at kmp_runtime.cpp(6479): __kmp_thread_pool == __null.
> OMP: Error #13: Assertion failure at kmp_runtime.cpp(6479).
> OMP: Hint: Please submit a bug report with this message, compile and run
> commands used, and machine configuration info including native compiler and
> operating system versions. Faster response will be obtained by including
> all program sources. For information on submitting this issue, please see
> https://bugs.llvm.org/.
>
>
> This can be reproduced. Has anyone seen this error before? Is this a known
> issue?
>
> Pedro.
>
>
>
> On Thu, Jun 21, 2018 at 11:08 AM Anirudh  wrote:
>
> > Hi Pedro,
> >
> > I think you raised this issue in 1.2.0 release here:
> >
> > https://lists.apache.org/thread.html/ddc088a21aac179144350ea97353a7
> ea885b2765ccb98db08a03ba2d@%3Cdev.mxnet.apache.org%3E
> > .
> > I actually forgot about this issue during this release. Having said
> > that, I think since this works with make, and the number of customers
> > using cmake with USE_OPENMP=OFF should be considerably small, we should
> > not block the release for this.
> > The main reason we are doing this release is for this issue
> > <https://github.com/apache/incubator-mxnet/issues/11091> . Now pulling
> > in this change for the cmake fix would also mean we need to pull 8 more
> > commits from dmlc-core, and that is considerable risk to introduce for
> > the patch release.
> > This would also mean cutting another rc. I think in the interest of our
> > customers who are eagerly waiting for the patch release to fix the main
> > issue, we should move ahead here.
> > I missed reviewing all the known issues of 1.2.0 and adding them to the
> > 1.2.1 release notes. I will do that now.
> >
> > Anirudh
> >
> >
> >
> > On Thu, Jun 21, 2018 at 10:42 AM, Pedro Larroy <
> > pedro.larroy.li...@gmail.com
> > > wrote:
> >
> > > I think I have fixed this before, I will check if the patch didn't make
> > it
> > > to the branch.
> > >
> > > On Thu, Jun 21, 2018 at 10:24 AM Pedro Larroy <
> > > pedro.larroy.li...@gmail.com>
> > > wrote:
> > >
> > > > -1   I can't compile:
> > > >
> > > > 3rdparty/dmlc-core/libdmlc.a(io.cc.o): In function
> > > > `std::thread::thread<dmlc::ThreadedIter<dmlc::io::InputSplitBase::Chunk>::Init(std::function<bool (dmlc::io::InputSplitBase::Chunk**)>, std::function<void ()>)::{lambda()#1}&>(dmlc::ThreadedIter<dmlc::io::InputSplitBase::Chunk>::Init(std::function<bool (dmlc::io::InputSplitBase::Chunk**)>, std::function<void ()>)::{lambda()#1}&)':
> > > > /usr/include/c++/5/thread:137: undefined reference to
> `pthread_create'
> > > > collect2: error: ld returned 1 exit status
> > > > ninja: build stopped: subcommand failed.
> > > >
> > > >
> > > > No LSB modules are available.
> > > > Distributor ID: Ubuntu
> > > > Description:Ubuntu 16.04.4 LTS
> > > > Release:16.04
> > > > Codename:   xenial
> > > >
> > > >
> > > > My build script:
> > > >
> > > >
> > > > #!/bin/bash
> > > > set -e
> > > > set -x
> > > >
> > > > renice -n 19 -p $$

Re: C++ api issue labeling

2018-06-21 Thread Anirudh
Hi Hagay,

I think we should keep these two labels separate since they mean different
things.
The C++ label refers to issues in the MXNet backend, and the CPP package
label refers to the CPP language binding for MXNet.
We can still make the C++ API great again by filtering out CPP package
issues :).

Anirudh


On Thu, Jun 21, 2018 at 11:56 AM, Hagay Lupesko  wrote:

> Hey community,
>
> I was going over the open GitHub issues for MXNet, and noticed that we have
> two labels for the CPP API: "CPP package", "C++"
>
> Wanted to suggest we remove "CPP package" and just stick to "C++"
> This will make it easier for the community to classify issues and focus on
> making the C++ API great again ;)
>
> Let me know if anyone has any concerns; otherwise I will find a committer
> that I can work with to make this change.
>
> Thanks!
> Hagay
>

