Re: [apache/incubator-mxnet] [RFC] MXNet 2.0 JVM Language development (#17783)

2020-07-31 Thread Samuel Audet
> We are looking for a robust solution for MXNet Java developers to use, 
> especially one owned and maintained by the Apache MXNet community. I would be 
> more than happy if you would like to contribute the source code that 
> generates the MXNet JavaCPP package to this repo, so that we can own the 
> maintenance and be responsible to end users for the package's reliability.
> 
> At the beginning, we were discussing several ways to preserve a low-level 
> Java API for MXNet that anyone who uses Java can start with. Most of the 
> problems lay in the ownership and maintenance part. I have added the JavaCPP 
> option as option 5 so we can see which one works best in the end.

Sounds good, thanks! If you have any specific concerns about the above, please 
let me know. JNA seems to be maintained by a single person with apparently no 
connections to the AI industry 
(https://dzone.com/articles/scratch-netbeans-itch-matthias). I, on the other 
hand, already have to maintain APIs as part of my work, at the moment mainly 
for OpenCV, FFmpeg, ONNX Runtime, and TensorFlow, plus others that vary over 
time, and MXNet could eventually become one of them. I also have users paying 
for commercial support of proprietary libraries. So I think JavaCPP is the 
better option here, but I'm obviously biased. :)

-- 
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-mxnet/issues/17783#issuecomment-667041968

Re: [apache/incubator-mxnet] [RFC] MXNet 2.0 JVM Language development (#17783)

2020-07-25 Thread Samuel Audet
> ## What's missing
> 
> javacpp-presets-mxnet doesn't expose APIs from nnvm/c_api.h (some of the 
> current python/gluon APIs depend on APIs in nnvm/c_api.h)

I added that the other day, thanks to @frankfliu for pointing this out: 
https://github.com/bytedeco/javacpp-presets/commit/976e6f7d307b3f3855f39413c494d8f482c9adf6

> See javadoc: http://bytedeco.org/javacpp-presets/mxnet/apidocs/
> 
> 1. The Java class name is “mxnet”, which does not follow Java naming conventions

That's not hardcoded. We can use whatever name we want for that class.

> 2. Each pointer has a corresponding Java class, which is debatable. It's 
> necessary to expose them as strongly typed classes if they are meant to be 
> used directly by end developers, but they really should only be an internal 
> implementation detail of the API. It's overkill to expose them as a type 
> instead of just a pointer.

We can map everything to `Pointer`, that's not a problem either.

> 3. All the classes (except mxnet.java) are hand written.

No, they are not. Everything in the `src/gen` directory here is generated at 
build time:
https://github.com/bytedeco/javacpp-presets/tree/master/mxnet/src/gen/java/org/bytedeco/mxnet

> 4. API mappings are hand-coded as well.

If you're talking about this file, yes, that's the only thing that is written 
manually:
https://github.com/bytedeco/javacpp-presets/blob/master/mxnet/src/main/java/org/bytedeco/mxnet/presets/mxnet.java
The formatting is a bit crappy, since I haven't touched it in a while, but we 
can make it look prettier, like this one:
https://github.com/bytedeco/javacpp-presets/blob/master/onnxruntime/src/main/java/org/bytedeco/onnxruntime/presets/onnxruntime.java

> ## Performance
> 
> JavaCPP native library loading is slow: it takes _2.6 seconds_ on average 
> to initialize libmxnet.so with javacpp.
> 
> Loader.load(org.bytedeco.mxnet.global.mxnet.class);

Something's wrong. That takes less than 500 ms on my laptop, and that includes 
loading OpenBLAS, OpenCV, and a lookup for CUDA and MKL, which can obviously be 
optimized... In any case, we can debug that later to see what is going wrong on 
your end.
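
To compare load times on both ends, it may help to measure them the same way. 
The following is a minimal sketch, not part of JavaCPP: `LoadTimer` and 
`timeMillis` are hypothetical names, and the sleeping task is only a stand-in 
for the `Loader.load(...)` call quoted above, which needs the MXNet presets on 
the classpath.

```java
import java.util.concurrent.TimeUnit;

public class LoadTimer {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    public static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption). For the real measurement, replace the
        // body with: Loader.load(org.bytedeco.mxnet.global.mxnet.class);
        Runnable load = () -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // Run in a fresh JVM each time, since the cold-start cost is what is being compared.
        System.out.println("load took " + timeMillis(load) + " ms");
    }
}
```

Reporting the number from a fresh JVM, along with the platform and the set of 
dependencies on the classpath, would make the two measurements comparable.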

> ## Issues
> 
> The open source code on github doesn't match the binary release on maven 
> central:
> 
> * the maven group and the java package name are different.

Both the group ID and the package names are `org.bytedeco`, but in any case, if 
that gets maintained somewhere here, I imagine it would be changed to something 
like `org.apache.mxnet.xyz.internal.etc`

> * c predict API is not included in maven version

Yes it is: 
http://bytedeco.org/javacpp-presets/mxnet/apidocs/org/bytedeco/mxnet/global/mxnet.html
 
> * Example code doesn't work with maven artifacts, it can only build with 
> snapshot version locally.

https://github.com/bytedeco/javacpp-presets/tree/master/mxnet/samples works 
fine for me on Linux:
```
$ mvn -U clean compile exec:java -Djavacpp.platform.custom -Djavacpp.platform.host -Dexec.args=apple.jpg
...
Downloading from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/mxnet-platform/1.7.0.rc1-1.5.4-SNAPSHOT/maven-metadata.xml
Downloaded from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/mxnet-platform/1.7.0.rc1-1.5.4-SNAPSHOT/maven-metadata.xml (1.3 kB at 2.5 kB/s)
Downloading from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/mxnet-platform/1.7.0.rc1-1.5.4-SNAPSHOT/mxnet-platform-1.7.0.rc1-1.5.4-20200725.115300-20.pom
Downloaded from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/mxnet-platform/1.7.0.rc1-1.5.4-SNAPSHOT/mxnet-platform-1.7.0.rc1-1.5.4-20200725.115300-20.pom (4.7 kB at 9.3 kB/s)
Downloading from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/javacpp-presets/1.5.4-SNAPSHOT/maven-metadata.xml
Downloaded from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/javacpp-presets/1.5.4-SNAPSHOT/maven-metadata.xml (610 B at 1.5 kB/s)
Downloading from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/javacpp-presets/1.5.4-SNAPSHOT/javacpp-presets-1.5.4-20200725.155410-6590.pom
Downloaded from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/javacpp-presets/1.5.4-SNAPSHOT/javacpp-presets-1.5.4-20200725.155410-6590.pom (84 kB at 91 kB/s)
Downloading from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/opencv-platform/4.4.0-1.5.4-SNAPSHOT/maven-metadata.xml
Downloaded from sonatype-nexus-snapshots: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/opencv-platform/4.4.0-1.5.4-SNAPSHOT/maven-metadata.xml (1.2 kB at 2.6 kB/s)
Downloading from sonatype-nexus-snapshots:
```

Re: [apache/incubator-mxnet] [RFC] MXNet 2.0 JVM Language development (#17783)

2020-07-25 Thread Samuel Audet
> @saudet Thanks for your proposal. I have a few questions I would like to ask 
> you:
> 
> 1. If we adopt the JavaCPP package, how will it be consumed? Under bytedeco 
> or Apache MXNet? Based on our previous discussion, we really don't want 
> another third-party check-in.

We can go either way, but I found that projects like MXNet or TensorFlow that 
need to develop high-level APIs on top of something like JavaCPP prefer to have 
control over everything in their own repositories, and use JavaCPP pretty much 
like we would use pybind and pip for Python.

I started the JavaCPP Presets because for projects such as OpenCV, FFmpeg, 
LLVM, etc, high-level APIs for languages other than C/C++ are not being 
developed as part of those projects. I also realized the Java community needed 
something like Anaconda...

> 2. Can you also benchmark the performance of MXNet's API and possibly share 
> reproducible code? We tested the performance of JavaCPP vs JNA vs JNI and 
> didn't see much difference (under 10%).
> 
> * MXImperativeInvokeEx
> * CachedOpForward
> 
> The above two methods are the most frequently used ones for a minimal 
> inference request, so please try these two to see how performance goes.

If you're doing only batch operations, as would be the case for Python 
bindings, you're not going to see much difference, no. What you need to look at 
are things like the Indexer package, which allows us to implement fast custom 
operations in Java like this: http://bytedeco.org/news/2014/12/23/third-release/
You're not going to be able to do that with JNA or JNI without essentially 
recoding that kind of thing.
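
As a rough illustration of what element-wise access to native memory from Java 
looks like, here is a minimal sketch using only `java.nio` direct buffers. It 
does not use JavaCPP's Indexer classes; `FloatView2D` and `relu` are 
hypothetical names, and the row-major layout is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class DirectIndexing {
    /** Minimal 2D row-major view over off-heap floats (hypothetical helper, not JavaCPP's Indexer). */
    public static final class FloatView2D {
        public final FloatBuffer data;
        public final int rows, cols;

        public FloatView2D(int rows, int cols) {
            this.rows = rows;
            this.cols = cols;
            // Direct buffer: the memory lives outside the Java heap, as native tensors do.
            this.data = ByteBuffer.allocateDirect(rows * cols * Float.BYTES)
                                  .order(ByteOrder.nativeOrder())
                                  .asFloatBuffer();
        }

        public float get(int i, int j)          { return data.get(i * cols + j); }
        public void  put(int i, int j, float v) { data.put(i * cols + j, v); }
    }

    /** A custom element-wise operation written in plain Java: in-place ReLU. */
    public static void relu(FloatView2D v) {
        for (int i = 0; i < v.rows; i++) {
            for (int j = 0; j < v.cols; j++) {
                if (v.get(i, j) < 0f) {
                    v.put(i, j, 0f);
                }
            }
        }
    }

    public static void main(String[] args) {
        FloatView2D v = new FloatView2D(2, 3);
        v.put(0, 0, -1.5f);
        v.put(1, 2, 2.0f);
        relu(v);
        System.out.println(v.get(0, 0) + " " + v.get(1, 2));
    }
}
```

The loop touches each element individually without crossing the native 
boundary per element, which is the access pattern where per-call binding 
overhead would otherwise dominate.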

> 3. We do have some additional technical issues with JavaCPP. Is there any 
> plan to fix them? (I will put them in a separate comment since the list is 
> really long.)
> 
> 4. How do you ensure performance if the build flags are different? For 
> example, MXNet has to be built from source (with the necessary modifications 
> to the source code) in order to work with JavaCPP.
> 
> 5. Regarding the dependency issue, can we go without the additional OpenCV 
> and OpenBLAS in the package?

Yes, those are the kinds of issues that would be best dealt with by using only 
JavaCPP as a low-level tool, instead of the presets, which are basically a 
high-level distribution like Anaconda.

Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-mxnet/issues/17783#issuecomment-663916338

Re: [apache/incubator-mxnet] [RFC] MXNet 2.0 JVM Language development (#17783)

2020-07-23 Thread Samuel Audet
Hi, instead of JNA, I would be happy to provide bindings for the C API and 
maintain packages based on the JavaCPP Presets here:
https://github.com/bytedeco/javacpp-presets/tree/master/mxnet
JavaCPP adds no overhead, unlike JNA, and is often faster than manually written 
JNI. Plus JavaCPP provides more tools than JNA to automate the process of 
parsing header files as well as packaging native libraries in JAR files. I have 
been maintaining modules for TensorFlow based on JavaCPP, and we actually got a 
boost in performance when compared to the original JNI code:
https://github.com/tensorflow/java/pull/18#issuecomment-579600568
I would be able to do the same for MXNet and maintain the result in a 
repository of your choice. Let me know if this sounds interesting! BTW, the 
developers of DJL also seem open to switching from JNA to JavaCPP, even though 
it is not a huge priority. Still, standardizing how native bindings are created 
and loaded, in line with other libraries for which JavaCPP is pretty much 
already the standard (such as OpenCV, TensorFlow, CUDA, FFmpeg, LLVM, 
Tesseract), could go a long way in alleviating concerns about stability.
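
For readers unfamiliar with what packaging native libraries in JAR files 
involves, here is a minimal sketch of the general technique using only the 
JDK. It is not JavaCPP's implementation; `NativeExtractor`, `extract`, and the 
resource path are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class NativeExtractor {
    /** Copies a library stream into a fresh temp directory and returns the file's path. */
    public static Path extract(InputStream in, String fileName) throws IOException {
        Path dir = Files.createTempDirectory("native-libs");
        Path target = dir.resolve(fileName);
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        // Deletion runs in reverse registration order: the file first, then the directory.
        dir.toFile().deleteOnExit();
        target.toFile().deleteOnExit();
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical resource path; a real JAR would ship one binary per platform.
        String resource = "/native/linux-x86_64/libexample.so";
        try (InputStream in = NativeExtractor.class.getResourceAsStream(resource)) {
            if (in == null) {
                System.out.println("resource not found: " + resource);
                return;
            }
            Path lib = extract(in, "libexample.so");
            // System.load requires an absolute path to the extracted file.
            System.load(lib.toAbsolutePath().toString());
        }
    }
}
```

A real loader also has to pick the right binary for the current platform and 
load dependent libraries in order, which is the part that tooling automates.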

Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-mxnet/issues/17783#issuecomment-662994965