## Problem statement

Migration of cpp-package to MXNet 2.0 is currently [in progress](https://github.com/apache/incubator-mxnet/pull/20131). The MXNet 2.0 cpp-package changes include tasks such as renaming some C APIs with the `Ex` suffix, adopting the CachedOp interface for executor forward computation, and adopting the autograd interface for executor backward propagation. This means v1.x cpp-package users can migrate to v2.0 and still use the executor to train or run inference on their models.
However, there are two limitations:

1. The cross-compilation problem is not resolved. As reported in https://github.com/apache/incubator-mxnet/issues/13303 and https://github.com/apache/incubator-mxnet/issues/20222, users still hit failures because [OpWrapperGenerator.py](https://github.com/apache/incubator-mxnet/blob/v1.8.x/cpp-package/scripts/OpWrapperGenerator.py) has to open a target binary on the host machine.
2. cpp-package users still interact with symbols, which limits flexibility in model construction.

## Proposed New C++ Frontend

Based on the above limitations, I propose adopting the design of the Gluon API to build a new version of cpp-package with both flexibility and performance benefits. The design logic closely follows the current Gluon 2.0 design with tracing and deferred compute. I have been building a simple Gluon-based [cpp-package](https://github.com/barry-jin/incubator-mxnet/tree/cpp2/cpp-package) and a [simple demo](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/example/simple_demo.cpp) that implements an end-to-end training process. The following is a high-level overview of some of the APIs.

### [Operator](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/include/mxnet-cpp/operator_rt.h)

The new operators rely on the packed function's runtime API (because this is a runtime API, it will likely resolve limitation 1). First, an [operator map](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/include/mxnet-cpp/op_rt_map.h#L48-L69) is created, mapping each operator's string name to its function handle. Users then only need this [macro](https://github.com/barry-jin/incubator-mxnet/blob/07d3a73eaa7443ed1b211d6788515f275608350c/cpp-package/include/mxnet-cpp/operator_rt.h#L41-L52) to create the operator wrapper in the `mxnet::cpp::op` namespace.
Inside the operator, [type conversion and translation](https://github.com/barry-jin/incubator-mxnet/blob/07d3a73eaa7443ed1b211d6788515f275608350c/cpp-package/include/mxnet-cpp/operator_rt.h#L89-L165) happens. This is similar to [function.py](https://github.com/apache/incubator-mxnet/blob/970a2cfbe77d09ee610fdd70afca1a93247cf4fb/python/mxnet/_ffi/_ctypes/function.py#L56-L96): the type information of each argument is erased, wrapped into an MXNetValue, and passed to the C++ backend.

### [Block](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/include/mxnet-cpp/block.h)

The MXNet C++ `Block` will be the base class of all the basic layers, as in Python. It uses [register_block](https://github.com/barry-jin/incubator-mxnet/blob/07d3a73eaa7443ed1b211d6788515f275608350c/cpp-package/include/mxnet-cpp/block.h#L70-L71) to store pointers to its children in a [vector](https://github.com/barry-jin/incubator-mxnet/blob/07d3a73eaa7443ed1b211d6788515f275608350c/cpp-package/include/mxnet-cpp/block.h#L350-L351). [Tracing and deferred compute](https://github.com/barry-jin/incubator-mxnet/blob/07d3a73eaa7443ed1b211d6788515f275608350c/cpp-package/include/mxnet-cpp/block.h#L312-L340) are used to create the graph.

### [basic_layers](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/include/mxnet-cpp/nn/basic_layers.h)

Currently, only Dense, Activation, and Dropout layers are implemented, for demo purposes. Parameters for the dense layer are registered with the [register_parameter](https://github.com/barry-jin/incubator-mxnet/blob/07d3a73eaa7443ed1b211d6788515f275608350c/cpp-package/include/mxnet-cpp/nn/basic_layers.hpp#L53-L54) method.
### [autograd](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/include/mxnet-cpp/autograd.h)

Since C++ lacks a context-manager mechanism, we have to use [`start_recording()`](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/include/mxnet-cpp/autograd.h) and [`finish_recording()`](https://github.com/barry-jin/incubator-mxnet/blob/07d3a73eaa7443ed1b211d6788515f275608350c/cpp-package/include/mxnet-cpp/autograd.h#L76-L83) to define the autograd recording scope. The [backward](https://github.com/barry-jin/incubator-mxnet/blob/07d3a73eaa7443ed1b211d6788515f275608350c/cpp-package/include/mxnet-cpp/autograd.h#L90-L96) method is a simple wrapper around the C API.

### Others

There are some other simple modules, like [Trainer](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/include/mxnet-cpp/trainer.h) and [loss](https://github.com/barry-jin/incubator-mxnet/blob/cpp2/cpp-package/include/mxnet-cpp/loss.h).

## Example

### A simple example

```C++
#include <chrono>
#include <string>

#include "utils.h"
#include "mxnet-cpp/MxNetCpp.h"

using namespace mxnet::cpp;

// Define a new Model derived from Block
class Model : public gluon::Block {
 public:
  Model() {
    // Register three dense blocks and one dropout block (the user needs to provide shapes)
    dense0 = register_block("dense0", gluon::nn::Dense(64, 784, "relu"));
    dropout0 = register_block("dropout0", gluon::nn::Dropout(0.5));
    dense1 = register_block("dense1", gluon::nn::Dense(32, 64, "sigmoid"));
    dense2 = register_block("dense2", gluon::nn::Dense(10, 32));
  }

  // Model's forward algorithm
  NDArray forward(NDArray x) {
    x = dense0(x);
    x = dropout0(x);
    x = dense1(x);
    x = dense2(x);
    return x;
  }

  gluon::nn::Dense dense0, dense1, dense2;
  gluon::nn::Dropout dropout0;
};

int main(int argc, char** argv) {
  // Create a new model
  Model model;
  // Use the Uniform initializer to initialize the model's parameters with a scale of 0.07
  model.initialize<Uniform>(Uniform(0.07));

  // Use the legacy cpp-package's dataloader
  int batch_size = 32;
  std::vector<std::string> data_files = {
      "./data/mnist_data/train-images-idx3-ubyte",
      "./data/mnist_data/train-labels-idx1-ubyte",
      "./data/mnist_data/t10k-images-idx3-ubyte",
      "./data/mnist_data/t10k-labels-idx1-ubyte"};
  auto train_iter = MXDataIter("MNISTIter");
  if (!setDataIter(&train_iter, "Train", data_files, batch_size)) {
    return 1;
  }

  // Define autograd
  AutoGrad ag(true, true);
  // Define a trainer with the sgd optimizer and 0.5 learning rate
  std::unordered_map<std::string, double> opt_params;
  opt_params["lr"] = 0.5;
  Trainer trainer(model.collect_parameters(), "sgd", opt_params);
  // Define the loss function
  gluon::SoftmaxCrossEntropyLoss loss_fn;

  for (size_t epoch = 1; epoch <= 100; ++epoch) {
    size_t batch_index = 0;
    // Reset the train iterator
    train_iter.Reset();
    // Iterate over the dataloader
    while (train_iter.Next()) {
      // Get a data batch
      auto batch = train_iter.GetDataBatch();
      // Start autograd recording
      ag.start_recording();
      // Apply the model to the input data
      NDArray pred = model(batch.data);
      // Compute the loss value
      NDArray loss = loss_fn(pred, batch.label);
      // Backward propagation on the loss
      ag.backward(loss);
      // Finish autograd recording
      ag.finish_recording();
      // Update parameters with the trainer
      trainer.step();
      NDArray::WaitAll();
      if (++batch_index % 100 == 0) {
        std::cout << "Epoch: " << epoch << " | Batch: " << batch_index
                  << " | Loss: " << loss.item<float>() << std::endl;
      }
    }
  }
}
```

## References

[1] https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/block.py
