tqchen commented on a change in pull request #4602: [Docs] Bring Your Own 
Codegen Guide -- Part 1
URL: https://github.com/apache/incubator-tvm/pull/4602#discussion_r366616774
 
 

 ##########
 File path: docs/dev/relay_bring_your_own_codegen.rst
 ##########
 @@ -0,0 +1,529 @@
+..  Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+..    http://www.apache.org/licenses/LICENSE-2.0
+
+..  Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+=============================
+Bring Your Own Codegen To TVM
+=============================
+**Author**: `Zhi Chen <https://github.com/zhiics>`_, `Cody Hao Yu <https://github.com/comaniac>`_
+
+As the number of hardware devices targeted by deep learning workloads keeps increasing, so does the knowledge users need to achieve high performance on those devices. To free data scientists from worrying about performance when developing a new model, hardware vendors either provide libraries such as MKLDNN or cuDNN that cover many commonly used deep learning operators, or provide frameworks such as TensorRT that let users describe their models in a certain way to achieve high performance. However, users have to learn a new programming interface whenever they attempt to work with a new library or device. As a result, a unified programming interface becomes increasingly important: it 1) keeps all users and hardware vendors on the same page, and 2) provides a feasible solution that allows specialized hardware or libraries to support only widely used operators with extremely high performance, while falling back to general devices such as CPU/GPU for unsupported operators.
+
+In this developer guide, we demonstrate how you, as a hardware vendor, can easily implement your own codegen and register it as a Relay backend compiler to support your hardware device/library. This guide covers two types of codegen, based on the graph representation you need:
+
+**1. You want to generate C code.**
+
+If your hardware already has a well-optimized C/C++ library, such as Intel CBLAS/MKL for CPUs or NVIDIA cuBLAS for GPUs, then this is what you are looking for. Fortunately, a C source code module is fully compatible with the TVM runtime module, which means the generated code can be compiled by any C/C++ compiler with the proper compilation flags. Consequently, the only tasks left are to implement a codegen that generates C code for subgraphs and a C source module that integrates into the TVM runtime module. We will demonstrate how to implement a C code generator for your hardware in the following section.
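
[Editor's note: as a concrete illustration, here is a minimal sketch of the kind of C code such a codegen might emit for a single subgraph, assuming a hypothetical element-wise add subgraph with a fixed shape; the symbol name `gcc_0_` is illustrative, not prescribed by the guide:]

```c++
#include <dlpack/dlpack.h>

// Hypothetical generated kernel for an element-wise add subgraph.
// The shape (10 floats) is assumed to be fixed at codegen time.
extern "C" void gcc_0_(DLTensor* a, DLTensor* b, DLTensor* out) {
  float* a_data = static_cast<float*>(a->data);
  float* b_data = static_cast<float*>(b->data);
  float* out_data = static_cast<float*>(out->data);
  for (int i = 0; i < 10; ++i) {
    out_data[i] = a_data[i] + b_data[i];
  }
}
```

[Any C/C++ compiler can build this into a shared library; the codegen's remaining job is to wrap such functions so the TVM runtime can look them up and call them like any other packed function.]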
 
 Review comment:
   The API will look like:
   ```c++
   // Typed implementation operating directly on DLTensors.
   void json_rt_(DLTensor* x, DLTensor* y, DLTensor* out) {
     // followup code
   }

   // Export it as a packed function named "json_rt" that the TVM runtime can look up.
   TVM_DLL_EXPORT_TYPED_FUNC(json_rt, json_rt_);
   ```
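
   [Editor's note: for context, a minimal sketch of how an exported function like `json_rt` could then be invoked through the TVM runtime, assuming the generated code was compiled into a shared library named `json_rt.so`; the file name, shape, and dtype are assumptions for illustration:]
   ```c++
   #include <tvm/runtime/module.h>
   #include <tvm/runtime/ndarray.h>
   #include <tvm/runtime/packed_func.h>

   int main() {
     // Load the shared library built from the generated C source.
     tvm::runtime::Module mod = tvm::runtime::Module::LoadFromFile("json_rt.so");
     tvm::runtime::PackedFunc f = mod.GetFunction("json_rt");

     // Allocate example tensors; shape/dtype are illustrative assumptions.
     DLContext ctx{kDLCPU, 0};
     tvm::runtime::NDArray x = tvm::runtime::NDArray::Empty({10}, {kDLFloat, 32, 1}, ctx);
     tvm::runtime::NDArray y = tvm::runtime::NDArray::Empty({10}, {kDLFloat, 32, 1}, ctx);
     tvm::runtime::NDArray out = tvm::runtime::NDArray::Empty({10}, {kDLFloat, 32, 1}, ctx);

     // NDArray arguments are converted to DLTensor* when the packed function is called.
     f(x, y, out);
     return 0;
   }
   ```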

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
