(tvm-rfcs) branch main updated: [RFC] Add NNEF frontend (#108)

tqchen Thu, 30 May 2024 17:29:36 -0700

This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm-rfcs.git



The following commit(s) were added to refs/heads/main by this push:
     new f0f982f  [RFC] Add NNEF frontend (#108)
f0f982f is described below

commit f0f982f2bf8168b5953f0193610c0aea977c75a8
Author: Czobor Ágoston Mátyás <73029973+agoston...@users.noreply.github.com>
AuthorDate: Fri May 31 02:29:27 2024 +0200

    [RFC] Add NNEF frontend (#108)
    
    * [RFC] Add NNEF frontend (#108)
    
    * update md
    
    * Add Relax to RFC
---
 rfcs/0108-add-nnef-frontend.md | 132 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 132 insertions(+)

diff --git a/rfcs/0108-add-nnef-frontend.md b/rfcs/0108-add-nnef-frontend.md
new file mode 100644
index 0000000..db7aebc
--- /dev/null
+++ b/rfcs/0108-add-nnef-frontend.md
@@ -0,0 +1,132 @@
+- Feature Name: `NNEF frontend to Relay and Relax`
+- Start Date: 2024-04-11
+- RFC PR: [apache/tvm-rfcs#0108](https://github.com/apache/tvm-rfcs/pull/0108)
+- GitHub Issue: [apache/tvm#0000](https://github.com/apache/tvm/issues/0000)
+
+# Summary
+[summary]: #summary
+
+Add the Khronos Neural Network Exchange Format (NNEF) as a frontend to TVM 
Relay and Relax.
+
+# Motivation
+[motivation]: #motivation
+
+NNEF is an open, standardized format for neural network exchange developed by 
the Khronos Group since 2018 (https://www.khronos.org/nnef). It is aimed at 
deploying trained neural networks from deep learning frameworks to proprietary 
inference engines of neural network hardware vendors. Such inference engines 
often require an offline compilation step for running models more efficiently, 
hence hardware vendors are looking into open source compiler stacks to be 
leveraged. On one hand, ha [...]
+
+The Khronos Group also maintains a set of tools for handling NNEF models. 
Since NNEF is mainly a textual format, these include a parser (with C++ and 
Python interfaces), and conversion tools from other formats. NNEF supports 
conversion from models of various deep learning frameworks, including Caffe, 
TensorFlow (also Lite) and all those that support ONNX, such as PyTorch. 
Creating NNEF models is also possible manually by directly writing the model 
text file(s) (since NNEF is similar to a [...]
+
+For example, loading an NNEF model in Python is as simple as follows:
+
+```python
+import nnef
+graph = nnef.load_graph('example.nnef')
+```
+
+The resulting graph object, containing tensors and operators, can then be 
traversed and processed, for example converted into TVM representation, as done 
in this PR.
+
+The NNEF tools also provide a simple C++ based reference implementation for 
NNEF models, whose main purpose is testing/debugging conversions, and serving 
as a reference for other more efficient inference backends. Furthermore, a 
PyTorch based interpreter is also supported, which is able to execute NNEF 
models via on-the-fly conversion to PyTorch calls, and can also be used as a 
(more efficient) reference.
+
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+We are going to add support for models in NNEF format. The model may be 
provided either as an NNEF model folder, or an `nnef.Graph` object 
+already loaded into memory.
+The conversion is done via the new frontend function:
+```python
+# for relay frontend:
+import tvm.relay as relay
+mod, params = relay.frontend.from_nnef(model, freeze_vars=False)
+```
+- model: either a string / PathLike to an NNEF model folder, or an 
`nnef.Graph` object.
+- freeze_vars: bool (optional), which sets whether the parameters should be 
considered variables or constants for optimization.
+
+```python
+# for relax frontend:
+import tvm.relax as relax
+import tvm.relax.frontend.nnef
+mod = relax.frontend.nnef.from_nnef(model, keep_params_in_input=False)
+```
+- model: either a string / PathLike to an NNEF model folder, or an 
`nnef.Graph` object.
+- keep_params_in_input: bool (optional), sets whether the nnef variables will 
be converted to constants and folded into the model, or need to be given as 
inputs.
+
+
+Example usages (assuming we have a valid NNEF model):
+```python
+import nnef
+from tvm import relay
+
+model_path = 'path/to/model.nnef'
+
+# If modification is warranted, the graph can be imported with `nnef.load_graph`
+graph = nnef.load_graph(model_path)
+
+mod, params = relay.frontend.from_nnef(graph)
+
+# Or the converter can read the graph from path as well
+
+mod, params = relay.frontend.from_nnef(model_path)
+
+```
+
+
+```python
+import nnef
+import tvm.relax as relax
+import tvm.relax.frontend.nnef
+
+model_path = 'path/to/model.nnef'
+
+# If modification is warranted, the graph can be imported with `nnef.load_graph`
+graph = nnef.load_graph(model_path)
+
+mod = relax.frontend.nnef.from_nnef(graph)
+
+# Or the converter can read the graph from path as well
+mod = relax.frontend.nnef.from_nnef(model_path)
+```
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+As this RFC only adds a new frontend, no other features should be affected. 
+
+The process of importing an NNEF model consists of:
+
+- Loading an NNEF model into memory, if a model path is provided, using 
the `nnef.load_graph` function to get an `nnef.Graph` object.
+After this step the model may be modified with functions provided for NNEF 
models before final conversion to TVM.
+- Converting the operations of the Graph, setting inputs, and reading 
parameters one by one.
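
The two steps above can be illustrated with a minimal, self-contained sketch. It is not the frontend's actual implementation: `Op`, `Graph`, `CONVERTERS`, and `convert_graph` are hypothetical stand-ins invented for this illustration, and the output is a plain string rather than a Relay or Relax expression. It only shows the general pattern of walking the operations in order and dispatching each one to a per-operator converter.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical stand-ins for a parsed graph; these are NOT the real nnef classes.
@dataclass
class Op:
    name: str                 # operator name, e.g. "add"
    inputs: List[str]         # names of input tensors
    outputs: List[str]        # names of output tensors
    attribs: Dict[str, object] = field(default_factory=dict)


@dataclass
class Graph:
    inputs: List[str]
    outputs: List[str]
    operations: List[Op]      # assumed to be in topological order


def convert_add(args: List[str], attribs: Dict[str, object]) -> str:
    return f"add({args[0]}, {args[1]})"


def convert_relu(args: List[str], attribs: Dict[str, object]) -> str:
    return f"relu({args[0]})"


# Dispatch table: operator name -> converter function (illustrative only).
CONVERTERS: Dict[str, Callable[[List[str], Dict[str, object]], str]] = {
    "add": convert_add,
    "relu": convert_relu,
}


def convert_graph(graph: Graph, params: List[str]) -> List[str]:
    """Convert operations one by one, tracking the expression for each tensor."""
    # Graph inputs become variables; parameters become constants in this sketch.
    env = {name: f"var({name})" for name in graph.inputs}
    env.update({name: f"const({name})" for name in params})
    for op in graph.operations:
        if op.name not in CONVERTERS:
            raise NotImplementedError(f"unsupported operator: {op.name}")
        args = [env[name] for name in op.inputs]
        # Single-output operators only, to keep the sketch short.
        env[op.outputs[0]] = CONVERTERS[op.name](args, op.attribs)
    return [env[name] for name in graph.outputs]


graph = Graph(
    inputs=["x"],
    outputs=["y"],
    operations=[Op("add", ["x", "w"], ["t"]), Op("relu", ["t"], ["y"])],
)
result = convert_graph(graph, params=["w"])
```

A table-driven design like this keeps each operator mapping independent, so unsupported operators fail with a clear error and new ones can be added without touching the traversal loop.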
+
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+Potential increase in time-cost of unit tests.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+The frontend of NNEF is similar to those of ONNX, PyTorch, and TensorFlow; 
adding it would increase the number of model formats that TVM can process.
+
+# Prior art
+[prior-art]: #prior-art
+
+We are aware of the following projects that currently support importing NNEF 
models:
+
+- https://aimotive.com/aiware
+- https://github.com/sonos/tract
+- https://github.com/fragata-ai/arhat-nnef
+- 
https://rocm.docs.amd.com/projects/MIVisionX/en/latest/model_compiler/README.html
+- https://www.khronos.org/openvx/
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+- Whether test cases can make use of pre-written NNEF models (text files 
with NNEF syntax, such as `graph.nnef`) as a starting point. Currently our test 
cases use separate model folders with prewritten model definitions, and we only 
generate the inputs for those. The 'tests/python/frontend/nnef/models' folder 
contains these test cases.
+- Installation of NNEF and NNEF-Tools to the TVM CI Docker images. We need the 
Docker images to contain an install script which uses git to add NNEF to the CI 
environment, also with lint exceptions to `.nnef` files (mentioned in the 
previous point). It seems to work when the docker images are rebuilt from 
source with the install scripts added, but we are not sure if it is okay.
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+The Khronos Group is actively working on the next major update to the NNEF 
format, whose main purpose is to increase model coverage by adding support for 
dynamic models and custom operators. In the latter case, more involved 
compilation of models carries even more potential, so we plan to add support 
for the next generation as well.
+
+Support for some NNEF operators would only be possible through more complex 
mapping to a sequence of TVM operators, and the less widely used ones were not 
the focus of this initial release. We may add support for such operators in the 
future if required.
