ccjoechou commented on a change in pull request #48:
URL: https://github.com/apache/tvm-rfcs/pull/48#discussion_r787262754
##########
File path: rfcs/0048-BYOC-Marvell-ML-accelerator-integration.md
##########

@@ -0,0 +1,547 @@
+- Feature Name: (fill me in with a unique identifier, `my_awesome_feature`)
+- Start Date: (fill me in with today's date, YYYY-MM-DD)
+- RFC PR: [apache/tvm-rfcs#0000](https://github.com/apache/tvm-rfcs/pull/0000)
+- GitHub Issue: [apache/tvm#0000](https://github.com/apache/tvm/issues/0000)
+- GitHub pre-RFC PR: [apache/tvm-PR-9730](https://github.com/apache/tvm/pull/9730)
+- GitHub pre-RFC discussion: [BYOC-Marvell](https://discuss.tvm.apache.org/t/pre-rfc-byoc-marvell-ml-ai-accelerator-integration/11691)
+
+# Summary
+[summary]: #summary
+
+Integrate Marvell’s ML/AI accelerator with the TVM BYOC framework in order to bring the TVM ecosystem to Marvell customers.
+
+# Motivation
+[motivation]: #motivation
+
+Marvell MLIP is an ML/AI inference accelerator and is embedded on our ARM Neoverse N2-based OCTEON 10 processor.
+We are building an easy-to-use, open software suite for our customers by integrating and utilizing TVM so that
+we can bring TVM capability and experience to our customers.
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+Based on what the Marvell ML/AI inference accelerator does best, a given pre-trained network model
+will be put through the TVM-Mrvl-BYOC AOT compilation and code-gen flow illustrated in the steps below.
+
+STEP (1) Run TVM-Mrvl-BYOC AOT ML Frontend Compilation and Mrvl-BYOC code-gen.
+The steps involved in this are:
+
+* Load the pre-trained network into a TVM IR graph
+
+* Do Marvell-specific layout conversions to transform the IR graph in order to meet requirements of the accelerator
+
+* Do Marvell-specific composite-merging/fusing to transform the IR graph in order to utilize available HW capability

Review comment:
Let me raise a difference here:

* The TVM partition’s sub-graph seems to represent a single relay function, which can include multiple frontend operators captured by utilizing the relay merge-composite pattern.
* A Marvell sub-graph, by contrast, is a connected graph of multiple relay merge-composite functions.

I did not know how to include a figure in the RFC file before (now I do). But if you look at the listed pre-RFC link, we did include figures at the end of the corresponding pre-RFC on the discuss forum — please check the end of the pre-RFC and its figures to see whether they help explain the definition of Marvell sub-graphs here: https://discuss.tvm.apache.org/t/pre-rfc-byoc-marvell-ml-ai-accelerator-integration/11691.

We have also upstreamed TVM GitHub PR-9730 as a POC (it can be downloaded via `git clone https://github.com/ccjoechou/tvm.git`; the changes are on the `byoc-mrvl` branch). Please see the `partition_for_mrvl()` function’s `seq` setup in the `tvm/python/tvm/relay/op/contrib/mrvl.py` file. There is also the `test_mrvl` suite, which can be run to generate JSON files for the ssd-resnet50 network.

[Using our definition of sub-graph — not the TVM partition’s definition of sub-graph.] Yes, the limitation of at most one mrvl sub-graph and at most one llvm sub-graph can be relaxed later on, once we have the runtime and driver hookups ready, and once the driver and firmware of our HW accelerator are also ready to handle multiple sub-graphs. We will be spending time on this area in the next few months.
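To make the sub-graph distinction above concrete, here is a minimal, self-contained toy sketch of the grouping idea: treat each merge-composite function as a node tagged with a target backend, and form a "Marvell sub-graph" as a connected component of same-backend nodes. The node names, backend tags, and graph shape are all hypothetical illustrations; this is not the actual `partition_for_mrvl()` implementation in `mrvl.py`.

```python
# Conceptual sketch only: a toy connected-components pass that mirrors the idea
# of a "Marvell sub-graph" as a connected set of relay merge-composite
# functions. All names and the example graph below are hypothetical.
from collections import defaultdict

def group_subgraphs(nodes, edges):
    """Group nodes into per-backend connected components.

    nodes: dict mapping node name -> backend tag (e.g. "mrvl" or "llvm")
    edges: list of (src, dst) pairs in the dataflow graph
    Returns a list of (backend, sorted list of node names) components.
    """
    adj = defaultdict(set)
    for src, dst in edges:
        # Only nodes assigned to the same backend can join one sub-graph.
        if nodes[src] == nodes[dst]:
            adj[src].add(dst)
            adj[dst].add(src)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], []
        while stack:  # depth-first walk of one component
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            comp.append(n)
            stack.extend(adj[n] - seen)
        components.append((nodes[start], sorted(comp)))
    return components

# Toy graph: two composites offloaded to mrvl, a final softmax left on llvm.
nodes = {"conv2d_relu": "mrvl", "dense_add": "mrvl", "softmax": "llvm"}
edges = [("conv2d_relu", "dense_add"), ("dense_add", "softmax")]
print(group_subgraphs(nodes, edges))
# -> [('mrvl', ['conv2d_relu', 'dense_add']), ('llvm', ['softmax'])]
```

In this toy example the two mrvl composites collapse into a single Marvell sub-graph while the softmax forms one llvm sub-graph — i.e., exactly the at-most-one-of-each shape that the current POC supports, per the limitation described above.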