mbs-octoml commented on a change in pull request #48:
URL: https://github.com/apache/tvm-rfcs/pull/48#discussion_r787217701



##########
File path: rfcs/0048-BYOC-Marvell-ML-accelerator-integration.md
##########
@@ -0,0 +1,547 @@
+- Feature Name: (fill me in with a unique identifier, `my_awesome_feature`)
+- Start Date: (fill me in with today's date, YYYY-MM-DD)
+- RFC PR: [apache/tvm-rfcs#0000](https://github.com/apache/tvm-rfcs/pull/0000)
+- GitHub Issue: [apache/tvm#0000](https://github.com/apache/tvm/issues/0000)
+- GitHub pre-RFC PR: 
[apache/tvm-PR-9730](https://github.com/apache/tvm/pull/9730)
+- GitHub pre-RFC discussion: 
[BYOC-Marvell](https://discuss.tvm.apache.org/t/pre-rfc-byoc-marvell-ml-ai-accelerator-integration/11691)
+
+# Summary
+[summary]: #summary
+
+Integrate Marvell’s ML/AI accelerator with TVM BYOC framework in order to 
bring the TVM ecosystem to Marvell customers.
+
+# Motivation
+[motivation]: #motivation
+
+Marvell MLIP is an ML/AI inference accelerator and is embedded on our ARM 
Neoverse N2-based OCTEON 10 processor.
+  We are building an easy-to-use, open, software suite for our customers by 
integrating and utilizing TVM so that
+  we can bring TVM capability and experience to our customers.
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+Based on what Marvell ML/AI inference accelerator does the best, a given 
pre-trained network model
+will be applied to a TVM-Mrvl-BYOC AOT compilation and code-gen flow as 
illustrated in steps below.
+
+STEP (1) Run TVM-Mrvl-BYOC AOT ML Frontend Compilation and Mrvl-BYOC code-gen. 
The steps involved in this are:
+
+* Load pre-trained network into TVM IR graph
+
+* Do Marvell-specific layout conversions to transform IR graph in order to 
meet requirements of the accelerator
+
+* Do Marvell-specific composite-merging/fusing to transform IR graph in order 
to utilize available HW capability

Review comment:
       Hi, thanks for the RFC. My team at OctoML is looking at bringing some 
training features to the BYOC world (a la 
https://arxiv.org/pdf/2111.00655.pdf), so I'm looking at this RFC with that 
future in mind. Can you expand on:
    - Is the fusion using the existing MergeComposite / AnnotateTarget/ 
MergeCompilerRegions(maybe) / PartitionGraph sequence?
    - other than the global layout xform, which necessarily must be done before 
any fusion etc, are there any other xforms before the above partitioning takes 
place?
    - can you explain the need to limit to one kernel for each of your byoc and 
the default tvm? Perhaps it's an artifact of how you're later trying to capture 
the byoc output in json graph form? Ideally the BYOC target.ext.<your name> 
function could be run multiple times, the resulting runtime::Module would be 
accumulated in the IRModule, and the runtime::Modules later merged. Perhaps 
supporting that would actually be easier and would remove the at-most-one 
kernel limit?
    - Ideally there'd be a single entry point for 'partition for marvel', after 
which the regular TVM build would deal with fusion, lowering and codegen for 
everything that's left (ie overall model - kernels you already partitioned 
out). I may not be following the explanation but it seems you're proposing the 
driver splits things more explicitly.
    - Like @areusch  I'm a bit confused by the special handling of the graph. 
Perhaps it would be worth going through the tensorrt BYOC integration as a 
reference example since it too collects a JSON representation of the 
to-be-complied fused sub-graph (we invoke the TensorRT build function at 
runtime not compile time), but it does so on top of existing machinery. 
   
   Let me know if it would be easier to discuss this on a PR rather than here, 
then we could come back to here.   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to