Lunderberg commented on a change in pull request #9727:
URL: https://github.com/apache/tvm/pull/9727#discussion_r769831152



##########
File path: include/tvm/tir/buffer.h
##########
@@ -55,8 +55,48 @@ class BufferNode : public Object {
   Var data;
   /*! \brief data type in the content of the tensor */
   DataType dtype;
-  /*! \brief The shape of the buffer */
+  /*! \brief The shape of the buffer
+   *
+   * This contains the shape as it is accessed by
+   * BufferLoad/BufferStore nodes, and used by the low-level code
+   * generators.
+   */
   Array<PrimExpr> shape;
+  /*! \brief The shape of the buffer prior to flattening

Review comment:
       I agree, and I'd like to have a follow-up discussion on this at some point, 
as it shows up even at the TE level.  Up until the introduction of 
`axis_separators`, there was only a single possible flattening of a buffer into a 
flat memory space.  With `axis_separators`, there is ambiguity in how a buffer 
should be flattened, which can result in different storage layouts on the device.  
I've sketched out a few possibilities at the TE level, but they should apply 
similarly when called from Relay, since the extra information needed to run a 
generalized device_copy would be required regardless of the calling interface.
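
   For concreteness, here's a tiny illustration (plain Python, not TVM API; the
   helper name is made up) of the two possible flattenings of a shape
   `[2, 3, 4, 5]` buffer: the default flattening produces a single flat axis,
   while `axis_separators=[2]` splits the axes into two groups and produces a
   2-d physical buffer.

   ```python
   # Hypothetical helper, for illustration only: each group of axes between
   # consecutive separators is flattened into a single physical axis.
   def flattened_shape(shape, axis_separators=()):
       physical_shape = []
       start = 0
       for sep in list(axis_separators) + [len(shape)]:
           extent = 1
           for dim in shape[start:sep]:
               extent *= dim
           physical_shape.append(extent)
           start = sep
       return physical_shape

   print(flattened_shape([2, 3, 4, 5]))                       # [120], fully flat
   print(flattened_shape([2, 3, 4, 5], axis_separators=[2]))  # [6, 20], 2-d physical buffer
   ```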
   
   Sketching out a few possibilities below, while they're in mind:
   
   ```python
   # Current workflow
   dev = tvm.device(TARGET_NAME)
   arr = tvm.nd.array(arr_numpy, device=dev) # host->device copy occurs here
   
   schedule, tensors = make_schedule()
   func = tvm.build(schedule, tensors)
   
   func(arr)
   
   
   # Option 1:
   # - Add axis_separators argument to tvm.nd.array
   # - Con: User must know the axis_separators from prior knowledge.
   # - Con: Requires additional information from calling scope.
   dev = tvm.device(TARGET_NAME)
   arr = tvm.nd.array(arr_numpy, device=dev, axis_separators=[2])  # host->device copy occurs here
   
   schedule, tensors = make_schedule()
   func = tvm.build(schedule, tensors)
   
   func(arr)
   
   # Option 2:
   # - Add for_use_as argument to tvm.nd.array
   # - Con: Requires user to define additional argument.
   dev = tvm.device(TARGET_NAME)
   
   schedule, tensors = make_schedule()
   func = tvm.build(schedule, tensors)
   arr = tvm.nd.array(arr_numpy, device=dev, for_use_as=func.args[0])  # host->device copy occurs here
   
   func(arr)
   
   # Option 3:
   # - Delay the host->device data transfer until the use case is known.
   # - Pro: Doesn't require API change from calling scope.
   # - Con: Does require caching data on the host, in case arr_numpy is
   #   changed between the tvm.nd.array and func calls.
   dev = tvm.device(TARGET_NAME)
   
   schedule, tensors = make_schedule()
   func = tvm.build(schedule, tensors)
   arr = tvm.nd.array(arr_numpy, device=dev)  # Cache arr_numpy on host for later use.
   
   func(arr) # host->device copy occurs here
   
   # Option 4:
   # - Delay the host->device data transfer until the use case is known,
   #   but only for devices that have non-flat memory regions
   # - Pro: Doesn't require API change from calling scope.
   # - Pro: Doesn't require caching data on the host in all cases.
   # - Con: Adds additional complexity
   dev = tvm.device(TARGET_NAME)
   
   schedule, tensors = make_schedule()
   func = tvm.build(schedule, tensors)
   # For devices that support multi-dimensional memory, arr_numpy is
   # cached here.  For other devices, host->device copy occurs here.
   arr = tvm.nd.array(arr_numpy, device=dev) 
   
   # For devices that support multi-dimensional memory, host->device copy
   # occurs here.
   func(arr)
   ```
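
   To make Option 3 a bit more concrete, here's a rough standalone sketch of the
   deferred host->device copy (plain Python/numpy for illustration; the
   `LazyDeviceArray` class and its `realize` hook are made up and are not part of
   the TVM API):

   ```python
   import numpy as np

   class LazyDeviceArray:
       """Illustrative only: cache on the host, copy to device on first use."""

       def __init__(self, host_data, device):
           self.host_data = np.asarray(host_data).copy()  # cache on host
           self.device = device
           self.device_data = None                        # copy deferred

       def realize(self, axis_separators=()):
           # First use by a compiled function: flatten according to the
           # consumer's axis_separators, then do the host->device copy
           # (stubbed out here as a reshape).
           if self.device_data is None:
               seps = [0, *axis_separators, self.host_data.ndim]
               shape = [int(np.prod(self.host_data.shape[a:b]))
                        for a, b in zip(seps, seps[1:])]
               self.device_data = self.host_data.reshape(shape)
           return self.device_data

   arr = LazyDeviceArray(np.zeros((2, 3, 4, 5)), device="hexagon")
   print(arr.realize(axis_separators=[2]).shape)  # (6, 20)
   ```

   Option 4 would be the same flow, with the host-side caching made conditional
   on whether the target device actually exposes non-flat memory.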
   



