Lunderberg opened a new pull request, #16992:
URL: https://github.com/apache/tvm/pull/16992

   Prior to this commit, using `disco.Session` methods to transfer `NDArray` 
instances to workers could raise an exception if the `NDArray` is larger than 
the buffer allocated by the OS for the controller/worker pipe.  In these cases, 
the first call to the `Read` method of `tvm::support::Pipe` would successfully 
return, but only with the initial bytes of the `NDArray`.  Receiving the full 
`NDArray` requires repeatedly calling the POSIX `read` function.
   
   This commit updates the `Read` and `Write` methods of `tvm::support::Pipe` 
to repeatedly call the underlying read/write methods, until the full `NDArray` 
has been transferred.
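
The retry loop described above can be sketched roughly as follows. This is a minimal illustration using plain POSIX file descriptors, not the actual `tvm::support::Pipe` code; the names `ReadAll`, `WriteAll`, and `RoundTrip` are hypothetical and chosen only for this example.

```cpp
#include <cerrno>
#include <cstddef>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not TVM's API): keep calling read() until `size`
// bytes have arrived, EOF is reached, or an error occurs.
// Returns the number of bytes read, or -1 on error.
ssize_t ReadAll(int fd, void* buf, size_t size) {
  char* ptr = static_cast<char*>(buf);
  size_t done = 0;
  while (done < size) {
    ssize_t n = read(fd, ptr + done, size - done);
    if (n < 0) {
      if (errno == EINTR) continue;  // interrupted by a signal, retry
      return -1;
    }
    if (n == 0) break;  // EOF: the writer closed its end early
    done += static_cast<size_t>(n);
  }
  return static_cast<ssize_t>(done);
}

// Hypothetical helper (not TVM's API): keep calling write() until all
// `size` bytes have been accepted by the kernel, or an error occurs.
ssize_t WriteAll(int fd, const void* buf, size_t size) {
  const char* ptr = static_cast<const char*>(buf);
  size_t done = 0;
  while (done < size) {
    ssize_t n = write(fd, ptr + done, size - done);
    if (n < 0) {
      if (errno == EINTR) continue;  // interrupted by a signal, retry
      return -1;
    }
    done += static_cast<size_t>(n);
  }
  return static_cast<ssize_t>(done);
}

// Sends `size` bytes from a child process to the parent through a pipe
// and checks that every byte arrives intact.
bool RoundTrip(size_t size) {
  int fds[2];
  if (pipe(fds) != 0) return false;
  pid_t pid = fork();
  if (pid < 0) return false;
  if (pid == 0) {  // child: plays the role of the sender
    close(fds[0]);
    std::vector<char> payload(size);
    for (size_t i = 0; i < size; ++i) payload[i] = static_cast<char>(i % 251);
    ssize_t n = WriteAll(fds[1], payload.data(), size);
    close(fds[1]);
    _exit(n == static_cast<ssize_t>(size) ? 0 : 1);
  }
  close(fds[1]);  // parent: plays the role of the receiver
  std::vector<char> received(size);
  ssize_t n = ReadAll(fds[0], received.data(), size);
  close(fds[0]);
  int status = 0;
  waitpid(pid, &status, 0);
  if (n != static_cast<ssize_t>(size)) return false;
  for (size_t i = 0; i < size; ++i) {
    if (received[i] != static_cast<char>(i % 251)) return false;
  }
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling `RoundTrip(1 << 20)` sends 1 MiB, which is larger than the default pipe capacity on common Linux configurations, so a single `read` call would typically return only part of the payload.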
   
   This commit does not add any unit tests, as the existing unit test 
`tests/python/disco/test_ccl.py::test_attention[nccl-ProcessSession]` requires 
this change to pass.


