[GitHub] [incubator-mxnet] leezu commented on issue #18716: [RFC] Use TVMOp with GPU & Build without libcuda.so in CI

GitBox Wed, 15 Jul 2020 08:50:24 -0700


leezu commented on issue #18716:
URL: 
https://github.com/apache/incubator-mxnet/issues/18716#issuecomment-658846227



   > Violates the effort of removing libcuda.so totally, (would be great if 
someone can elaborate the motivation behind it).
   
   Many customers use a single mxnet build that supports gpu features and 
deploy it to both gpu and cpu machines. Due to the way cuda containers are 
designed, libcuda.so won't be present on the cpu machines. That's why it's 
better to dlopen(cuda) only when it is needed. This affects not only tvmop but 
also the nvrtc feature in mxnet.
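
   To illustrate the "dlopen only when needed" idea, here is a minimal sketch. 
It uses Python's ctypes as a stand-in for a dlopen call in C++; the function 
names are hypothetical and are not part of mxnet, and the library name 
libcuda.so.1 is an assumption about how the driver is installed.

```python
import ctypes
from functools import lru_cache


@lru_cache(maxsize=None)
def load_cuda_driver(name="libcuda.so.1"):
    """Try to load the CUDA driver library on first use.

    Returns a ctypes handle, or None when the library cannot be loaded
    (for example on a cpu-only machine). The result is cached per name,
    so the load is attempted at most once.
    """
    try:
        return ctypes.CDLL(name)
    except OSError:
        # Library missing or not loadable: report "no gpu support"
        # instead of failing when the module is imported.
        return None


def gpu_feature_available(name="libcuda.so.1"):
    """Hypothetical guard that gpu-only code paths could check first."""
    return load_cuda_driver(name) is not None
```

   Nothing is loaded at import time, so the same code can run on a machine 
without libcuda.so, and only the code paths that actually need the driver pay 
for the lookup.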
   
   Using the stubs is a workaround for using dlopen, but it adds the 
requirement of modifying LD_LIBRARY_PATH on users' cpu machines. That's 
not always feasible for users. For mxnet 1.6, which introduced nvrtc, users 
typically just disable the nvrtc feature to be able to deploy the same 
libmxnet.so to both cpu and gpu machines. 
   
   Why not fix the underlying problem and then enable tvmop feature?
   
   > Also, When setting -DUSE_TVM_OP=OFF the CI checks would be stuck. 
   
   That doesn't make sense, as we have been running CI successfully with tvm 
op disabled for a couple of months. Maybe you ran into some unrelated flakiness 
and need to retrigger the run? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
