KellenSunderland opened a new pull request #13898: Fix launch bounds in spatial 
transformer
URL: https://github.com/apache/incubator-mxnet/pull/13898
 
 
   ## Description ##
   Without `__launch_bounds__`, the compiler is not required to keep register 
usage low enough for a kernel to fit 1024 threads per block. Our internal CI 
build with CUDA 10 was failing on V100 because of this.
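   
   As a rough way to check whether a kernel is affected, the CUDA runtime can 
   report the per-thread register count and the largest block size a compiled 
   kernel supports. The sketch below is not part of this PR; `SampleKernel` is 
   a hypothetical stand-in, not one of the MXNet kernels. A kernel whose 
   reported limit is below the launched block size would be expected to fail 
   at launch with an out-of-resources error.
   
   ```cuda
   #include <cstdio>
   #include <cuda_runtime.h>

   // Hypothetical stand-in kernel (assumption: not the MXNet kernel).
   // It carries no __launch_bounds__, so the compiler picks register usage freely.
   __global__ void SampleKernel(const float* in, float* out, int n) {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
     if (i < n) out[i] = in[i] * 2.0f;
   }

   int main() {
     // Ask the runtime what the compiled kernel actually requires.
     cudaFuncAttributes attr;
     cudaError_t err = cudaFuncGetAttributes(&attr, SampleKernel);
     if (err != cudaSuccess) {
       std::printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
       return 1;
     }
     // numRegs: registers used per thread.
     // maxThreadsPerBlock: largest block size this kernel can be launched with.
     std::printf("registers per thread:  %d\n", attr.numRegs);
     std::printf("max threads per block: %d\n", attr.maxThreadsPerBlock);
     return 0;
   }
   ```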
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR:
   - [x] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [x] Added `__launch_bounds__` guards to 
BilinearSampling[Forward,Backward]Kernel to ensure that the compiled operator 
works on each supported GPU.
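   
   For readers unfamiliar with the qualifier, here is a minimal sketch of how 
   `__launch_bounds__` is applied. It is an illustration under assumptions, 
   not the code changed in this PR: `ScaleKernel` and the local constant 
   `kMaxThreadsPerBlock` are hypothetical names chosen for the example.
   
   ```cuda
   #include <cstdio>
   #include <cuda_runtime.h>

   // Assumed block size for this example (hypothetical local constant).
   constexpr int kMaxThreadsPerBlock = 1024;

   // __launch_bounds__(maxThreadsPerBlock, minBlocksPerMultiprocessor) tells the
   // compiler the largest block size the kernel will be launched with, so it must
   // limit register usage enough for that many threads to fit in one block.
   __global__ void __launch_bounds__(kMaxThreadsPerBlock, 1)
   ScaleKernel(const float* in, float* out, int n) {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
     if (i < n) out[i] = in[i] * 2.0f;
   }

   int main() {
     const int n = 1 << 20;
     float* in = nullptr;
     float* out = nullptr;
     if (cudaMalloc(&in, n * sizeof(float)) != cudaSuccess ||
         cudaMalloc(&out, n * sizeof(float)) != cudaSuccess) {
       std::printf("cudaMalloc failed\n");
       return 1;
     }
     cudaMemset(in, 0, n * sizeof(float));

     // Launch with the full block size promised in __launch_bounds__.
     const int blocks = (n + kMaxThreadsPerBlock - 1) / kMaxThreadsPerBlock;
     ScaleKernel<<<blocks, kMaxThreadsPerBlock>>>(in, out, n);

     // Launch-configuration errors surface here rather than at the call site.
     cudaError_t err = cudaGetLastError();
     if (err == cudaSuccess) err = cudaDeviceSynchronize();
     std::printf("launch status: %s\n", cudaGetErrorString(err));

     cudaFree(in);
     cudaFree(out);
     return err == cudaSuccess ? 0 : 1;
   }
   ```
   
   The second argument is optional; passing 1 only states that at least one 
   block should be resident per multiprocessor and places no further 
   constraint on the compiler.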
   
   ## Comments ##
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
