[GitHub] [madlib] kaknikhil opened a new pull request #383: Dl/remove hardcode

GitBox Thu, 02 May 2019 12:05:15 -0700

kaknikhil opened a new pull request #383: Dl/remove hardcode
URL: https://github.com/apache/madlib/pull/383
 
 
   Previous PR for reference https://github.com/apache/madlib/pull/379
   
   JIRA: MADLIB-1308
   
   Previously, gpus_per_host was hard-coded to 4. This commit removes the
   hard-coded value and takes gpus_per_host from the user instead.
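
   As an illustration only, a user-supplied gpus_per_host could be used to
   spread segments across the GPUs of a host in round-robin order. The
   function name device_name_for_segment and the modulo mapping are
   assumptions made for this sketch, not code taken from this PR; only the
   "/gpu:N" and "/cpu:0" device string format comes from TensorFlow.

```python
def device_name_for_segment(segment_id, gpus_per_host):
    """Return a TensorFlow-style device string for a segment.

    Hypothetical helper: the round-robin mapping is an assumption,
    not the logic used in the MADlib code base.
    """
    if segment_id < 0:
        raise ValueError("segment_id must be non-negative")
    if gpus_per_host < 0:
        raise ValueError("gpus_per_host must be non-negative")
    if gpus_per_host == 0:
        # No GPUs configured by the user: fall back to the CPU.
        return "/cpu:0"
    # Wrap around so that more segments than GPUs still map to a valid GPU.
    return "/gpu:{0}".format(segment_id % gpus_per_host)
```

   With gpus_per_host set to 4, segments 0 to 3 would map to four different
   GPUs and segment 4 would wrap around to the first one again.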
   
   We also tried using the TensorFlow function list_local_devices to get
   the number of GPUs per host. It returned the count but hung
   indefinitely on some segments, so we decided not to use it.
   
   We now cache the CUDA_VISIBLE_DEVICES environment variable (which is
   set to -1 on the master) and restore it at the end of the fit function.
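
   The cache-and-restore pattern described above can be sketched with a
   context manager. This is a minimal sketch, not the code from this PR:
   the name preserve_cuda_visible_devices is hypothetical, and it assumes
   the variable should be removed again if it was unset beforehand.

```python
import os
from contextlib import contextmanager

_MISSING = object()  # sentinel: the variable was not set at all


@contextmanager
def preserve_cuda_visible_devices():
    """Cache CUDA_VISIBLE_DEVICES on entry and restore it on exit.

    Hypothetical helper for illustration; not part of MADlib.
    """
    cached = os.environ.get("CUDA_VISIBLE_DEVICES", _MISSING)
    try:
        yield
    finally:
        if cached is _MISSING:
            # It was unset before, so unset it again.
            os.environ.pop("CUDA_VISIBLE_DEVICES", None)
        else:
            # Put the original value (e.g. "-1" on the master) back.
            os.environ["CUDA_VISIBLE_DEVICES"] = cached
```

   Code inside the with block can change CUDA_VISIBLE_DEVICES freely; the
   original value is put back even if the block raises an exception.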
   
   Finally, we dynamically calculate the GPU memory fraction to support
   the case where there are fewer GPUs than segments.
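
   One way such a fraction could be computed is sketched below. The
   formula is an assumption made for illustration (the PR text does not
   give one): it divides a usable share of each GPU, here assumed to be
   0.9, evenly among the segments that have to share that GPU. The names
   gpu_memory_fraction, segments_per_host and usable are hypothetical.

```python
def gpu_memory_fraction(segments_per_host, gpus_per_host, usable=0.9):
    """Return the share of one GPU's memory each segment may claim.

    Assumed formula for illustration only; not taken from the PR.
    """
    if segments_per_host < 1 or gpus_per_host < 1:
        raise ValueError("segments_per_host and gpus_per_host must be >= 1")
    # Ceiling division: the busiest GPU hosts this many segments.
    segments_per_gpu = -(-segments_per_host // gpus_per_host)
    return float(usable) / segments_per_gpu
```

   Under these assumptions, 8 segments on a host with 4 GPUs would each be
   allowed half of the usable share, while 4 segments on 4 GPUs would
   each get the full usable share of their own GPU.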
   
   
   Also fixed a bug in the get device name function for Postgres.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
