kaknikhil opened a new pull request #383: Dl/remove hardcode
URL: https://github.com/apache/madlib/pull/383

Previous PR for reference: https://github.com/apache/madlib/pull/379

JIRA: MADLIB-1308

Previously, gpus_per_host was hardcoded to 4. This commit removes the hardcoding and takes the value from the user.

We also tried using the TensorFlow function list_local_devices to get the count of GPUs per host. It did return the count, but it would hang forever on some segments, so we decided not to use it.

We now cache the CUDA_VISIBLE_DEVICES environment variable (which is set to -1 for master) and reset it at the end of the fit function.

Finally, we dynamically calculate the GPU memory fraction to support the case where the number of GPUs is less than the number of segments.

Also fixed a bug in the get device name function for postgres.
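
For readers who want a concrete picture of the two mechanisms described above, here is a minimal sketch. It is not the code from this PR: the names `preserved_cuda_visible_devices` and `gpu_memory_fraction` are hypothetical, and the fraction formula (one over the number of segments that may share a GPU) is an assumption about how the split could be computed.

```python
import os
from contextlib import contextmanager


@contextmanager
def preserved_cuda_visible_devices():
    """Cache CUDA_VISIBLE_DEVICES on entry and restore it on exit.

    Hypothetical helper; the PR does this inside the fit function.
    """
    cached = os.environ.get("CUDA_VISIBLE_DEVICES")
    try:
        yield cached
    finally:
        if cached is None:
            # The variable was unset before, so unset it again.
            os.environ.pop("CUDA_VISIBLE_DEVICES", None)
        else:
            os.environ["CUDA_VISIBLE_DEVICES"] = cached


def gpu_memory_fraction(gpus_per_host, segments_per_host):
    """Return the share of one GPU's memory a single segment may use.

    Assumption: segments are spread as evenly as possible over the GPUs,
    so the busiest GPU serves ceil(segments / gpus) segments.
    """
    if gpus_per_host <= 0 or segments_per_host <= 0:
        raise ValueError("both counts must be positive")
    # Integer ceiling division without relying on float division.
    segments_per_gpu = -(-segments_per_host // gpus_per_host)
    return 1.0 / segments_per_gpu
```

With these assumptions, a host with 2 GPUs and 4 segments gives each segment half of a GPU's memory, while a host with at least as many GPUs as segments gives each segment a full GPU. Any changes made to CUDA_VISIBLE_DEVICES inside the `with` block are undone when the block exits, even if training raises an exception.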
