andrewmusselman opened a new pull request, #1323:
URL: https://github.com/apache/mahout/pull/1323

   Fixes issue #1320.
    
   Adds `_select_torch_device(torch, device_id)` in `qumat_qdp.loader`. It:
   - returns `"cpu"` when CUDA isn't available,
   - raises `ValueError` on out-of-range `device_id` (preserves prior
     contract),
   - checks `torch.cuda.get_device_capability(device_id)` against
     `torch.cuda.get_arch_list()` and falls back to `"cpu"` with a clear
     `warnings.warn` when the device's `sm_NN` isn't in the list,
   - otherwise returns `f"cuda:{device_id}"`.

   Both `qumat_qdp.loader.QuantumDataLoader._create_pytorch_iterator` and
   `qumat_qdp.api.QdpBenchmark._run_throughput_pytorch` use the new helper
   (the latter previously duplicated the same incomplete selection logic).
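
The selection steps listed above can be sketched roughly as follows. This is an illustrative reconstruction from the description, not the code in the PR: the name `select_torch_device_sketch`, the order of the checks, and the message wording are assumptions. It relies only on `torch.cuda.is_available()`, `torch.cuda.device_count()`, `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`.

```python
import warnings


def select_torch_device_sketch(torch, device_id):
    """Return "cpu" or "cuda:N" following the steps in the description.

    `torch` is the already-imported torch module, passed in as the
    helper's signature in the description suggests.
    """
    # No CUDA runtime or no visible GPU: use the CPU.
    if not torch.cuda.is_available():
        return "cpu"

    # Keep the prior contract: an out-of-range index is a caller error.
    device_count = torch.cuda.device_count()
    if not 0 <= device_id < device_count:
        raise ValueError(
            f"device_id {device_id} is out of range for "
            f"{device_count} CUDA device(s)"
        )

    # Compare the device's compute capability with the architectures
    # this PyTorch build was compiled for (entries look like "sm_70").
    major, minor = torch.cuda.get_device_capability(device_id)
    arch = f"sm_{major}{minor}"
    if arch not in torch.cuda.get_arch_list():
        warnings.warn(
            f"GPU {device_id} is {arch}, which this PyTorch build was "
            "not compiled for; falling back to CPU.",
            UserWarning,
            stacklevel=2,
        )
        return "cpu"

    return f"cuda:{device_id}"
```

Passing the module in as an argument means the function never imports torch itself, so it can be exercised with a stub object on machines without a GPU.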
    
   `testing/qdp_python/test_torch_ref.py` gets a mirror helper
   `_torch_cuda_usable()` used by the two `@skipif`s that previously only
   checked `is_available()`.
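
A test-side mirror of that check might look like the sketch below. The name `torch_cuda_usable_sketch`, the optional `torch_module` parameter, and the choice to inspect only device 0 are assumptions made for illustration; the helper in the PR may differ.

```python
def torch_cuda_usable_sketch(torch_module=None):
    """Return True only if CUDA is available and device 0 is supported.

    `torch_module` is a hypothetical hook that lets a stub be injected;
    when omitted, the real torch is imported if it is installed.
    """
    if torch_module is None:
        try:
            import torch as torch_module
        except ImportError:
            return False

    cuda = torch_module.cuda
    if not cuda.is_available() or cuda.device_count() == 0:
        return False

    # Assumption: checking device 0 is enough for the test suite.
    major, minor = cuda.get_device_capability(0)
    return f"sm_{major}{minor}" in cuda.get_arch_list()
```

A GPU test could then be guarded with `@pytest.mark.skipif(not torch_cuda_usable_sketch(), reason="CUDA not available or GPU compute capability not supported by this PyTorch build")`.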
    
   After this change, on an incompatible GPU:
   - 8 pytorch-backend loader tests + 4 benchmark tests fall back
     to CPU and pass (each emits one `UserWarning`),
   - the 5 explicit GPU tests skip with `"CUDA not available or GPU
     compute capability not supported by this PyTorch build"`.

   Verified on Linux + GTX 1060 (sm_61) with a PyTorch wheel targeting
   sm_70+: tests that previously errored with
   `cudaErrorNoKernelImageForDevice` now pass or skip cleanly.

