[GitHub] [beam] yeandy commented on a diff in pull request #22795: Fix gpu to cpu conversion with warning logs

GitBox Fri, 19 Aug 2022 07:19:49 -0700


yeandy commented on code in PR #22795:
URL: https://github.com/apache/beam/pull/22795#discussion_r950237304



##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -234,16 +263,17 @@ def run_inference(
     # If elements in `batch` are provided as a dictionaries from key to Tensors,
     # then iterate through the batch list, and group Tensors to the same key
     key_to_tensor_list = defaultdict(list)
-    for example in batch:
-      for key, tensor in example.items():
-        key_to_tensor_list[key].append(tensor)
-    key_to_batched_tensors = {}
-    for key in key_to_tensor_list:
-      batched_tensors = torch.stack(key_to_tensor_list[key])
-      batched_tensors = _convert_to_device(batched_tensors, self._device)
-      key_to_batched_tensors[key] = batched_tensors
-    predictions = model(**key_to_batched_tensors, **inference_args)
-    return [PredictionResult(x, y) for x, y in zip(batch, predictions)]
+    with torch.no_grad():

Review Comment:
   The `torch.no_grad()` context is necessary to prevent the following warning.
   Without it, autograd records the forward pass and keeps intermediate
   tensors for a backward pass that inference never runs, which increases GPU
   memory use.

   `Setting to CPU due to an exception while deserializing 
state_dict_path. Exception: CUDA out of memory. Tried to allocate 16.00 MiB 
(GPU 0; 39.59 GiB total capacity; 666.60 MiB already allocated; 13.19 MiB free; 
682.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated 
memory try setting max_split_size_mb to avoid fragmentation.  See documentation 
for Memory Management and PYTORCH_CUDA_ALLOC_CONF.`
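
   For illustration, here is a minimal sketch of the pattern the diff moves
   to: grouping keyed tensors, stacking them, and calling the model inside
   `torch.no_grad()`. `TinyModel` and `run_batch` are hypothetical names
   made up for this example, and `.to(device)` stands in for the PR's
   `_convert_to_device` helper; this is not the code from the PR itself.

```python
from collections import defaultdict

import torch


class TinyModel(torch.nn.Module):
  """Hypothetical stand-in for a model that takes keyword tensors."""
  def __init__(self):
    super().__init__()
    self.linear = torch.nn.Linear(2, 1)

  def forward(self, x):
    return self.linear(x)


def run_batch(model, batch, device=torch.device('cpu')):
  # Group tensors that share a key, mirroring the loop in the diff.
  key_to_tensor_list = defaultdict(list)
  for example in batch:
    for key, tensor in example.items():
      key_to_tensor_list[key].append(tensor)

  # Inside no_grad(), autograd does not record the forward pass, so the
  # outputs carry no graph and no activations are kept for backward().
  with torch.no_grad():
    key_to_batched_tensors = {
        key: torch.stack(tensors).to(device)
        for key, tensors in key_to_tensor_list.items()
    }
    return model(**key_to_batched_tensors)


model = TinyModel().eval()
batch = [{'x': torch.ones(2)}, {'x': torch.zeros(2)}]
predictions = run_batch(model, batch)
```

   The returned `predictions` tensor has `requires_grad == False`, which is
   a quick way to confirm that no autograd graph was built for the batch.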



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

