yeandy commented on code in PR #22795:
URL: https://github.com/apache/beam/pull/22795#discussion_r950237304
##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -234,16 +263,17 @@ def run_inference(
# If elements in `batch` are provided as a dictionaries from key to Tensors,
# then iterate through the batch list, and group Tensors to the same key
key_to_tensor_list = defaultdict(list)
- for example in batch:
- for key, tensor in example.items():
- key_to_tensor_list[key].append(tensor)
- key_to_batched_tensors = {}
- for key in key_to_tensor_list:
- batched_tensors = torch.stack(key_to_tensor_list[key])
- batched_tensors = _convert_to_device(batched_tensors, self._device)
- key_to_batched_tensors[key] = batched_tensors
- predictions = model(**key_to_batched_tensors, **inference_args)
- return [PredictionResult(x, y) for x, y in zip(batch, predictions)]
+ with torch.no_grad():
Review Comment:
Wrapping inference in `torch.no_grad()` is necessary to prevent the following warning:

`Setting to CPU due to an exception while deserializing
state_dict_path. Exception: CUDA out of memory. Tried to allocate 16.00 MiB
(GPU 0; 39.59 GiB total capacity; 666.60 MiB already allocated; 13.19 MiB free;
682.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated
memory try setting max_split_size_mb to avoid fragmentation. See documentation
for Memory Management and PYTORCH_CUDA_ALLOC_CONF.`
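
For context, here is a minimal sketch of the pattern this change introduces: grouping keyed tensors, stacking them, and calling the model inside `torch.no_grad()` so that autograd does not record a graph or retain intermediate activations. `TinyModel` and `run_keyed_inference` are hypothetical names used only for illustration and are not part of Beam; the device defaults to CPU so the snippet runs anywhere. Whether `no_grad()` alone is enough to avoid the out-of-memory condition will depend on the model and batch size.

```python
from collections import defaultdict

import torch


class TinyModel(torch.nn.Module):
    """Hypothetical stand-in model that accepts a keyword tensor `x`."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)


def run_keyed_inference(batch, model, device="cpu"):
    """Illustrative helper: batch dict-keyed tensors and predict without autograd."""
    # Group tensors that share a key across all examples in the batch.
    key_to_tensor_list = defaultdict(list)
    for example in batch:
        for key, tensor in example.items():
            key_to_tensor_list[key].append(tensor)

    # no_grad() disables gradient tracking, so no autograd graph is built
    # and activations are not kept alive for a backward pass.
    with torch.no_grad():
        key_to_batched_tensors = {
            key: torch.stack(tensors).to(device)
            for key, tensors in key_to_tensor_list.items()
        }
        predictions = model(**key_to_batched_tensors)

    # Pair each input example with its row of the prediction tensor.
    return list(zip(batch, predictions))


model = TinyModel().eval()
batch = [
    {"x": torch.tensor([1.0, 2.0])},
    {"x": torch.tensor([3.0, 4.0])},
]
results = run_keyed_inference(batch, model)
```

Each prediction returned this way has `requires_grad` set to `False`, which is a quick way to confirm that no graph was recorded during the forward pass.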