tvalentyn commented on code in PR #22795:
URL: https://github.com/apache/beam/pull/22795#discussion_r950534882


##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -40,11 +41,30 @@
 def _load_model(
     model_class: torch.nn.Module, state_dict_path, device, **model_params):
   model = model_class(**model_params)
-  model.to(device)
+
+  if device == torch.device('cuda') and not torch.cuda.is_available():
+    logging.warning(
+        "Specified 'GPU', but could not find device. Switching to CPU.")

Review Comment:
   We could add some detail about where this configuration was made. How about something like:
   
           "Model handler specified a 'GPU' device, but GPUs are not available. Switching to CPU.")
   



##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -234,16 +263,17 @@ def run_inference(
    # If elements in `batch` are provided as dictionaries from key to Tensors,
     # then iterate through the batch list, and group Tensors to the same key
     key_to_tensor_list = defaultdict(list)
-    for example in batch:
-      for key, tensor in example.items():
-        key_to_tensor_list[key].append(tensor)
-    key_to_batched_tensors = {}
-    for key in key_to_tensor_list:
-      batched_tensors = torch.stack(key_to_tensor_list[key])
-      batched_tensors = _convert_to_device(batched_tensors, self._device)
-      key_to_batched_tensors[key] = batched_tensors
-    predictions = model(**key_to_batched_tensors, **inference_args)
-    return [PredictionResult(x, y) for x, y in zip(batch, predictions)]
+    with torch.no_grad():

Review Comment:
   Let's capture this information in a way that makes it easier to find for someone reading the code.
   
   Ideally, there would be a GH issue mentioning the error `Setting to CPU due to an exception...`; you would then add a code comment referencing that issue and link this PR to it.
   
   Another option is to capture this info in the pull request description for this PR (slightly more difficult to find, but still easier than tracking down this review comment chain).
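   
   For illustration, a hedged sketch of the kind of in-code note being asked for; the issue number is a placeholder, not a real reference, and the surrounding names (`model`, `key_to_batched_tensors`, `inference_args`) come from the diff above:
   
   ```python
   # Sketch only: NNNNN below is a placeholder issue number.
   with torch.no_grad():
     # Run inference without gradient tracking; see
     # https://github.com/apache/beam/issues/NNNNN for the
     # "Setting to CPU due to an exception..." error that motivated
     # this change.
     predictions = model(**key_to_batched_tensors, **inference_args)
   ```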
   
   



##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -40,11 +41,30 @@
 def _load_model(
     model_class: torch.nn.Module, state_dict_path, device, **model_params):
   model = model_class(**model_params)
-  model.to(device)
+
+  if device == torch.device('cuda') and not torch.cuda.is_available():
+    logging.warning(
+        "Specified 'GPU', but could not find device. Switching to CPU.")
+    device = torch.device('cpu')
+
   file = FileSystems.open(state_dict_path, 'rb')
-  model.load_state_dict(torch.load(file))
+  try:
+    logging.info("Reading state_dict_path %s onto %s", state_dict_path, device)

Review Comment:
   ```suggestion
       logging.info("Loading state_dict_path %s onto a %s device", 
state_dict_path, device)
   ```



##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -40,11 +41,30 @@
 def _load_model(
     model_class: torch.nn.Module, state_dict_path, device, **model_params):
   model = model_class(**model_params)
-  model.to(device)
+
+  if device == torch.device('cuda') and not torch.cuda.is_available():
+    logging.warning(
+        "Specified 'GPU', but could not find device. Switching to CPU.")
+    device = torch.device('cpu')
+
   file = FileSystems.open(state_dict_path, 'rb')
-  model.load_state_dict(torch.load(file))
+  try:
+    logging.info("Reading state_dict_path %s onto %s", state_dict_path, device)
+    state_dict = torch.load(file, map_location=device)
+  except RuntimeError as e:
+    message = "Setting to CPU due to an exception while deserializing" \

Review Comment:
   How about:
   
       Loading the model onto a GPU device failed due to an exception:
       ...
       Attempting to load onto a CPU device instead.
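   
   For illustration, a hedged sketch of how the try/except from the diff might read with this wording; the retry-on-CPU path and the reopened file handle are assumptions based on the truncated diff, not the actual patch:
   
   ```python
   # Sketch only: state_dict_path, device, file, and FileSystems come
   # from the surrounding _load_model code shown in the diff.
   try:
     logging.info(
         "Loading state_dict_path %s onto a %s device", state_dict_path, device)
     state_dict = torch.load(file, map_location=device)
   except RuntimeError as e:
     logging.warning(
         "Loading the model onto a GPU device failed due to an exception:\n%s\n"
         "Attempting to load onto a CPU device instead.", e)
     device = torch.device('cpu')
     # Reopen the handle: torch.load may have partially consumed the stream.
     file = FileSystems.open(state_dict_path, 'rb')
     state_dict = torch.load(file, map_location=device)
   ```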


