tvalentyn commented on code in PR #21818:
URL: https://github.com/apache/beam/pull/21818#discussion_r897687420


##########
sdks/python/apache_beam/ml/inference/pytorch_inference_it_test.py:
##########
@@ -89,6 +90,41 @@ def test_torch_run_inference_imagenet_mobilenetv2(self):
       filename, prediction = prediction.split(',')
       self.assertEqual(_EXPECTED_OUTPUTS[filename], prediction)
 
+  @pytest.mark.uses_pytorch
+  @pytest.mark.it_postcommit
+  def test_torch_run_inference_bert_for_masked_lm(self):
+    test_pipeline = TestPipeline(is_integration_test=True)
+    # Path to text file containing some sentences
+    file_of_sentences = 'gs://apache-beam-ml/datasets/custom/sentences.txt'  # disable: line-too-long
+    output_file_dir = 'gs://apache-beam-ml/testing/predictions'

Review Comment:
   For test output, it's better to use a bucket with a lifecycle configured, so that less clutter is left behind, for example:
   
   :~$ gsutil lifecycle get gs://temp-storage-for-end-to-end-tests/
   {"rule": [{"action": {"type": "Delete"}, "condition": {"age": 14}}]}
   
   Lifecycle configuration may be per-bucket only (not sure if we can configure it just for `./testing/predictions`), so switching outputs for all tests to gs://temp-storage-for-end-to-end-tests/ may be easiest.
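   As a minimal sketch of what that lifecycle policy looks like: the JSON below mirrors the `gsutil lifecycle get` output above (delete objects older than 14 days). The file name `lifecycle.json` and the bucket are illustrative; apply it with `gsutil lifecycle set lifecycle.json gs://<bucket>/` or the Cloud Console.
   
   ```python
   import json
   
   # Lifecycle policy matching the example above: delete any object
   # once it is 14 days old. This is the document that
   # `gsutil lifecycle set` consumes.
   lifecycle_config = {
       "rule": [
           {"action": {"type": "Delete"}, "condition": {"age": 14}}
       ]
   }
   
   # Write the policy to a file for `gsutil lifecycle set lifecycle.json ...`
   with open("lifecycle.json", "w") as f:
       json.dump(lifecycle_config, f)
   
   print(json.dumps(lifecycle_config))
   ```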
   
   cc: @AnandInguva FYI.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
