jrmccluskey commented on code in PR #31536:
URL: https://github.com/apache/beam/pull/31536#discussion_r1631269292


##########
sdks/python/apache_beam/ml/transforms/embeddings/huggingface.py:
##########
@@ -153,6 +154,45 @@ def get_ptransform_for_processing(self, **kwargs) -> beam.PTransform:
         ))
 
 
+class SentenceTransformerImageEmbeddings(EmbeddingsManager):
+  def __init__(self, model_name: str, columns: List[str], **kwargs):
+    """
+    Embedding config for sentence-transformers. This config can be used with
+    MLTransform to embed image data. Models are loaded using the RunInference
+    PTransform with the help of ModelHandler.
+
+    Args:
+      model_name: Name of the model to use. The model should be hosted on
+        HuggingFace Hub or compatible with sentence_transformers. See
+        https://www.sbert.net/docs/sentence_transformer/pretrained_models.html#image-text-models # pylint: disable=line-too-long
+        for a list of sentence_transformers models.
+      columns: List of columns to be embedded.
+      min_batch_size: The minimum batch size to be used for inference.

Review Comment:
   This is a slightly odd case: those parameters are passed up as kwargs
and handled by the `EmbeddingsManager`. I'd be okay with having them
explicitly in the constructor, though.
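
   To make the trade-off concrete, here is a minimal sketch of the two options.
It uses stand-in classes rather than the real Beam ones: `BaseManager`,
`ImplicitEmbeddings`, and `ExplicitEmbeddings` are hypothetical names, the
real `EmbeddingsManager` signature may differ, and the model name is only an
example value.

```python
import inspect
from typing import List, Optional


class BaseManager:
  """Hypothetical stand-in for EmbeddingsManager; not the real Beam class."""
  def __init__(
      self,
      columns: List[str],
      *,
      min_batch_size: Optional[int] = None,
      **kwargs):
    self.columns = columns
    self.min_batch_size = min_batch_size
    self.extra_kwargs = kwargs


class ImplicitEmbeddings(BaseManager):
  """min_batch_size travels through **kwargs, as in the diff above."""
  def __init__(self, model_name: str, columns: List[str], **kwargs):
    super().__init__(columns, **kwargs)
    self.model_name = model_name


class ExplicitEmbeddings(BaseManager):
  """min_batch_size is named in the constructor and forwarded explicitly."""
  def __init__(
      self,
      model_name: str,
      columns: List[str],
      min_batch_size: Optional[int] = None,
      **kwargs):
    super().__init__(columns, min_batch_size=min_batch_size, **kwargs)
    self.model_name = model_name


# Both variants end up with the same state on the instance.
implicit = ImplicitEmbeddings("clip-ViT-B-32", ["image"], min_batch_size=4)
explicit = ExplicitEmbeddings("clip-ViT-B-32", ["image"], min_batch_size=4)

# Only the explicit variant exposes the parameter in its own signature.
implicit_params = inspect.signature(ImplicitEmbeddings.__init__).parameters
explicit_params = inspect.signature(ExplicitEmbeddings.__init__).parameters
```

Both variants behave the same at runtime. The difference is discoverability:
with the explicit form, the parameter documented in the docstring also appears
in the constructor signature, so IDEs and `help()` can show it.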



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
