jrmccluskey commented on code in PR #35677:
URL: https://github.com/apache/beam/pull/35677#discussion_r2285680282
##########
sdks/python/apache_beam/ml/transforms/embeddings/vertex_ai.py:
##########
@@ -281,3 +294,194 @@ def get_ptransform_for_processing(self, **kwargs) -> beam.PTransform:
     return RunInference(
         model_handler=_ImageEmbeddingHandler(self),
         inference_args=self.inference_args)
+
+
+@dataclass
+class VertexAIMultiModalInput:
+  image: Optional[Image] = None
+  video: Optional[Video] = None
+  contextual_text: Optional[str] = None
Review Comment:
So for something like making the mimeType specifiable, that would likely be composed in the Image object itself (if you look at get_embeddings() [from Vertex](https://cloud.google.com/vertex-ai/generative-ai/docs/reference/python/1.71.0/vertexai.vision_models.MultiModalEmbeddingModel#vertexai_vision_models_MultiModalEmbeddingModel_get_embeddings), those objects are how we construct the request), not something we'd have to explicitly support. If we really wanted to support it now, we'd be rolling our own request code, which I don't think is in our best interest.
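
To make that concrete, here's a rough sketch of what request composition looks like on the Vertex SDK side (the file path and prompt text are illustrative; see the linked get_embeddings() docs for the full surface):

```python
# Rough sketch of Vertex SDK request composition; the file path and
# contextual_text below are made-up values, not anything from this PR.
from vertexai.vision_models import Image, MultiModalEmbeddingModel

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")

# The Image object owns the raw bytes (and with them, effectively the
# mime type), so anything mimeType-related is handled here rather than
# in a field we'd expose on the Beam side.
image = Image(image_bytes=open("example.jpg", "rb").read())

response = model.get_embeddings(
    image=image,
    contextual_text="a photo of something",
)
print(response.image_embedding, response.text_embedding)
```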
I think my main hangup with adding a bunch of wrappers around each field is the ease of use of the transform. The whole point of adding the input adapters was so users wouldn't have to compose the input dataclass themselves; they just pass the fields to us and we do the rest. I do worry about overcomplicating this in anticipation of _maybe_ having additional configuration options added to Vertex later. Admittedly, I'm not as familiar with multimodal embedding frameworks beyond Vertex, so maybe that's a reasonable expectation.
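
For reference, the adapter shape I have in mind is roughly this (the function name is hypothetical, just to show the user-facing surface; it assumes the VertexAIMultiModalInput dataclass lands where this diff puts it):

```python
# Hypothetical sketch of the input-adapter idea: the user hands us plain
# fields and we compose the dataclass (and the Vertex objects) for them.
from typing import Optional

from vertexai.vision_models import Image

# Assumes the dataclass added in this diff.
from apache_beam.ml.transforms.embeddings.vertex_ai import (
    VertexAIMultiModalInput)


def to_multimodal_input(
    image_bytes: Optional[bytes] = None,
    contextual_text: Optional[str] = None) -> VertexAIMultiModalInput:
  """Builds the input dataclass from raw fields on the user's behalf."""
  image = Image(image_bytes=image_bytes) if image_bytes is not None else None
  return VertexAIMultiModalInput(
      image=image, contextual_text=contextual_text)
```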
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]