SaiShashank12 opened a new pull request, #36369: URL: https://github.com/apache/beam/pull/36369
## Title: #36368 **Add Triton Inference Server support for RunInference transform**

## Issue Description

### Summary

This PR adds support for Triton Inference Server in Apache Beam’s RunInference transform by implementing a `TritonModelHandler` class.

### What does this PR do?

- Implements a `TritonModelHandler` that extends `ModelHandler[str, PredictionResult, Model]`
- Enables inference on text data using Triton Inference Server models
- Supports batch processing of text strings through the Beam pipeline
- Handles model loading, initialization, and inference execution with the Triton server

### Key Features

- **Model Loading**: Initializes the Triton server with a configurable model repository and model name
- **Batch Inference**: Processes sequences of text strings efficiently
- **Result Handling**: Parses JSON responses from Triton and returns structured `PredictionResult` objects
- **Flexible Configuration**: Supports custom inference arguments

### Use Case

This handler lets users leverage Triton Inference Server’s optimized inference capabilities within Apache Beam pipelines. It is particularly useful for:

- Text classification tasks
- Document processing pipelines
- Real-time and batch ML inference workloads

A hedged sketch of such a handler, and an example pipeline using it, follow the testing checklist below.

### Testing

- [ ] Unit tests added
- [ ] Integration tests with Triton server
- [ ] Documentation updated
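
Since the PR’s implementation isn’t shown in this description, here is a minimal, hypothetical sketch of what a Triton-backed `ModelHandler` could look like. It assumes a Triton server is already running and reachable over HTTP and uses the official `tritonclient` package; the class name `TritonHttpModelHandler` and the tensor names `text_input`/`text_output` are illustrative assumptions, not the PR’s actual code (which, given the model-repository configuration mentioned above, may instead embed Triton in-process).

```python
# A minimal, hypothetical sketch of a Triton-backed ModelHandler. NOT the
# PR's actual implementation: this variant assumes a Triton server is
# already running and reachable over HTTP, and that the served model
# exposes BYTES tensors named "text_input" and "text_output". All of
# those names, plus the class name itself, are illustrative assumptions.
from typing import Any, Dict, Iterable, Optional, Sequence

import numpy as np
import tritonclient.http as httpclient

from apache_beam.ml.inference.base import ModelHandler, PredictionResult


class TritonHttpModelHandler(ModelHandler[str, PredictionResult,
                                          httpclient.InferenceServerClient]):
  """Sends batches of strings to a remote Triton server over HTTP."""

  def __init__(self, url: str, model_name: str):
    self._url = url
    self._model_name = model_name

  def load_model(self) -> httpclient.InferenceServerClient:
    # The "model" held by RunInference is just a client handle; the
    # weights live in the Triton server's model repository.
    return httpclient.InferenceServerClient(url=self._url)

  def run_inference(
      self,
      batch: Sequence[str],
      model: httpclient.InferenceServerClient,
      inference_args: Optional[Dict[str, Any]] = None,
  ) -> Iterable[PredictionResult]:
    # Pack the whole Beam batch into one [batch_size, 1] BYTES tensor.
    data = np.array([[s.encode('utf-8')] for s in batch], dtype=np.object_)
    infer_input = httpclient.InferInput('text_input', list(data.shape),
                                        'BYTES')
    infer_input.set_data_from_numpy(data)
    response = model.infer(model_name=self._model_name, inputs=[infer_input])
    outputs = response.as_numpy('text_output')
    # Pair each input string with its corresponding model output.
    return [
        PredictionResult(example, inference)
        for example, inference in zip(batch, outputs)
    ]
```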

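For context, wiring such a handler into a pipeline with `RunInference` might look like the following. The input strings and the model name `text_classifier` are placeholders; `TritonHttpModelHandler` is the hypothetical handler sketched above.

```python
# Hypothetical pipeline usage of the handler sketched above; the model
# name "text_classifier" and the example strings are placeholders.
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference

with beam.Pipeline() as pipeline:
  _ = (
      pipeline
      | 'CreateExamples' >> beam.Create(['a first document', 'a second one'])
      | 'TritonInference' >> RunInference(
          TritonHttpModelHandler(url='localhost:8000',
                                 model_name='text_classifier'))
      | 'PrintResults' >> beam.Map(print))
```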