SaiShashank12 opened a new pull request, #36369:
URL: https://github.com/apache/beam/pull/36369

   
   ## Add Triton Inference Server support for RunInference transform (#36368)
   
   ## Issue Description:
   
   ### Summary
   
   This PR adds support for Triton Inference Server in Apache Beam’s 
RunInference transform by implementing a `TritonModelHandler` class.
   
   ### What does this PR do?
   
   - Implements `TritonModelHandler` that extends `ModelHandler[str, 
PredictionResult, Model]`
   - Enables inference on text data using Triton Inference Server models
   - Supports batch processing of text strings through the Beam pipeline
   - Handles model loading, initialization, and inference execution with the Triton server (a minimal sketch follows this list)
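
   For illustration, here is a minimal sketch of what such a handler can look like. This is not the PR's actual code: the constructor parameters (`model_repository_path`, `model_name`), the input tensor name `text_input`, and the in-process `tritonserver` calls are assumptions, and real response parsing is model-specific.

   ```python
   from typing import Any, Dict, Iterable, Optional, Sequence

   from apache_beam.ml.inference.base import ModelHandler, PredictionResult


   class TritonModelHandler(ModelHandler[str, PredictionResult, Any]):
       """Sketch of a handler that runs text inference on a Triton model."""

       def __init__(self, model_repository_path: str, model_name: str):
           # Hypothetical constructor parameters; the PR's names may differ.
           self._model_repository_path = model_repository_path
           self._model_name = model_name

       def load_model(self) -> Any:
           # Assumption: the in-process `tritonserver` Python API starts a
           # server over the configured repository and returns the model.
           import tritonserver
           server = tritonserver.Server(
               model_repository=self._model_repository_path)
           server.start()
           return server.model(self._model_name)

       def run_inference(
           self,
           batch: Sequence[str],
           model: Any,
           inference_args: Optional[Dict[str, Any]] = None,
       ) -> Iterable[PredictionResult]:
           # Assumption: one response per input string. Per the description,
           # the PR parses Triton's JSON responses before wrapping them;
           # this sketch yields the raw response instead. `inference_args`,
           # if given, would be forwarded to the server call here.
           responses = model.infer(inputs={"text_input": list(batch)})
           for example, response in zip(batch, responses):
               yield PredictionResult(example=example, inference=response)
   ```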
   
   ### Key Features
   
   - **Model Loading**: Initializes Triton server with configurable model 
repository and model name
   - **Batch Inference**: Processes sequences of text strings efficiently
   - **Result Handling**: Parses JSON responses from Triton and returns 
structured `PredictionResult` objects
   - **Flexible Configuration**: Supports custom inference arguments (see the usage sketch below)
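
   As a purely hypothetical usage example, a pipeline could apply the handler through `RunInference` like this (the constructor arguments mirror the assumed ones from the sketch above and are not confirmed against the PR):

   ```python
   import apache_beam as beam
   from apache_beam.ml.inference.base import RunInference

   with beam.Pipeline() as pipeline:
       _ = (
           pipeline
           | "Create" >> beam.Create(["first document", "second document"])
           | "TritonInference" >> RunInference(
               TritonModelHandler(
                   model_repository_path="/models",  # assumed parameter name
                   model_name="text_classifier"))    # hypothetical model name
           | "Print" >> beam.Map(print))
   ```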
   
   ### Use Case
   
   This handler lets users leverage Triton Inference Server’s optimized inference capabilities within Apache Beam pipelines. It is particularly useful for:
   
   - Text classification tasks
   - Document processing pipelines
   - Real-time and batch ML inference workloads
   
   ### Testing
   
   - [ ] Unit tests added
   - [ ] Integration tests with Triton server
   - [ ] Documentation updated
   
   

