juliethecao opened a new issue, #5041: URL: https://github.com/apache/texera/issues/5041
### Feature Summary

Add a Hugging Face operator to Texera so users can run pretrained models from the Hugging Face Hub directly inside workflows. This feature makes model inference a first-class workflow step, so users can apply text, image, video, and audio models without writing code.

The operator would let users:

- Pick a Hugging Face task such as text generation, summarization, image classification, ASR, or VQA
- Browse/search available models for that task
- Provide the right input column, or upload media via the property panel when the task requires it
- Configure model-specific parameters like prompt, temperature, token limits, and output column name
- Produce a workflow output that can be chained into downstream operators

### Proposed Solution or Design

The operator should work as a guided, task-aware inference component rather than a generic API wrapper. The user picks a task first, then the UI only shows the fields that matter for that task.

A simple flow would look like this:

<img width="940" height="404" alt="Image" src="https://github.com/user-attachments/assets/20f7faf9-cf6a-49af-9b5e-0bcbfa6d2ec7" />

This is a screenshot of a text-generation task: the user asks a question via the input operator, and the Hugging Face model selected from the models list produces the answer as workflow output.

<img width="2620" height="1418" alt="Image" src="https://github.com/user-attachments/assets/c0ef66e8-ecd4-4f64-8760-3d6ffb388238" />

This is a screenshot of an image-classification task: the user provides an image in the property panel, and the chosen model outputs JSON predictions (predicted breeds with confidence scores).
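The task-first behavior described above (pick a task, then show only the fields that matter for it) could be sketched as a small task registry. This is only an illustration, not Texera code: `TaskSpec`, `TASKS`, `visible_fields`, `validate`, and all field and option names are hypothetical. The task keys follow Hugging Face pipeline tag naming, and the option lists are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical description of what one task needs from the user."""
    input_kind: str       # "column" = read an upstream column, "media" = uploaded file
    options: tuple = ()   # task-specific options shown in the property panel


# Illustrative registry. The option names are assumptions, not a real schema.
TASKS = {
    "text-generation": TaskSpec("column", ("prompt", "temperature", "max_tokens")),
    "summarization": TaskSpec("column", ("max_tokens",)),
    "image-classification": TaskSpec("media", ("top_k",)),
}


def input_field(task):
    """Name of the input field the property panel should show for this task."""
    return "input_column" if TASKS[task].input_kind == "column" else "media_upload"


def visible_fields(task):
    """Fields to render for the selected task, in display order."""
    return ["task", "model", input_field(task), *TASKS[task].options, "result_column"]


def validate(task, config):
    """Return a list of problems; an empty list means the config is acceptable."""
    if task not in TASKS:
        return [f"unsupported task: {task}"]
    # Task-specific options are treated as optional in this sketch.
    required = ["model", input_field(task), "result_column"]
    return [f"missing {name}" for name in required if not config.get(name)]
```

For example, `validate("image-classification", {"model": "example-org/example-model", "result_column": "pred"})` reports that `media_upload` is missing (the model ID is a placeholder), which is the kind of early rejection a task-aware panel would give before any remote call is made.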
Here are some examples of task-based flows:

- Text generation: select a prompt column, choose a model, set max tokens and temperature, get generated text in a result column
- Summarization: select a text column, choose a summarization model, emit the summary
- Image classification: upload or reference an image, choose an image model, output labels or captions

A task-aware configuration layout could be:

1. Task
2. Model
3. Input source
4. Task-specific options
5. Result column

The design should include a few key behaviors:

- Model discovery and search from the Hugging Face Hub
- Backend proxying for Hugging Face API calls so the UI does not talk to Hugging Face directly
- API token support, with token fallback from environment or deployment config
- Caching of model and task metadata to reduce repeated remote calls
- Task-based validation so invalid combinations are rejected early, for example requiring an image upload for image-only tasks

### Affected Area

Workflow Engine (Amber), Workflow UI, Storage / Metadata, Deployment / Infrastructure

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
