PG1204 opened a new issue, #5134:
URL: https://github.com/apache/texera/issues/5134
### Task Summary
### Feature Summary
The HuggingFace inference operator (#5041) needs a small backend REST
surface to support its frontend UI. Without these endpoints, the operator's
property panel can't populate the model picker, audio-input tasks can't accept
user uploads, and inference responses that link to remote media (images, audio,
video on HF / Fal / Replicate CDNs) can't be previewed in the workspace because
the browser blocks those cross-origin requests (CORS).
This issue covers introducing `HuggingFaceModelResource` and registering it
on the web application. It is the backend foundation that subsequent child
issues — the operator class, the property panel, result-panel media rendering,
and developer docs — depend on.
Concretely, publishing these endpoints would enable:
- The operator UI's model picker (browse HF models per pipeline task; search
by name).
- Audio uploads for tasks like automatic speech recognition and audio
classification, with the uploaded clip streamable back to the browser for
preview.
- Inline display of HF inference response media in the result panel, by
proxying allowlisted remote URLs through Texera so the browser fetches them
from Texera's own origin and its CORS restrictions no longer apply.
### Proposed Solution or Design
1. Add `HuggingFaceModelResource` (Jersey REST resource,
   `@Path("/huggingface")`) exposing five endpoints:
   - `GET /api/huggingface/models?task=…[&search=…]`: browse or search HF
     models for a pipeline task.
   - `GET /api/huggingface/tasks`: list HF pipeline tags with hosted
     inference.
   - `POST /api/huggingface/upload-audio?filename=…`: stream-upload an
     audio file.
   - `GET /api/huggingface/audio-preview?path=…`: stream an uploaded audio
     file back to the browser.
   - `GET /api/huggingface/media-proxy?url=…`: proxy an allowlisted remote
     media URL.
2. Register the resource in `TexeraWebApplication`.
3. Design constraints baked into the resource:
   - **Token sourcing:** the operator panel forwards the user's HF token in the
     `X-HF-Token` request header; requests without a token fall back to
     anonymous browsing. No server-side env-var token.
   - **Caching:** a Guava `Cache` bounded by size and a 1 h TTL backs the
     browse and tasks endpoints; requests that carry a user token bypass the
     cache so that private-model visibility stays per-user.
   - **Streaming upload:** `InputStream`-based, with a 25 MiB cap and an
     extension allowlist (`.wav`, `.mp3`, `.flac`, …); non-audio extensions are
     rejected before any disk write.
   - **SSRF protection:** a host allowlist on `/media-proxy` (`huggingface.co`,
     `fal.media`, `replicate.delivery`, `replicate.com`); subdomains are
     matched on a leading-dot suffix (e.g. `.huggingface.co`) to guard against
     lookalike domains.
   - **Bounded fan-out:** the per-task probe in `/tasks` runs on a dedicated
     `ForkJoinPool(4)` instead of the JVM common pool, and logs 429/503
     responses at WARN.
   - **Truncation visibility:** browse and search responses carry an
     `X-Texera-Truncated: true` header when a server-side cap is hit
     (`MAX_PAGES=50` for browse, `SEARCH_LIMIT=100` for search).
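
To make the streaming-upload constraint above concrete, here is a minimal plain-Java sketch of the two checks it names: the extension allowlist and the 25 MiB cap. The class `AudioUploadGuard`, its method names, and the 8 KiB buffer size are hypothetical and chosen only for illustration. Only the three extensions spelled out in this issue are listed, and the resource in #5124 may structure this differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: class and method names are illustrative only.
final class AudioUploadGuard {
    // Only the extensions this issue spells out; the real allowlist has more entries.
    static final List<String> ALLOWED_EXTENSIONS = List.of(".wav", ".mp3", ".flac");
    static final long MAX_UPLOAD_BYTES = 25L * 1024 * 1024; // 25 MiB

    private AudioUploadGuard() {}

    // Case-insensitive extension check, meant to run before any file is created.
    static boolean hasAllowedExtension(String filename) {
        if (filename == null) {
            return false;
        }
        String lower = filename.toLowerCase(Locale.ROOT);
        return ALLOWED_EXTENSIONS.stream().anyMatch(lower::endsWith);
    }

    // Copies the request body chunk by chunk and aborts once the running total
    // passes maxBytes, so an oversized upload is never buffered whole in memory.
    // The chunk that crosses the limit is not written; earlier chunks are, so
    // the caller is expected to delete the partial file on failure.
    static long copyWithCap(InputStream in, OutputStream out, long maxBytes)
            throws IOException {
        byte[] buffer = new byte[8192]; // assumed chunk size
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
            if (total > maxBytes) {
                throw new IOException("upload exceeds " + maxBytes + " bytes");
            }
            out.write(buffer, 0, read);
        }
        return total;
    }
}
```

A resource method could call `hasAllowedExtension` on the `filename` query parameter first, then pass the request `InputStream` and a file `OutputStream` to `copyWithCap` with `MAX_UPLOAD_BYTES`, mapping the `IOException` to an HTTP error response.
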
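
The allowlist check on `/media-proxy` is easy to get subtly wrong, so the following is a minimal sketch of how a leading-dot suffix match could look in plain Java. The class `MediaProxyGuard` and the method `isAllowedUrl` are hypothetical names. Restricting the proxy to `https` is an added assumption that this issue does not state, and the code in #5124 may differ.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: class and method names are illustrative only.
final class MediaProxyGuard {
    // The four hosts named in this issue.
    private static final List<String> ALLOWED_HOSTS =
        List.of("huggingface.co", "fal.media", "replicate.delivery", "replicate.com");

    private MediaProxyGuard() {}

    static boolean isAllowedUrl(String rawUrl) {
        if (rawUrl == null) {
            return false;
        }
        final URI uri;
        try {
            uri = new URI(rawUrl);
        } catch (URISyntaxException e) {
            return false; // unparseable input is never proxied
        }
        // Assumption: only https URLs are proxied (not stated in the issue).
        if (!"https".equalsIgnoreCase(uri.getScheme())) {
            return false;
        }
        // getHost() excludes any user-info part, so "allowed.host@evil.host"
        // is judged by its real host.
        String host = uri.getHost();
        if (host == null) {
            return false;
        }
        String lower = host.toLowerCase(Locale.ROOT);
        for (String allowed : ALLOWED_HOSTS) {
            // Exact match, or a subdomain ending in "." + allowed. A bare
            // endsWith(allowed) would also accept "evilhuggingface.co".
            if (lower.equals(allowed) || lower.endsWith("." + allowed)) {
                return true;
            }
        }
        return false;
    }
}
```

With this shape, `huggingface.co` and any subdomain of it pass, while `evilhuggingface.co` and `huggingface.co.evil.example` are rejected because neither equals an allowlisted host nor ends with a dot followed by one.
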
References:
- Parent issue: #5041
- Pull request: #5124
- HF Hub API: https://huggingface.co/docs/hub/api
### Impact / Priority
(P2) Medium – required for the HuggingFace inference operator (#5041) to
function. Does not affect existing functionality.
### Affected Area
Workflow Engine (Amber) — backend REST layer.
### Task Type
- [ ] Refactor / Cleanup
- [ ] DevOps / Deployment / CI
- [ ] Testing / QA
- [ ] Documentation
- [ ] Performance
- [x] Other