vLLM skill runtime for sovereign and air-gapped pilots [airflow-steward]

via GitHub Tue, 26 May 2026 08:37:28 -0700


potiuk opened a new issue, #315:
URL: https://github.com/apache/airflow-steward/issues/315


   **Goal:** Define and land a skill runtime that works against **local LLMs** 
(Ollama, llama.cpp, vLLM, LM Studio, Jan). Grounds: [RFC-AI-0004 Principle 3 — 
Vendor neutrality](../tree/main/docs/rfcs/RFC-AI-0004.md), which explicitly 
names "a local-Ollama wrapper" as a target runtime, and RFC-AI-0004's "no 
cloud-only skills" consequence.
   
   **Why local LLM:**
   
   - **Sovereign deployments** — projects bound by data-residency rules (EU, 
government, defence-orbit OSS) that can't send issue / mail content to a US 
cloud API
   - **Air-gapped triage** — security teams that need to assess 
`<security-list>` traffic without an internet connection
   - **Cost ceiling** — projects unable to commit to a per-token cloud bill, 
but happy to run a 70B model on shared infra
   
   **What "parity" means** (calibrated for what local LLMs can realistically 
do):
   
   - Skills under `.claude/skills/<name>/SKILL.md` are invokable via a 
local-LLM-backed agent loop
   - The `tools/*` bridges are reachable — these are language-agnostic CLI 
calls, no LLM context needed
   - Sandbox / HITL primitives map to the same `bubblewrap` baseline; the 
local-LLM runtime is the new piece
   - **Realistic scoping:** some skills (multi-step reasoning over long 
contexts — `security-issue-triage`, `pr-management-code-review`) may need a 
high-capability local model (Llama 70B, Qwen 72B, DeepSeek). Document the 
per-skill model-size floor so adopters can pick a model that actually works.
   
   **Suggested approach:**
   
   - Pick an agent-loop frontend that supports local LLMs natively — 
candidates: `aider --model ollama/...`, `goose` with a local backend, 
`continue.dev`, a fresh thin wrapper around `llama.cpp`'s OpenAI-compatible API
   - Document the model-size floor empirically (per skill, against the existing 
eval suite at [`tools/skill-evals/`](../tree/main/tools/skill-evals/))
   - Land a `setup-local-llm` skill family alongside `setup-isolated-*` for the 
runtime-side install
   
   **Reference:**
   
   - RFC-AI-0004: 
[`docs/rfcs/RFC-AI-0004.md`](../tree/main/docs/rfcs/RFC-AI-0004.md)
   - Existing skill shape: [`.claude/skills/`](../tree/main/.claude/skills/)
   - Eval suite (for model-floor calibration): 
[`tools/skill-evals/`](../tree/main/tools/skill-evals/)
   - Ollama: https://ollama.com
   - llama.cpp: https://github.com/ggml-org/llama.cpp (OpenAI-compatible server)
   - vLLM: https://github.com/vllm-project/vllm
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[I] feat(adapter/local-llm): Ollama / llama.cpp / vLLM skill runtime for sovereign and air-gapped pilots [airflow-steward]

Reply via email to