Package: llama.cpp
Version: 8941+dfsg-1
Tags: patch
The current edition of SimpleChat has problems with models that send
thinking strings in streaming mode. This patch fixes the problem:
diff --git a/debian/simplechat-legacy-ui/simplechat.js b/debian/simplechat-legacy-ui/simplechat.js
index 690968b..f74feb7 100644
--- a/debian/simplechat-legacy-ui/simplechat.js
+++ b/debian/simplechat-legacy-ui/simplechat.js
@@ -149,7 +149,9 @@ class SimpleChat {
* @param {string} content
*/
append_response(content) {
- this.latestResponse += content;
+ if (content != null && typeof content === 'string') {
+ this.latestResponse += content;
+ }
}
/**
@@ -304,14 +306,19 @@ class SimpleChat {
response_extract_stream(respBody, apiEP) {
let assistant = "";
if (apiEP == ApiEP.Type.Chat) {
- if (respBody["choices"][0]["finish_reason"] !== "stop") {
- assistant = respBody["choices"][0]["delta"]["content"];
+ const choice = respBody["choices"]?.[0];
+ if (!choice || !choice.delta || choice["finish_reason"] === "stop") {
+ return assistant;
+ }
+ if (choice.delta.reasoning_content != null) {
+ assistant = choice.delta.reasoning_content;
+ } else if (choice.delta.content != null) {
+ assistant = choice.delta.content;
}
} else {
- try {
- assistant = respBody["choices"][0]["text"];
- } catch {
- assistant = respBody["content"];
+ const text = respBody["choices"]?.[0]?.["text"] ?? respBody["content"];
+ if (text != null) {
+ assistant = String(text);
}
}
return assistant;
@@ -410,7 +417,12 @@ class SimpleChat {
if (curLine.trim() === "[DONE]") {
break;
}
- let curJson = JSON.parse(curLine);
+ let curJson;
+ try {
+ curJson = JSON.parse(curLine);
+ } catch(e) {
+ continue;
+ }
console.debug("DBUG:SC:PART:Json:", curJson);
this.append_response(this.response_extract_stream(curJson, apiEP));
}
I asked a bullshit generator (aka OpenCode using my llama.cpp instance
with the Qwen 3.6 model) to investigate and come up with a fix, and the
above tested patch was its solution. I further asked it to write a
summary of the task at hand and its findings, which seem to match the
patch and confirm my understanding of what is going wrong in SimpleChat.
Here is the provided summary (I wrapped some long lines for better
formatting and reading):
# Task: Fix simplechat streaming "null"/"undefined" output + show reasoning content
## Problem
The SimpleChat web interface of a local llama.cpp instance shows the
strings `null` and `undefined` in chat responses before the proper text
appears. Even simple requests like "Where is London?" are affected.
## Root Cause (confirmed via live API probe)
Probing the live API with curl revealed the SSE stream structure:
```
data: {"choices":[{"delta":{"role":"assistant","content":null},"finish_reason":null}]}
data: {"choices":[{"delta":{"reasoning_content":"'s"},"finish_reason":null}]}
data: {"choices":[{"delta":{"reasoning_content":" a"},"finish_reason":null}]}
... (many reasoning chunks) ...
data: {"choices":[{"delta":{"content":"London is"},"finish_reason":null}]}
```
**Three problems in `response_extract_stream()` at simplechat.js:304:**
1. **First chunk has `"content": null`** — old code did `assistant =
delta["content"]` → assigned literal JS `null`, which stringifies to
`"null"` when concatenated into `latestResponse`.
2. **Reasoning chunks have NO `content` key at all** — they only have
`reasoning_content`. Accessing `delta["content"]` yields `undefined`,
which stringifies to `"undefined"`. This produces a long stream of
"undefined" while the model is thinking.
3. **Completion (non-chat) path** used try/catch with no null guard on
fallback — could also produce null/undefined values.
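
The first two problems can be reproduced outside SimpleChat. The sketch below is only an illustration: `oldExtract` is a hypothetical stand-in that mirrors the removed lines of the chat path, and the chunk objects are hand-written to match the shape seen in the SSE probe above.

```javascript
// Hypothetical stand-in for the unpatched chat path (not the upstream function).
function oldExtract(respBody) {
    let assistant = "";
    if (respBody["choices"][0]["finish_reason"] !== "stop") {
        // Yields null for the first chunk and undefined for reasoning chunks.
        assistant = respBody["choices"][0]["delta"]["content"];
    }
    return assistant;
}

// Hand-written chunks shaped like the probed SSE stream.
const chunks = [
    { choices: [{ delta: { role: "assistant", content: null }, finish_reason: null }] },
    { choices: [{ delta: { reasoning_content: "'s" }, finish_reason: null }] },
    { choices: [{ delta: { content: "London is" }, finish_reason: null }] },
];

let latestResponse = "";
for (const chunk of chunks) {
    // String concatenation coerces null and undefined to "null" and "undefined".
    latestResponse += oldExtract(chunk);
}
console.log(latestResponse); // nullundefinedLondon is
```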
## Fixes Applied
### 1. `response_extract_stream()` — chat path (line 306-325)
Now guards against null delta, skips stop chunks early, and handles both
`reasoning_content` AND `content`:
```javascript
const choice = respBody["choices"]?.[0];
if (!choice || !choice.delta || choice["finish_reason"] === "stop") {
return assistant;
}
if (choice.delta.reasoning_content != null) {
assistant = choice.delta.reasoning_content;
} else if (choice.delta.content != null) {
assistant = choice.delta.content;
}
```
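
To see what the patched chat path returns for each kind of chunk, the same logic can be wrapped in a free function and run over sample data. This is a sketch under assumptions: `extractChatDelta` is a hypothetical name, and the chunks are hand-written rather than captured from a live server.

```javascript
// Hypothetical free-function version of the patched chat path.
function extractChatDelta(respBody) {
    let assistant = "";
    const choice = respBody["choices"]?.[0];
    // Missing choice, missing delta, or a stop chunk: contribute nothing.
    if (!choice || !choice.delta || choice["finish_reason"] === "stop") {
        return assistant;
    }
    if (choice.delta.reasoning_content != null) {
        assistant = choice.delta.reasoning_content;
    } else if (choice.delta.content != null) {
        assistant = choice.delta.content;
    }
    return assistant;
}

// Hand-written chunks covering each branch.
const sampleChunks = [
    { choices: [{ delta: { role: "assistant", content: null }, finish_reason: null }] },
    { choices: [{ delta: { reasoning_content: "'s" }, finish_reason: null }] },
    { choices: [{ delta: { reasoning_content: " a" }, finish_reason: null }] },
    { choices: [{ delta: { content: "London is" }, finish_reason: null }] },
    { choices: [{ delta: {}, finish_reason: "stop" }] },
];

const pieces = sampleChunks.map((chunk) => extractChatDelta(chunk));
console.log(pieces.join("")); // 's aLondon is
```

Note that reasoning text and answer text end up in the same string with no separator, so the reasoning is shown but not visually distinguished from the answer.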
### 2. `response_extract_stream()` — completion path (line 318-322)
Uses optional chaining + null guard:
```javascript
const text = respBody["choices"]?.[0]?.["text"] ?? respBody["content"];
if (text != null) {
assistant = String(text);
}
```
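
A small sketch of how this fallback behaves for different response shapes follows. `extractCompletionText` is a hypothetical name and the response objects are made up for illustration.

```javascript
// Hypothetical free-function version of the patched completion path.
function extractCompletionText(respBody) {
    let assistant = "";
    // Prefer choices[0].text; fall back to content when it is null or undefined.
    const text = respBody["choices"]?.[0]?.["text"] ?? respBody["content"];
    if (text != null) {
        assistant = String(text);
    }
    return assistant;
}

console.log(extractCompletionText({ choices: [{ text: "Paris" }] })); // Paris
console.log(extractCompletionText({ content: "Oslo" }));              // Oslo
console.log(extractCompletionText({}) === "");                        // true
```

One difference from the old try/catch version is visible in the code: the fallback to `content` now also applies when `choices[0].text` exists but is null or undefined, not only when the lookup throws.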
### 3. `append_response()` — defensive guard (line 151-154)
Prevents any non-string from being concatenated:
```javascript
if (content != null && typeof content === 'string') {
this.latestResponse += content;
}
```
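
The effect of the guard can be checked in isolation. The class below is a hypothetical stand-in holding only the `latestResponse` field and the patched method; it is not the real `SimpleChat` class.

```javascript
// Hypothetical minimal holder for the patched append_response().
class ResponseBuffer {
    constructor() {
        this.latestResponse = "";
    }

    append_response(content) {
        // Only real strings are appended; null, undefined and numbers are dropped.
        if (content != null && typeof content === 'string') {
            this.latestResponse += content;
        }
    }
}

const buf = new ResponseBuffer();
for (const piece of ["London", null, undefined, 42, " is"]) {
    buf.append_response(piece);
}
console.log(buf.latestResponse); // London is
```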
### 4. SSE JSON parsing — try/catch for malformed lines (line 417-420)
Gracefully skips unparseable SSE data lines instead of crashing.
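
The skip-on-error behaviour can be sketched as follows. This is not the SimpleChat read loop: `parseSseLines` is a hypothetical helper, and the way the `data:` prefix is stripped here is an assumption made so the example is self-contained.

```javascript
// Hypothetical helper: parse SSE data lines, skipping any that are not valid JSON.
function parseSseLines(text) {
    const parsed = [];
    for (const rawLine of text.split("\n")) {
        let curLine = rawLine.trim();
        if (curLine.length === 0) {
            continue; // blank separator line
        }
        if (curLine.startsWith("data:")) {
            curLine = curLine.substring(5).trim(); // assumed prefix handling
        }
        if (curLine === "[DONE]") {
            break;
        }
        let curJson;
        try {
            curJson = JSON.parse(curLine);
        } catch (e) {
            continue; // malformed line: skip it instead of aborting the stream
        }
        parsed.push(curJson);
    }
    return parsed;
}

const stream = [
    'data: {"choices":[{"delta":{"content":"London"},"finish_reason":null}]}',
    '',
    'data: {"choices":[{"delta":{"content":" is"', // truncated on purpose
    'data: {"choices":[{"delta":{"content":" big"},"finish_reason":null}]}',
    'data: [DONE]',
    'data: {"ignored":true}',
].join("\n");

const events = parseSseLines(stream);
console.log(events.length); // 2
```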
## Files Modified
- `debian/simplechat-legacy-ui/simplechat.js`
--
Happy hacking
Petter Reinholdtsen