[FFmpeg-devel] [PR] avfilter/af_whisper.c: Set split_on_word (PR #22553)

WyattBlue via ffmpeg-devel Thu, 19 Mar 2026 15:06:35 -0700

PR #22553 opened by WyattBlue
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/22553
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/22553.patch


Set `split_on_word` whenever `max_len` is in effect, so that segments are
split at word boundaries instead of token boundaries. Token-level splitting
can break words like "don't" and proper nouns in the middle.
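
To illustrate the intended behaviour, here is a minimal sketch of word-boundary splitting under a length limit. It is not the whisper.cpp implementation and does not model its tokenizer; `segment_len_by_word` is a hypothetical helper, and it assumes words are separated by single ASCII spaces and that the input does not start with a space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of FFmpeg or whisper.cpp.
 * Returns how many bytes of `text` belong in the next segment so that the
 * segment is at most `max_len` bytes and ends on a word boundary.
 * A single word longer than `max_len` is kept whole rather than cut.
 * Assumes `text` does not start with a space. */
static size_t segment_len_by_word(const char *text, size_t max_len)
{
    size_t len, cut;

    assert(text != NULL);
    len = strlen(text);

    /* max_len == 0 means "no limit", mirroring the filter's default. */
    if (max_len == 0 || len <= max_len)
        return len;

    /* Walk back from the limit to the nearest space. */
    cut = max_len;
    while (cut > 0 && text[cut] != ' ')
        cut--;

    if (cut == 0) {
        /* The first word alone exceeds the limit: keep it whole. */
        const char *sp = strchr(text, ' ');
        return sp ? (size_t)(sp - text) : len;
    }
    return cut;
}
```

With a limit of 4, the text "I don't know" first yields "I"; the remaining "don't know" then yields "don't" as a whole word, even though it is longer than the limit, instead of being cut inside the contraction.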


From d1589c69b25810d828d0bec969c5234e6cc8b050 Mon Sep 17 00:00:00 2001
From: WyattBlue <[email protected]>
Date: Thu, 19 Mar 2026 18:01:45 -0400
Subject: [PATCH] avfilter/af_whisper.c: Set split_on_word

This prevents `max_len` splitting via tokens, which splits words
like "don't" and proper nouns inappropriately.
---
 doc/filters.texi         | 4 ++--
 libavfilter/af_whisper.c | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index 569ff516d4..3b9fd893e0 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -7775,8 +7775,8 @@ Default value: @code{"text"}
 
 @item max_len
 Maximum segment length in characters. When set to a value greater than 0,
-transcription segments will be split to not exceed this length. This is useful
-for generating subtitles with shorter lines.
+transcription segments will be split by word to not exceed this length. This is
+useful for generating subtitles with shorter lines.
 Default value: @code{"0"}
 
 @item vad_model
diff --git a/libavfilter/af_whisper.c b/libavfilter/af_whisper.c
index 299a8bca7a..e5723b30cf 100644
--- a/libavfilter/af_whisper.c
+++ b/libavfilter/af_whisper.c
@@ -207,6 +207,7 @@ static void run_transcription(AVFilterContext *ctx, AVFrame *frame, int samples)
     params.print_timestamps = 0;
     params.max_len = wctx->max_len;
     params.token_timestamps = (wctx->max_len > 0);
+    params.split_on_word = (wctx->max_len > 0);
 
     if (whisper_full(wctx->ctx_wsp, params, wctx->audio_buffer, samples) != 0) {
         av_log(ctx, AV_LOG_ERROR, "Failed to process audio with whisper.cpp\n");
-- 
2.52.0

_______________________________________________
ffmpeg-devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
