Oops!
I forgot to concatenate a space character when moving a wrapped third line 
onto the end of the second line. In the fixup_dialog subroutine, change the line:
$wrapped_text = $1 . $2;
to:
$wrapped_text = $1 . " " . $2;
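
With that change applied, the subroutine would read roughly as follows. This is a sketch assembled from the script quoted below; "use warnings" and the demo call at the bottom (using entry 351 from Otto's example) are illustrative additions, not part of the posted filter.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Wrap;
use POSIX qw/ceil/;

# fixup_dialog from the quoted script, with the missing space restored
sub fixup_dialog {
    my ($line1, $line2) = @_;

    # trim trailing white space (including the newlines)
    $line1 =~ s/\s+$//;
    $line2 =~ s/\s+$//;

    # half the combined length, rounded up, plus one; Text::Wrap keeps
    # every line at most $columns - 1 characters long
    my $ideal_col_width = ceil((length($line1) + length($line2)) / 2) + 1;
    my $total_text = $line1 . " " . $line2 . "\n";

    # do not convert spaces back to tabs; apply the width constraint
    local $Text::Wrap::unexpand = 0;
    local $Text::Wrap::columns  = $ideal_col_width;
    my $wrapped_text = wrap('', '', $total_text);

    # if wrapping spilled onto a third line, append it to the second
    # line, separated by a single space
    if ($wrapped_text =~ m/(.+\n.+)\n(.+\n)/) {
        $wrapped_text = $1 . " " . $2;
    }
    return $wrapped_text;
}

# illustrative check: entry 351 from the original question
print fixup_dialog("not that likes and dislikes are\n", "your enemies\n");
```

Without the added space, the spilled word is glued onto the end of the second line ("dislikes are yourenemies").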

On Monday, February 17, 2025 at 2:06:15 PM UTC-8 GP wrote:

> Regular expressions aren't well suited to handle things like checking line 
> lengths and moving line contents based upon differences in those lengths.
>
> A better method is to use something like a text filter using a scripting 
> language that can check for things like text lengths and make text string 
> changes based upon runtime evaluations.
>
> Below is a Perl script text filter which takes as input a selection or 
> whole file of SRT-formatted text. It finds every SRT sequence entry 
> with two lines of dialog text and re-wraps those lines to more equal 
> line lengths, leaving the second line longer if necessary for proper 
> word wrapping.
>
> I've named it reformat_subtitle_text.pl and saved it in BBEdit's Text 
> Filters folder so it will be listed in BBEdit's Text Filters palette. If 
> desired, you can also set a keyboard shortcut for it.
>
> You'll probably want to enhance the reformatting logic in the fixup_dialog 
> subroutine to handle cases where simple two-line word-wrap reformatting 
> produces awkward results. For example, what appears to be two-person dialog 
> text like:
>
> - Shall I get you something, Micke?
> - No, I don't have time.
>
> or
>
> - Whose turn is it today?
> - Malin's, isn't it?
>
> with your simple word wrapping rule gets reformatted as:
>
> - Shall I get you something,
> Micke? - No, I don't have time.
>
> - Whose turn is it
> today? - Malin's, isn't it?
>
> In the SRT formatting rules I found, "-" has no defined markup rule, so 
> perhaps it is just an informal convention that people use to indicate 
> multiple people speaking.
>
> SRT formatting rules also allow simple markup annotations (e.g., bold: 
> <b> </b>), which make the length of the displayed text differ from the 
> length of a subtitle entry's raw dialog text. This script doesn't try to 
> deal with that complicating issue.
>
> reformat_subtitle_text.pl:
>
> #!/usr/bin/env perl
>
> use strict;
> use Text::Wrap;
> use POSIX qw/ceil/;
>
> my $subtitles = '';
>
> # regex to dissect one subtitle entry: 1) sequence number and time range,
> # 2) first dialog text line, and 3) second dialog text line
> my $seq_item_re = qr/(\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n)(.+\n)(.+\n)/;
>
> # read in all the input subtitle text
> $subtitles = do { local $/; <STDIN> };
>
> # extract each and all subtitle entries with two lines of dialog text
> # and replace them with reformatted version
> $subtitles =~ s/$seq_item_re/$1 . fixup_dialog($2, $3)/mge;
>
> # output the reformatted subtitles
> print $subtitles;
>
> # reformat two lines of dialog text to have more equal line lengths, with
> # line two the longer if necessary for proper word wrapping
>
> sub fixup_dialog {
>     my ($line1, $line2) = @_;
>     
> #   trim trailing white space
>     $line1 =~ s/\s+$//;
>     $line2 =~ s/\s+$//;
>     
> #   ideal column width for two lines of characters without word wrapping;
> #   with word wrapping, the second line is left the longer of the two
>     my $ideal_col_width = ceil((length($line1) + length($line2))/2) + 1;
>     my $total_text = $line1 . " " . $line2 . "\n";
>     
> #   locally set wrapping parameters: do not convert spaces back to tabs,
> #   and apply the column width constraint
>     local($Text::Wrap::unexpand) = 0;
>     local($Text::Wrap::columns) = $ideal_col_width;
>     my $wrapped_text = wrap('', '', $total_text);
>     
> #   if word wrapping creates third line move it to end of second line
>     if ( $wrapped_text =~ m/(.+\n.+)\n(.+\n)/){
>         $wrapped_text = $1 . $2;
>     }
>     return $wrapped_text;
> }
>
>
> On Sunday, February 16, 2025 at 3:37:16 AM UTC-8 Otto Munters wrote:
>
>> Is there a regex to divide the last two lines of each subtitle more 
>> evenly in the following example, so that both lines are about the same 
>> length, with preference given to the longer line being the 4th line?
>> Example:
>> 351
>> 00:18:23,120 --> 00:18:29,600
>> not that likes and dislikes are 
>> your enemies
>>
>> 352
>> 00:18:29,600 --> 00:18:31,960
>> because they end up serving
>> the society.
>>
>> Thanks for your kind help! 
>> Otto
>>
>
