Oops!
I forgot to concatenate a space character when moving a wrapped third line
to the end of the second. In the fixup_dialog subroutine, change the line:
$wrapped_text = $1 . $2;
to:
$wrapped_text = $1 . " " . $2;
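
For anyone who wants to see the effect of the change in isolation, here is a minimal, self-contained sketch. It copies the fixup_dialog subroutine from the script quoted below, with the correction applied, and runs it on entry 351 from Otto's example. The $line1, $line2, and $result variables exist only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Wrap;
use POSIX qw/ceil/;

# Corrected copy of fixup_dialog from the quoted script.
sub fixup_dialog {
    my ($line1, $line2) = @_;

    # trim trailing white space
    $line1 =~ s/\s+$//;
    $line2 =~ s/\s+$//;

    my $ideal_col_width = ceil((length($line1) + length($line2)) / 2) + 1;
    my $total_text = $line1 . " " . $line2 . "\n";

    local($Text::Wrap::unexpand) = 0;
    local($Text::Wrap::columns)  = $ideal_col_width;
    my $wrapped_text = wrap('', '', $total_text);

    # A third line gets appended to the second, now with a joining space.
    if ($wrapped_text =~ m/(.+\n.+)\n(.+\n)/) {
        $wrapped_text = $1 . " " . $2;
    }
    return $wrapped_text;
}

# Illustration only: entry 351 from the original question.
my $line1  = "not that likes and dislikes are\n";
my $line2  = "your enemies\n";
my $result = fixup_dialog($line1, $line2);
print $result;
```

Working the widths by hand (31 + 12 characters, so a column setting of 23), Text::Wrap first produces three lines here, and the corrected line joins the last two as "dislikes are your enemies" rather than "dislikes are yourenemies".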
On Monday, February 17, 2025 at 2:06:15 PM UTC-8 GP wrote:
> Regular expressions aren't well suited to handle things like checking line
> lengths and moving line contents based upon differences in those lengths.
>
> A better method is to use something like a text filter using a scripting
> language that can check for things like text lengths and make text string
> changes based upon runtime evaluations.
>
> Below is a Perl script text filter which takes as input a selection or
> whole file of SRT formatted text. It finds every SRT sequence entry with
> two lines of dialog text and re-wraps those lines to a more equal length,
> leaving the second line longer if necessary for proper word wrapping.
>
> I've named it reformat_subtitle_text.pl and saved it in BBEdit's Text
> Filters folder so it will be listed in BBEdit's Text Filters palette. If
> desired you can also set a keyboard shortcut for it.
>
> You'll probably want to enhance the reformatting logic in the fixup_dialog
> subroutine to handle cases where simple two line word wrap reformatting
> produces awkward results. For example, what appears to be two person dialog
> text like:
>
> - Shall I get you something, Micke?
> - No, I don't have time.
>
> or
>
> - Whose turn is it today?
> - Malin's, isn't it?
>
> with your simple word wrapping rule gets reformatted as:
>
> - Shall I get you something,
> Micke? - No, I don't have time.
>
> - Whose turn is it
> today? - Malin's, isn't it?
>
> In the SRT formatting rules I found, "-" has no defined markup rule, so
> perhaps it is just an informal convention that people use to indicate
> multiple people speaking.
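
One possible enhancement for this case, offered as a sketch rather than a tested rule: leave an entry untouched when both dialog lines start with a hyphen. The helper name is_two_speaker_dialog and the hyphen test are assumptions for illustration; nothing in the SRT rules defines this convention.

```perl
use strict;
use warnings;

# Hypothetical helper: true when both lines look like "- text",
# the informal two-speaker convention described above.
sub is_two_speaker_dialog {
    my ($line1, $line2) = @_;
    return ($line1 =~ /^\s*-\s*\S/ && $line2 =~ /^\s*-\s*\S/) ? 1 : 0;
}

# Illustration only: one two-speaker entry and one ordinary entry.
my @pairs = (
    [ "- Whose turn is it today?\n",   "- Malin's, isn't it?\n" ],
    [ "because they end up serving\n", "the society.\n" ],
);
for my $pair (@pairs) {
    my $verdict = is_two_speaker_dialog(@$pair) ? "keep as is" : "rewrap";
    print "$verdict: $pair->[0]";
}
```

In fixup_dialog, a guard such as `return $line1 . $line2 if is_two_speaker_dialog($line1, $line2);` placed before the white space trimming would skip these entries. A line that merely begins with a minus sign would also match, so the test may need tightening for real files.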
>
> SRT formatting rules also allow simple markup annotations (e.g., bold:
> <b> </b>), which make the length of the displayed text differ from the
> length of a subtitle entry's raw dialog text. This script doesn't try to
> deal with that complicating issue.
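
A rough way to approach this, as a sketch only: measure the lines with the markup stripped. The display_length helper below is hypothetical, and the tag list (b, i, u, font) is an assumption about which annotations appear in a given file.

```perl
use strict;
use warnings;

# Hypothetical helper: length of a line as displayed, ignoring simple
# SRT-style tags. The tag list is an assumption, not a complete rule set.
sub display_length {
    my ($text) = @_;
    $text =~ s{</?(?:b|i|u|font)\b[^>]*>}{}gi;   # drop opening and closing tags
    $text =~ s/\s+$//;                            # ignore trailing white space
    return length $text;
}

# Illustration only: compare raw and displayed lengths of a tagged line.
my $raw = "<b>your enemies</b>\n";
printf "raw: %d, displayed: %d\n", length($raw) - 1, display_length($raw);
```

Using display_length in place of length when computing $ideal_col_width would only fix the width estimate. Text::Wrap itself would still count the tag characters, so wrapping tagged text correctly needs more work than this.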
>
> reformat_subtitle_text.pl:
>
> #!/usr/bin/env perl
>
> use strict;
> use Text::Wrap;
> use POSIX qw/ceil/;
>
> my $subtitles = '';
>
> # regex to dissect one subtitle entry: 1) sequence number and time range,
> # 2) first dialog text line, and 3) second dialog text line
> my $seq_item_re = qr/(\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n)(.+\n)(.+\n)/;
>
> # read in all the input subtitle text
> $subtitles = do { local $/; <STDIN> };
>
> # extract each and all subtitle entries with two lines of dialog text
> # and replace them with reformatted version
> $subtitles =~ s/$seq_item_re/$1 . fixup_dialog($2, $3)/mge;
>
> # output the reformatted subtitles
> print $subtitles;
>
> # reformat two lines of dialog text to have more equal line lengths, with
> # line two the longer if necessary for proper word wrapping
>
> sub fixup_dialog {
> my ($line1, $line2) = @_;
>
> # trim trailing white space
> $line1 =~ s/\s+$//;
> $line2 =~ s/\s+$//;
>
> # ideal column width for two lines of characters without word wrapping;
> # with word wrapping, this leaves the second line the longer of the two
> my $ideal_col_width = ceil((length($line1) + length($line2))/2) + 1;
> my $total_text = $line1 . " " . $line2 . "\n";
>
> # locally set wrapping parameters: don't convert spaces back to tabs, and
> # constrain the column width
> local($Text::Wrap::unexpand) = 0;
> local($Text::Wrap::columns) = $ideal_col_width;
> my $wrapped_text = wrap('', '', $total_text);
>
> # if word wrapping creates third line move it to end of second line
> if ( $wrapped_text =~ m/(.+\n.+)\n(.+\n)/){
> $wrapped_text = $1 . $2;
> }
> return $wrapped_text;
> }
>
>
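
To check what the $seq_item_re pattern captures, here is a small sketch that applies it to entry 352 from the question below. The pattern is written on a single line, which it needs to be for the literal space after "-->" to match. The $entry and @parts variables are only for illustration.

```perl
use strict;
use warnings;

# Same pattern as in the script: 1) sequence number and time range,
# 2) first dialog line, 3) second dialog line.
my $seq_item_re = qr/(\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n)(.+\n)(.+\n)/;

# Illustration only: entry 352 from the original question.
my $entry = "352\n00:18:29,600 --> 00:18:31,960\nbecause they end up serving\nthe society.\n\n";

my @parts;
if ($entry =~ $seq_item_re) {
    @parts = ($1, $2, $3);
    print "header: $parts[0]";
    print "line 1: $parts[1]";
    print "line 2: $parts[2]";
}
```

Each captured group keeps its trailing newline, which is why fixup_dialog trims white space before measuring the line lengths.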
> On Sunday, February 16, 2025 at 3:37:16 AM UTC-8 Otto Munters wrote:
>
>> Is there a regex to divide the last two lines of each subtitle more
>> evenly in the following example, so that both sentences are about the same
>> length, with preference given to the longer sentence on the 4th line?
>> Example:
>> 351
>> 00:18:23,120 --> 00:18:29,600
>> not that likes and dislikes are
>> your enemies
>>
>> 352
>> 00:18:29,600 --> 00:18:31,960
>> because they end up serving
>> the society.
>>
>> Thanks for your kind help!
>> Otto
>>
>
--
This is the BBEdit Talk public discussion group. If you have a feature request
or believe that the application isn't working correctly, please email
"[email protected]" rather than posting here. Follow @bbedit on Mastodon:
<https://mastodon.social/@bbedit>