Thank you for your help, Mark and GP.

I succeeded by making a Text Factory that runs a series of grep replacements in sequence.
search: (?=^.{82}$)(.{29,42}\b)(.*)
replace: \1\n\2
The same grep is then repeated for each shorter line length, starting with (?=^.{81}$)(.{29,42}\b)(.*)
and ending with (?=^.{43}$)(.{10,24}\b)(.*),
so that every line 43 to 82 characters long is split into two lines of
approximately equal length.
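
For readers who want to try the idea outside BBEdit, here is a minimal Python sketch of the same rules. It is an illustration under stated assumptions, not the Text Factory itself: `RULES` and `split_line` are hypothetical names, and only the two rules quoted in full above (lengths 82 and 43) are listed, because the quantifier ranges for the lengths in between are not spelled out in this message.

```python
import re

# (line length, min first-piece length, max first-piece length)
# Only the two rules quoted in the post; the rules for lengths 81..44
# are not given there, so they are deliberately left out.
RULES = [
    (82, 29, 42),
    (43, 10, 24),
]


def split_line(line, rules=RULES):
    """Split one line in two using the first rule whose length matches.

    Returns a list with two pieces when a rule applies, else [line].
    """
    for length, lo, hi in rules:
        # Same shape as the grep pattern: the lookahead pins the total
        # line length, the greedy group ends at the last word boundary
        # inside the allowed range.
        pattern = r"(?=^.{%d}$)(.{%d,%d}\b)(.*)" % (length, lo, hi)
        match = re.match(pattern, line)
        if match:
            return [match.group(1), match.group(2)]
    return [line]


if __name__ == "__main__":
    for piece in split_line("because they end up serving the society now"):
        print(repr(piece))
```

This sketch applies at most one rule per line, so a piece produced by one split is never split again. In a chain of whole-document greps, a long second piece could in principle be matched by a later pattern. The first piece also keeps a trailing space when the break falls just before a word, which is presumably what the \1\n\2 replacement does as well.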

Best regards, Otto 
On Monday, February 17, 2025 at 23:22:36 UTC+1, GP wrote:

> Oops!
> I forgot to concatenate a space character when fixing up a third 
> word-wrapped line. In the fixup_dialog subroutine, change the line:
> $wrapped_text = $1 . $2;
> to:
> $wrapped_text = $1 . " " . $2;
>
> On Monday, February 17, 2025 at 2:06:15 PM UTC-8 GP wrote:
>
>> Regular expressions aren't well suited to handle things like checking 
>> line lengths and moving line contents based upon differences in those 
>> lengths.
>>
>> A better method is to use something like a text filter using a scripting 
>> language that can check for things like text lengths and make text string 
>> changes based upon runtime evaluations.
>>
>> Below is a Perl script text filter that takes as input a selection or a 
>> whole file of SRT-formatted text. It finds every SRT sequence entry with 
>> two lines of dialog text and re-wraps those lines to more nearly equal 
>> lengths, leaving the second line the longer one if necessary for proper 
>> word wrapping.
>>
>> I've named it reformat_subtitle_text.pl and saved it in BBEdit's Text 
>> Filters folder so it will be listed in BBEdit's Text Filters palette. If 
>> desired you can also set a keyboard shortcut for it.
>>
>> You'll probably want to enhance the reformatting logic in the 
>> fixup_dialog subroutine to handle cases where simple two line word wrap 
>> reformatting produces awkward results. For example, what appears to be two 
>> person dialog text like:
>>
>> - Shall I get you something, Micke?
>> - No, I don't have time.
>>
>> or
>>
>> - Whose turn is it today?
>> - Malin's, isn't it?
>>
>> with your simple word wrapping rule gets reformatted as:
>>
>> - Shall I get you something,
>> Micke? - No, I don't have time.
>>
>> - Whose turn is it
>> today? - Malin's, isn't it?
>>
>> In the SRT formatting rules I found, "-" has no defined markup rule, so 
>> perhaps it is just an informal convention that people use to indicate 
>> multiple people speaking.
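
One way to guard against the awkward dialog case described above is to leave an entry alone when both lines start with a dash. The Python sketch below is only an illustration of that idea; `is_two_speaker_dialog` and `rebalance_unless_dialog` are hypothetical helpers, and treating a leading "-" as a speaker marker is an assumption based on the informal convention mentioned here, not on any SRT rule.

```python
def is_two_speaker_dialog(line1, line2):
    """Return True when both lines begin with a dash.

    Assumption: a leading "-" marks a change of speaker. A line that
    merely starts with a minus sign would be misclassified.
    """
    return line1.lstrip().startswith("-") and line2.lstrip().startswith("-")


def rebalance_unless_dialog(line1, line2, rebalance):
    """Apply `rebalance` (any two-line re-wrapping function) only when
    the entry does not look like a two-speaker exchange."""
    if is_two_speaker_dialog(line1, line2):
        return line1, line2
    return rebalance(line1, line2)
```

With a check like this, the "Whose turn is it today?" entry would pass through unchanged, while ordinary two-line entries would still be re-wrapped.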
>>
>> SRT formatting rules also allow simple markup annotations (e.g., bold: 
>> <b> </b>), so the length of the displayed text can differ from the length 
>> of a subtitle entry's raw dialog text. This script doesn't try to deal with 
>> that complication.
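
If the markup issue matters for a given file, line lengths could be measured on the text with the tags removed. The following Python sketch shows one possible approach; `visible_length` is a hypothetical helper, and the tag list (b, i, u, font) is an assumption about which annotations appear in practice, since only bold is mentioned above.

```python
import re

# Assumed set of simple SRT markup tags; only <b> is named in the thread.
TAG_RE = re.compile(r"</?(?:b|i|u|font)\b[^>]*>", re.IGNORECASE)


def visible_length(line):
    """Length of the line as displayed, ignoring the assumed markup tags."""
    return len(TAG_RE.sub("", line))
```

A re-wrapping routine could then compare `visible_length(line)` values instead of raw string lengths when deciding where to break.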
>>
>> reformat_subtitle_text.pl:
>>
>> #!/usr/bin/env perl
>>
>> use strict;
>> use Text::Wrap;
>> use POSIX qw/ceil/;
>>
>> my $subtitles = '';
>>
>> # regex to dissect one subtitle entry: 1) sequence number and time range,
>> # 2) first dialog text line, and 3) second dialog text line
>> my $seq_item_re = qr/(\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n)(.+\n)(.+\n)/;
>>
>> # read in all the input subtitle text
>> $subtitles = do { local $/; <STDIN> };
>>
>> # extract each and all subtitle entries with two lines of dialog text
>> # and replace them with reformatted version
>> $subtitles =~ s/$seq_item_re/$1 . fixup_dialog($2, $3)/mge;
>>
>> #output the reformatted subtitles
>> print $subtitles;
>>
>> # reformat two lines of dialog text to have more equal line lengths, with
>> # line two the longer if necessary for proper word wrapping
>>
>> sub fixup_dialog {
>>     my ($line1, $line2) = @_;
>>     
>> #   trim trailing white space
>>     $line1 =~ s/\s+$//;
>>     $line2 =~ s/\s+$//;
>>     
>> #   ideal column width for two lines of characters without word wrapping;
>> #   with word wrapping, the second line will be the longer of the two
>>     my $ideal_col_width = ceil((length($line1) + length($line2))/2) + 1;
>>     my $total_text = $line1 . " " . $line2 . "\n";
>>     
>> #   locally set wrapping parameters: do not convert spaces back to tabs,
>> #   and constrain the column width
>>     local($Text::Wrap::unexpand) = 0;
>>     local($Text::Wrap::columns) = $ideal_col_width;
>>     my $wrapped_text = wrap('', '', $total_text);
>>     
>> #   if word wrapping creates third line move it to end of second line
>>     if ( $wrapped_text =~ m/(.+\n.+)\n(.+\n)/){
>>         $wrapped_text = $1 . $2;
>>     }
>>     return $wrapped_text;
>> }
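
For comparison, the balancing step can also be sketched without a wrapping library. The Python version below is not a port of the Perl filter: it swaps in a simpler technique, trying each word break in turn and keeping the last one at which the first line is no longer than the second. `balance_two_lines` is a hypothetical name, and markup tags and dialog dashes are ignored.

```python
def balance_two_lines(line1, line2):
    """Re-split two lines so they are close in length, second line longer
    (or equal) whenever a word break allows it."""
    words = f"{line1} {line2}".split()
    if len(words) < 2:
        # Nothing to balance: zero or one word in total.
        return " ".join(words), ""

    split_at = 1  # fall back to breaking after the first word
    for i in range(1, len(words)):
        first = " ".join(words[:i])
        second = " ".join(words[i:])
        if len(first) <= len(second):
            split_at = i      # still acceptable, try a later break
        else:
            break             # first line would become the longer one
    return " ".join(words[:split_at]), " ".join(words[split_at:])


if __name__ == "__main__":
    print(balance_two_lines("not that likes and dislikes are", "your enemies"))
    print(balance_two_lines("because they end up serving", "the society."))
```

On the two sample entries from the original question, this yields "not that likes and" / "dislikes are your enemies" and "because they end up" / "serving the society.".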
>>
>>
>> On Sunday, February 16, 2025 at 3:37:16 AM UTC-8 Otto Munters wrote:
>>
>>> Is there a regex to divide the last two lines of each subtitle more 
>>> evenly in the following example, so that both lines are about the same 
>>> length, with the longer line preferably on the 4th line?
>>> Example:
>>> 351
>>> 00:18:23,120 --> 00:18:29,600
>>> not that likes and dislikes are 
>>> your enemies
>>>
>>> 352
>>> 00:18:29,600 --> 00:18:31,960
>>> because they end up serving
>>> the society.
>>>
>>> Thanks for your kind help! 
>>> Otto
>>>
>>

-- 
This is the BBEdit Talk public discussion group. If you have a feature request 
or believe that the application isn't working correctly, please email 
"[email protected]" rather than posting here. Follow @bbedit on Mastodon: 
<https://mastodon.social/@bbedit>
To view this discussion visit 
https://groups.google.com/d/msgid/bbedit/c2d2b594-3133-4a3e-b518-5b59403c3ae1n%40googlegroups.com.
