Hi, although this is solved, here is another possible way using a text factory.
The recipe (text factory steps) would be this:

1) Replace every line break that is not followed by "Annotation" with a placeholder (e.g. <br>):
   "\n(?!Annotation)" => "<br>"
   (this features "Positional Assertions" <= RTM)
   => effect: every record is on its own line

2) Sort lines (using "Sort using pattern"):
   Searching pattern = "Created at: (.+?)<br>Author: .+\s(.+?)<br>"
   Specific sub-patterns = "\2\1"

3) Re-replace your placeholder with real line breaks:
   TF-step Replace all: "<br>" => "\n"

Caveat: this assumes:
1) "Annotation" will always be the first field in a record;
2) "Created at" will always occur before "Author";
3) the sequence "<br>" will not occur in the original text (change the placeholder if so)

It worked on your sample data.

Regards
Roland

On Wed, Nov 6, 2024 at 1:45 PM Howard <[email protected]> wrote:

> Thanks everyone for your responses. They enabled me to solve the problem.
> Howard
> On Tuesday 5 November 2024 at 1:13:46 pm UTC-5 jj wrote:
>
>> 1. here is the find/replace dialog:
>> [image: Screenshot 2024-11-05 at 18.50.03.png]
>> 2. Here is what the file should look like once the replacement has been
>> done:
>> (the text options are displayed by clicking on the ⚙️ on the left of the
>> navigation bar)
>> [image: Screenshot 2024-11-05 at 18.49.22.png]
>>
>> Notice that the <TAB> tags were replaced by tab characters (the little
>> triangles Δ only visible when Show invisibles > Show tabs is checked).
>>
>> The conversion should be immediate.
>>
>> Transforming your example to this Tab-Separated-Values result:
>>
>> [image: Screenshot 2024-11-05 at 19.00.50.png]
>> (Notice that the invisible tab characters are visible here because Show
>> invisibles > Show tabs is checked for this file too)
>>
>> HTH
>>
>> Jean Jourdain
>>
>> On Tuesday, November 5, 2024 at 12:37:59 PM UTC+1 Howard wrote:
>>
>>> Jean, I tried to apply your canonize_lines_to_columns.txt file to the
>>> data shown earlier in this post, following your directions; however, after
>>> letting it run for a few minutes, nothing happened. I had to Force Quit.
>>>
>>> I had to manually change every <TAB> to `\t` and the BBEdit Find/Replace
>>> wouldn't do it. In BBEdit Settings, the "Auto-expand tabs" option was off.
>>> How do I deselect that option?
>>>
>>> How long should it take for the data to be converted? What could I be
>>> doing wrong?
>>>
>>> Howard
>>>
>>> On Monday 4 November 2024 at 5:15:56 pm UTC-5 jj wrote:
>>>
>>>> Hi Howard,
>>>>
>>>> You could do that with a canonize file and a few regular expressions.
>>>>
>>>> 1. Create a new file named canonize_lines_to_columns.txt with this
>>>> content:
>>>>
>>>> # -*- x-bbedit-canon-case-sensitive: 1; x-bbedit-canon-match-words: 0; x-bbedit-canon-grep: 1; -*-
>>>> # End:
>>>> # Local Variables:
>>>> # coding: utf-8
>>>> # indent_style: tab
>>>> #===
>>>> # Save the annotation number in a <<< >>> bracket that we will need later.
>>>> ^Annotation (\d+):<TAB><<<\1>>>
>>>> # Replace all the column titles by tabs.
>>>> \n(Created at|Author|Type|Comment):\h*<TAB>\t
>>>> # Replace all the newlines by a space in case there are some in the contents of Comment fields.
>>>> \n<TAB>\x20
>>>> # Put a newline before each annotation number and remove the <<< >>> bracket.
>>>> <<<(\d+)>>><TAB>\n\1
>>>> # Put the column names in the first line.
>>>> \A\h*$<TAB>Annotation\tCreated At\tAuthor\tType\tComment
>>>> # Put a single space where there is more than one.
>>>> \x20{2,}<TAB>\x20
>>>> # Reorder the columns to Author, Created At, Annotation, Type, Comment.
>>>> ^(.+?)\t(.+?)\t(.+?)\t<TAB>\3\t\2\t\1\t
>>>> # Now the file could be sorted in BBEdit by Author, Created At
>>>> # Or imported into a Spreadsheet as Tab Separated Values.
>>>>
>>>> 2. Once you have created this file, replace all the <TAB> in it by
>>>> real tabs. Take care to deselect the "Auto-expand tabs" option on the file
>>>> before you save it; otherwise they will be replaced by spaces and we need
>>>> them as separators.
>>>>
>>>> Find: <TAB>
>>>> Replace: \t
>>>>
>>>> 3. Go to your data file and use the menu Text > Canonize… with the
>>>> saved canonize file and apply it to your data.
>>>>
>>>> 4. Your data should be converted to Tab-Separated-Values with the
>>>> columns reordered so that they can be sorted in this order: Author,
>>>> Created At, Annotation, Type, Comment.
>>>>
>>>> 5. Use the menu Text > Sort Lines… or import the resulting TSV into a
>>>> spreadsheet.
>>>>
>>>> HTH
>>>>
>>>> Jean Jourdain
>>>>
>>>> On Monday, November 4, 2024 at 6:10:04 PM UTC+1 Howard wrote:
>>>>
>>>>> I think I can write the GREP code that matches the first four lines,
>>>>> but I am not sure how to do that for the *Comment* lines. Also, once
>>>>> I do that, how do I "write a regular expression that recognizes the sort
>>>>> keys within the line"?
>>>>>
>>>>> I've also never used text factory. (Is it easier to use in BBEdit 15
>>>>> than in BBEdit 14?)
>>>>> Howard
>>>>>
>>>>> On Monday 4 November 2024 at 11:53:49 am UTC-5 Neil Faiman wrote:
>>>>>
>>>>>> As far as I know, BBEdit simply supports sorting *lines*, not
>>>>>> arbitrary records represented by batches of text lines. But do not
>>>>>> despair. All is not lost. BBEdit has really robust support for sorting
>>>>>> lines.
>>>>>>
>>>>>> I would start with a GREP that could match across multiple lines and
>>>>>> collapse them into a single line, with some arbitrary separator character
>>>>>> representing where the original line breaks were. (You might need two
>>>>>> patterns, one to collapse the multi-line Comments into a single line, and
>>>>>> then a second one to collapse all the lines in the record into a single
>>>>>> line.)
>>>>>>
>>>>>> Now that each record is represented by a single line, you can write a
>>>>>> regular expression that recognizes the sort keys within the line. Then
>>>>>> you can use the “Sort using pattern” feature of the Text > Sort Lines…
>>>>>> command to sort the records on those keys.
>>>>>>
>>>>>> Finally, you can reverse the process from the first step and split
>>>>>> the records back into multiple lines.
>>>>>>
>>>>>> Once you’ve got each of the steps perfected, you can create a text
>>>>>> factory that will apply them to a file automatically, and you should be
>>>>>> good to go.
>>>>>>
>>>>>> Good luck,
>>>>>> Neil Faiman
>>>>>>
>>>>>> On Nov 4, 2024, at 11:35 AM, Howard <[email protected]> wrote:
>>>>>>
>>>>>> I have multiple records in a text file in the format below (seven
>>>>>> sample records shown). I want to sort all of them by *Author* and
>>>>>> then, within *Author*, by *Created At*. In a record, the first four
>>>>>> fields always occupy one line each; however, the fifth field (*Comment*)
>>>>>> can run to 30-40 lines, possibly more.
>>>>>>
>>>>>> Is this something that BBEdit can do? If it is, how can I do it?
>>>>>>
>>>>>>
>>>>>> --
> This is the BBEdit Talk public discussion group. If you have a feature
> request or believe that the application isn't working correctly, please
> email "[email protected]" rather than posting here. Follow @bbedit on
> Mastodon: <https://mastodon.social/@bbedit>
> ---
> You received this message because you are subscribed to the Google Groups
> "BBEdit Talk" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion visit
> https://groups.google.com/d/msgid/bbedit/0d492688-dffa-43c0-989d-98d1961bb999n%40googlegroups.com
>

--
To view this discussion visit https://groups.google.com/d/msgid/bbedit/CABybPXYEoMAHCz7kXt3Z-YDukKA64Rg4gbTtCVQsZjk_CpDPbA%40mail.gmail.com.
test.textfactory
Description: Binary data
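
For anyone who wants to try the idea outside BBEdit, here is a minimal Python sketch of the three text factory steps listed at the top of this message (collapse each record onto one line, sort on the captured keys, expand again). It is not the attached text factory, only an illustration under stated assumptions: the SAMPLE records and the names sort_key and sort_records are made up for this example, and the key pattern is a tightened variant of the one given above (it uses [^<] instead of . so that neither capture can run past a "<br>"). Both keys are compared as plain text.

```python
import re

PLACEHOLDER = "<br>"  # assumed not to occur anywhere in the original text

# Tightened variant of the sort pattern (an assumption of this sketch):
# group 1 = "Created at" value, group 2 = last word of the "Author" value.
# [^<] keeps each capture inside its own field.
KEY = re.compile(r"Created at: ([^<]+)<br>Author:[^<]*\s([^<]+)<br>")

# Hypothetical records, made up for illustration only.
SAMPLE = """\
Annotation 1:
Created at: 2024-03-02 10:15
Author: Mary Young
Type: Note
Comment: first line
second line
Annotation 2:
Created at: 2024-01-05 09:00
Author: Bob Adams
Type: Highlight
Comment: single line
Annotation 3:
Created at: 2024-02-01 08:30
Author: Mary Young
Type: Note
Comment: earlier note
"""


def sort_key(line):
    """Return (last word of Author, Created at), i.e. sub-pattern 2 then 1."""
    match = KEY.search(line)
    # Lines without both fields sort first instead of raising an error.
    return (match.group(2), match.group(1)) if match else ("", "")


def sort_records(text):
    # Set the final line break aside so it is not turned into a placeholder
    # that would travel with the last record when the lines are reordered.
    trailing = "\n" if text.endswith("\n") else ""
    body = text.rstrip("\n")

    # Step 1: replace every line break that is not followed by "Annotation",
    # so that each record ends up on a single line.
    collapsed = re.sub(r"\n(?!Annotation)", PLACEHOLDER, body)

    # Step 2: sort the one-line records on the captured keys.
    ordered = sorted(collapsed.split("\n"), key=sort_key)

    # Step 3: turn the placeholders back into real line breaks.
    return "\n".join(ordered).replace(PLACEHOLDER, "\n") + trailing


if __name__ == "__main__":
    print(sort_records(SAMPLE), end="")
```

With the made-up sample, the records come out in the order 2, 3, 1: Adams first, then the two Young records by creation time. The same caveats apply as in the text factory version: a Comment line that itself starts with "Annotation" would be treated as the start of a new record, and the placeholder must not appear in the data.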
