Sorry, I missed a question mark in step 2. It should be: Searching pattern = "Created at: (.+?)<br>Author: .+?\s(.+?)<br>", Specific sub-patterns = "\2\1"
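For anyone who wants to check the three steps outside BBEdit, here is a minimal Python sketch of the same idea. It is only an illustration, not the text factory itself: the sample records are made up (only the field layout is taken from the thread), the names `collapse`, `sort_key` and `sort_records` are hypothetical, and it assumes the "Created at" values sort correctly as plain text. It also shows why the question mark matters: with the greedy `.+` the second group can run into the Comment field.

```python
import re

# Made-up sample records; only the field layout comes from the thread.
SAMPLE = """\
Annotation 1:
Created at: 2024-03-02 10:15
Author: Jane Smith
Type: Note
Comment: first line
second line of the comment
Annotation 2:
Created at: 2024-01-20 08:00
Author: Bob Adams
Type: Highlight
Comment: short one
Annotation 3:
Created at: 2024-02-11 09:30
Author: Jane Smith
Type: Note
Comment: another
"""

PLACEHOLDER = "<br>"
# Corrected pattern (lazy ".+?" after "Author: ").
KEY = re.compile(r"Created at: (.+?)<br>Author: .+?\s(.+?)<br>")
# Earlier pattern with the greedy ".+", kept only for comparison.
GREEDY_KEY = re.compile(r"Created at: (.+?)<br>Author: .+\s(.+?)<br>")


def collapse(text):
    # Step 1: replace every line break that is not followed by "Annotation",
    # so that each record ends up on its own line.
    return re.sub(r"\n(?!Annotation)", PLACEHOLDER, text.rstrip("\n"))


def sort_key(line):
    # Step 2: build the key from group 2 followed by group 1, like "\2\1".
    m = KEY.search(line)
    return m.group(2) + m.group(1) if m else ""


def sort_records(text):
    lines = collapse(text).split("\n")
    lines.sort(key=sort_key)
    # Step 3: turn the placeholder back into real line breaks.
    return "\n".join(lines).replace(PLACEHOLDER, "\n") + "\n"


if __name__ == "__main__":
    print(sort_records(SAMPLE), end="")
    first = collapse(SAMPLE).split("\n")[0]
    print(KEY.search(first).group(2))         # taken from the Author field
    print(GREEDY_KEY.search(first).group(2))  # taken from the Comment field
```

As in the text factory, this relies on the caveats from my earlier message: "Annotation" starts every record, "Created at" comes before "Author", and the placeholder does not occur in the data.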
On Mon, Nov 11, 2024 at 1:39 AM Roland Küffner <[email protected]> wrote:

> Hi, although this is solved, here is another possible way using a text
> factory.
>
> The ratio (text factory steps) would be this:
> 1) Replace every line break with a placeholder (e.g. <br>):
> "\n(?!Annotation)" => "<br>" (this features "Positional Assertions" <=
> RTM) => effect: every record is on its own line
> 2) Sort lines (using "Sort using pattern") – Searching pattern =
> "Created at: (.+?)<br>Author: .+\s(.+?)<br>", Specific sub-patterns = "\2\1"
> 3) Re-Replace your placeholder with real line breaks: TF-step Replace all:
> "<br>" => "\n"
>
> Caveat: this assumes: 1) "Annotation" will always be the first field in a
> record; 2) "Created at" will always occur before "Author"; 3) the sequence
> "<br>" will not occur in the original text (change the placeholder if so)
>
> It worked on your sample data,
>
> Regards
> Roland
>
> On Wed, Nov 6, 2024 at 1:45 PM Howard <[email protected]> wrote:
>
>> Thanks everyone for your responses. They enabled me to solve the problem.
>> Howard
>> On Tuesday 5 November 2024 at 1:13:46 pm UTC-5 jj wrote:
>>
>>> 1. Here is the find/replace dialog:
>>> [image: Screenshot 2024-11-05 at 18.50.03.png]
>>> 2. Here is what the file should look like once the replacement has been
>>> done:
>>> (the text options are displayed by clicking on the ⚙️ on the left of the
>>> navigation bar)
>>> [image: Screenshot 2024-11-05 at 18.49.22.png]
>>>
>>> Notice that the <TAB> tags were replaced by tab characters (the little
>>> triangles Δ only visible when Show invisibles > Show tabs is checked).
>>>
>>> The conversion should be immediate.
>>>
>>> Transforming your example to this Tab-Separated-Values result:
>>>
>>> [image: Screenshot 2024-11-05 at 19.00.50.png]
>>> (Notice that the invisible tab characters are visible here because Show
>>> invisibles > Show tabs is checked for this file too)
>>>
>>> HTH
>>>
>>> Jean Jourdain
>>>
>>> On Tuesday, November 5, 2024 at 12:37:59 PM UTC+1 Howard wrote:
>>>
>>>> Jean, I tried to apply your canonize_lines_to_columns.txt file to the
>>>> data shown earlier in this post, following your directions; however, after
>>>> letting it run for a few minutes, nothing happened. I had to Force Quit.
>>>>
>>>> I had to manually change every <TAB> to `\t` and the BBEdit
>>>> Find/Replace wouldn't do it. In BBEdit Settings, the "Auto-expand tabs"
>>>> option was off. How do I deselect that option?
>>>>
>>>> How long should it take for the data to be converted? What could I be
>>>> doing wrong?
>>>>
>>>> Howard
>>>>
>>>> On Monday 4 November 2024 at 5:15:56 pm UTC-5 jj wrote:
>>>>
>>>>> Hi Howard,
>>>>>
>>>>> You could do that with a canonize file and a few regular expressions.
>>>>>
>>>>> 1. Create a new file named canonize_lines_to_columns.txt with this
>>>>> content:
>>>>>
>>>>> # -*- x-bbedit-canon-case-sensitive: 1; x-bbedit-canon-match-words: 0; x-bbedit-canon-grep: 1; -*-
>>>>> # End:
>>>>> # Local Variables:
>>>>> # coding: utf-8
>>>>> # indent_style: tab
>>>>> #===
>>>>> # Save the annotation number in a <<< >>> bracket that we will need later.
>>>>> ^Annotation (\d+):<TAB><<<\1>>>
>>>>> # Replace all the column titles by tabs.
>>>>> \n(Created at|Author|Type|Comment):\h*<TAB>\t
>>>>> # Replace all the newlines by a space in case there are some in the contents of Comment fields.
>>>>> \n<TAB>\x20
>>>>> # Put a newline before each annotation number and remove the <<< >>> bracket.
>>>>> <<<(\d+)>>><TAB>\n\1
>>>>> # Put the column names in the first line.
>>>>> \A\h*$<TAB>Annotation\tCreated At\tAuthor\tType\tComment
>>>>> # Put a single space where there is more than one.
>>>>> \x20{2,}<TAB>\x20
>>>>> # Reorder the columns to Author, Created At, Annotation, Type, Comment.
>>>>> ^(.+?)\t(.+?)\t(.+?)\t<TAB>\3\t\2\t\1\t
>>>>> # Now the file could be sorted in BBEdit by Author, Created At
>>>>> # Or imported into a Spreadsheet as Tab Separated Values.
>>>>>
>>>>> 2. Once you have created this file, replace all the <TAB> in it with
>>>>> real tabs. Take care to deselect the "Auto-expand tabs" option on the file
>>>>> before you save it; otherwise they will be replaced by spaces, and we need
>>>>> them as separators.
>>>>>
>>>>> Find: <TAB>
>>>>> Replace: \t
>>>>>
>>>>> 3. Go to your data file and use the menu Text > Canonize… with the
>>>>> saved canonize file and apply it to your data.
>>>>>
>>>>> 4. Your data should be converted to Tab-Separated-Values with the
>>>>> columns reordered so that they can be sorted in this order: Author,
>>>>> Created At, Annotation, Type, Comment.
>>>>>
>>>>> 5. Use the menu Text > Sort Lines… or import the resulting TSV into a
>>>>> spreadsheet.
>>>>>
>>>>> HTH
>>>>>
>>>>> Jean Jourdain
>>>>>
>>>>> On Monday, November 4, 2024 at 6:10:04 PM UTC+1 Howard wrote:
>>>>>
>>>>>> I think I can write the GREP code that matches the first four lines,
>>>>>> but I am not sure how to do that for the *Comment* lines. Also, once
>>>>>> I do that, how do I "write a regular expression that recognizes the sort
>>>>>> keys within the line"?
>>>>>>
>>>>>> I've also never used text factory. (Is it easier to use in BBEdit 15
>>>>>> than in BBEdit 14?)
>>>>>> Howard
>>>>>>
>>>>>> On Monday 4 November 2024 at 11:53:49 am UTC-5 Neil Faiman wrote:
>>>>>>
>>>>>>> As far as I know, BBEdit simply supports sorting *lines* — not
>>>>>>> arbitrary records represented by batches of text lines. But do not
>>>>>>> despair. All is not lost. BBEdit has really robust support for sorting
>>>>>>> lines.
>>>>>>>
>>>>>>> I would start with a GREP that could match across multiple lines and
>>>>>>> collapse them into a single line, with some arbitrary separator
>>>>>>> character representing where the original line breaks were. (You might
>>>>>>> need two patterns, one to collapse the multi-line Comments into a
>>>>>>> single line, and then a second one to collapse all the lines in the
>>>>>>> record into a single line.)
>>>>>>>
>>>>>>> Now that each record is represented by a single line, you can write
>>>>>>> a regular expression that recognizes the sort keys within the line.
>>>>>>> Then you can use the “Sort using pattern” feature of the Text > Sort
>>>>>>> Lines… command to sort the records on those keys.
>>>>>>>
>>>>>>> Finally, you can reverse the process from the first step and split
>>>>>>> the records back into multiple lines.
>>>>>>>
>>>>>>> Once you’ve got each of the steps perfected, you can create a text
>>>>>>> factory that will apply them to a file automatically, and you should be
>>>>>>> good to go.
>>>>>>>
>>>>>>> Good luck,
>>>>>>> Neil Faiman
>>>>>>>
>>>>>>> On Nov 4, 2024, at 11:35 AM, Howard <[email protected]> wrote:
>>>>>>>
>>>>>>> I have multiple records in a text file in the format below (seven
>>>>>>> sample records shown). I want to sort all of them by *Author* and
>>>>>>> then, within *Author*, by *Created At*. In a record, the first four
>>>>>>> fields are always just one line each; however, the fifth (*Comment*)
>>>>>>> can be up to 30-40 lines, possibly more.
>>>>>>>
>>>>>>> Is this something that BBEdit can do? If it is, how can I do it?
>>>>>>>
>>>>>>> --
>> This is the BBEdit Talk public discussion group. If you have a feature
>> request or believe that the application isn't working correctly, please
>> email "[email protected]" rather than posting here.
>> Follow @bbedit on Mastodon: <https://mastodon.social/@bbedit>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "BBEdit Talk" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion visit
>> https://groups.google.com/d/msgid/bbedit/0d492688-dffa-43c0-989d-98d1961bb999n%40googlegroups.com
test.textfactory
Description: Binary data
