I haven't found a thread on this, but apologies if one exists! I am new to BBEdit and am using it to clean .txt files prior to text mining. I am converting files from PDF to .txt to ensure R reads them in correctly (I've had issues with the R PDF reader). The conversion often produces duplicated words, such as "to to" or "finally finally", throughout the text. TextEdit and Word flag these as grammar errors, but fixing them means going through the entire document manually. I have thousands of pages to go through, and if I ever want to finish my dissertation, I can't do that by hand.
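To make the pattern concrete, here is a minimal sketch in Python of the cleanup I'm after. It is only an illustration, not a BBEdit feature. The helper name `collapse_repeated_words` is made up for this example. It assumes a "duplicate" means the same word repeated immediately, separated only by whitespace, ignoring case.

```python
import re

# A word, followed one or more times by whitespace and the same word again.
# \b keeps the match on whole words, so "the then" is left alone.
DUPLICATE_WORD = re.compile(r"\b(\w+)(?:\s+\1\b)+", flags=re.IGNORECASE)


def collapse_repeated_words(text: str) -> str:
    """Replace each run of an immediately repeated word with one copy.

    Hypothetical helper for illustration; the first occurrence is kept.
    """
    return DUPLICATE_WORD.sub(r"\1", text)


print(collapse_repeated_words("I went to to the store and finally finally left."))
# → I went to the store and finally left.
```

One caveat with this approach: it would also collapse legitimate repeats such as "had had" or "that that", so the results would still need a spot check.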
I tried the Process Duplicate Lines command in BBEdit, but it did not remove duplicates of words within lines. Does anyone know if there is a way to get BBEdit to identify duplicate words, then automatically delete one of them? (or if not BBEdit, then Word or TextEdit?) Thanks!

--
This is the BBEdit Talk public discussion group.
