On Wednesday, August 19, 2015 at 1:59:13 PM UTC+10, Lauren Kersey wrote:
>
> Hi all. I have a text tool that removes punctuation from the ends of words
> (split). This seemed to work well enough for cleaning my corpus, but I'm
> now working with a dataset of texts published from 1600-1700. Back then,
> printers split a lot of lines in the middle of a word. When the printer
> came to the end of the page, he simply split the word with a hyphen. My
> digital copies mark this symbol as a "\u2223" and I need to remove this
> from the word to perform computational analysis.
>
> Is there a way to remove this character from the inside of a character
> string?
>

Just replace it with an empty string:

http://docs.julialang.org/en/latest/stdlib/strings/?highlight=replace#Base.replace

Cheers
Lex
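
For reference, a minimal sketch of the replace approach. It assumes a current Julia 1.x release, where replace takes a pattern => replacement pair; the Julia versions in use when this thread was written took the arguments positionally instead, as replace(text, "\u2223", ""). The sample sentence and the variable names text and cleaned are made up for illustration.

```julia
# Hypothetical sample text: "\u2223" is the split marker left inside the words.
text = "The prin\u2223ter split the word at the end of the li\u2223ne."

# Replace every occurrence of the marker with an empty string (Julia 1.x pair syntax).
cleaned = replace(text, "\u2223" => "")

println(cleaned)  # prints: The printer split the word at the end of the line.
```

This assumes the files contain the actual U+2223 character. If they instead contain the six literal characters backslash, u, 2, 2, 2, 3, the pattern would need an escaped backslash: "\\u2223".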