On Wednesday, August 19, 2015 at 1:59:13 PM UTC+10, Lauren Kersey wrote:
>
> Hi all. I have a text tool that removes punctuation from the ends of words 
> (split). This seemed to work well enough for cleaning my corpus, but I'm 
> now working with a dataset of texts published from 1600-1700. Back then, 
> printers split a lot of lines in the middle of a word. When the printer 
> came to the end of the page, he simply split the word with a hyphen. My 
> digital copies mark this symbol as a "\u2223" and I need to remove this 
> from the word to perform computational analysis. 
>
> Is there a way to remove this character from the inside of a character 
> string? 
>

Just replace it with an empty string, using replace():
http://docs.julialang.org/en/latest/stdlib/strings/?highlight=replace#Base.replace
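
Something along these lines should do it. This is only a sketch: the sample text and the helper name remove_split_marks are made up for illustration, and the pair syntax (pattern => replacement) assumes Julia 1.x. On the older releases the call takes three positional arguments instead, i.e. replace(text, "\u2223", "").

```julia
# Hypothetical helper: delete every U+2223 split mark so that the two
# halves of a broken word are joined back together.
# Assumes Julia 1.x, where replace takes a pattern => replacement pair.
remove_split_marks(text::AbstractString) = replace(text, "\u2223" => "")

# Made-up sample line; real input would come from the corpus files.
sample = "the king\u2223dom of Eng\u2223land"
cleaned = remove_split_marks(sample)
println(cleaned)
```

Since replace works on the whole string, you can run it before splitting into words, so the rejoined word goes through your punctuation stripping as a single token.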

Cheers
Lex
