Good idea! I'm trying your approach right now, but I'm wondering whether str_split (from the 'stringr' package) or strsplit is the better choice in terms of speed. I ran str_split over the text column of the data frame, and it has now been running for 2 hours.
This is what I ran:

    splittedStrings <- str_split(dataframe$text, " ")

The $text column already contains cleaned text, so there are no double blanks or unnecessary symbols, just full words.
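For comparison, here is a minimal sketch of the base R alternative mentioned above, assuming a literal single space is the only delimiter (which matches the cleaned $text column described). The `text` vector is a hypothetical stand-in for dataframe$text, and whether `fixed = TRUE` actually helps on the real data is something to measure on a sample rather than assume.

```r
# Hypothetical stand-in for dataframe$text: cleaned text, single spaces only.
text <- c("just full words", "no double blanks here", "single")

# Default: the split argument is interpreted as a regular expression.
split_regex <- strsplit(text, " ")

# fixed = TRUE: the split argument is matched as a literal string instead.
split_fixed <- strsplit(text, " ", fixed = TRUE)

# On this input both calls return the same tokens.
stopifnot(identical(split_regex, split_fixed))

# Time both variants on a larger sample before running the full column.
big <- rep(text, 10000)
system.time(strsplit(big, " "))
system.time(strsplit(big, " ", fixed = TRUE))
```

Timing a few thousand rows first gives a rough idea of how long the full column would take before committing to another multi-hour run.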