Thanks Ingo.

Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine
15032 Hunter Court, Westfield, IN 46074
(317) 490-5129 Work, & Mobile & VoiceMail
(317) 399-1219 Skype No Voicemail please

On Sun, Nov 15, 2009 at 11:05 AM, Ingo Feinerer <feine...@logic.at> wrote:

> On Thu, Nov 12, 2009 at 11:29:50AM -0500, Mark Kimpel wrote:
> > I am using code that previously worked to remove stopwords using package "tm".
>
> Thanks for reporting. This is a bug in the removeWords() function in
> tm version 0.5-1 available from CRAN:
>
> > require(tm)
> > myDocument <- c("the rain in Spain", "falls mainly on the plain",
> >                 "jack and jill ran up the hill", "to fetch a pail of water")
> > text.corp <- Corpus(VectorSource(myDocument))
> > #########################
> > text.corp <- tm_map(text.corp, stripWhitespace)
> > text.corp <- tm_map(text.corp, removeNumbers)
> > text.corp <- tm_map(text.corp, removePunctuation)
> > ## text.corp <- tm_map(text.corp, stemDocument)
> > text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
> > dtm <- DocumentTermMatrix(text.corp)
> > dtm
> > dtm.mat <- as.matrix(dtm)
> > dtm.mat
> >
> > > dtm.mat
> >     Terms
> > Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
> >    1     0     0    0    0    0      0    0     0    1   0     1   1     0
> >    2     1     0    0    0    0      1    0     1    0   0     0   0     0
> >    3     0     0    1    1    1      0    0     0    0   1     0   0     0
> >    4     0     1    0    0    0      0    1     0    0   0     0   0     1
>
> The function removeWords() fails to remove patterns at the beginning or at
> the end of a line.
>
> This bug is fixed in the latest development version on R-Forge, and
> the fix will be included in the next CRAN release.
>
> Please see
> https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/inst/NEWS?root=tm&view=markup
> for a list of all bug fixes and changes between each tm version.
>
> Best regards, Ingo Feinerer
>
> --
> Ingo Feinerer
> Vienna University of Technology
> http://www.dbai.tuwien.ac.at/staff/feinerer

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
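
For anyone stuck on tm 0.5-1 until the fixed release reaches CRAN, one possible stopgap is to strip the words from the raw character vector before building the corpus. The sketch below is only an illustration in base R: remove_words_sketch() is a hypothetical helper, not part of tm and not the fix Ingo describes. It assumes the words contain no regex metacharacters, and it uses word-boundary anchors so that a word at the very start or end of a string is matched as well.

```r
# Hypothetical helper (NOT part of tm): drop whole words from character
# strings using a word-boundary regex, so that matches at the start or
# end of a string are removed too.
# Assumption: 'words' contains no regex metacharacters.
remove_words_sketch <- function(x, words) {
  pattern <- sprintf("\\b(%s)\\b", paste(words, collapse = "|"))
  cleaned <- gsub(pattern, "", x, perl = TRUE)
  # Collapse the gaps left behind, then trim leading/trailing whitespace.
  cleaned <- gsub("\\s+", " ", cleaned, perl = TRUE)
  gsub("^\\s+|\\s+$", "", cleaned, perl = TRUE)
}

docs <- c("the rain in Spain", "falls mainly on the plain")
result <- remove_words_sketch(docs, c("the", "in", "on"))
print(result)
# [1] "rain Spain"         "falls mainly plain"
```

Note that "in" inside "Spain", "mainly" and "plain" is left alone, because the boundary anchors only match whole words. The cleaned vector can then be passed to VectorSource() as in the original example.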