Thanks, Ingo.

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work, & Mobile & VoiceMail
(317) 399-1219 Skype No Voicemail please


On Sun, Nov 15, 2009 at 11:05 AM, Ingo Feinerer <feine...@logic.at> wrote:

> On Thu, Nov 12, 2009 at 11:29:50AM -0500, Mark Kimpel wrote:
> > I am using code that previously worked to remove stopwords using package "tm".
>
> Thanks for reporting. This is a bug in the removeWords() function in
> tm version 0.5-1 available from CRAN:
>
> > require(tm)
> > myDocument <- c("the rain in Spain", "falls mainly on the plain",
> >                 "jack and jill ran up the hill", "to fetch a pail of water")
> > text.corp <- Corpus(VectorSource(myDocument))
> > #########################
> > text.corp <- tm_map(text.corp, stripWhitespace)
> > text.corp <- tm_map(text.corp, removeNumbers)
> > text.corp <- tm_map(text.corp, removePunctuation)
> > ## text.corp <- tm_map(text.corp, stemDocument)
> > text.corp <- tm_map(text.corp, removeWords,
> >                     c("the", stopwords("english")))
> > dtm <- DocumentTermMatrix(text.corp)
> > dtm
> > dtm.mat <- as.matrix(dtm)
> > dtm.mat
> >
> > > dtm.mat
> >     Terms
> > Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
> >    1     0     0    0    0    0      0    0     0    1   0     1   1     0
> >    2     1     0    0    0    0      1    0     1    0   0     0   0     0
> >    3     0     0    1    1    1      0    0     0    0   1     0   0     0
> >    4     0     1    0    0    0      0    1     0    0   0     0   0     1
>
> The function removeWords() fails to remove patterns at the beginning or at
> the end of a line.
>
> This bug is fixed in the latest development version on R-Forge, and
> the fix will be included in the next CRAN release.
>
> Please see
>
> https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/inst/NEWS?root=tm&view=markup
> for a list of all bug fixes and changes between each tm version.
>
> Best regards, Ingo Feinerer
>
> --
> Ingo Feinerer
> Vienna University of Technology
> http://www.dbai.tuwien.ac.at/staff/feinerer
>

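For anyone who cannot upgrade right away, here is a minimal sketch of whole-word removal in base R that also catches words at the beginning or end of a string, which is the case described above. It is not tm's internal code and not the R-Forge fix. The helper name remove_words_sketch is hypothetical, and the sketch assumes the words contain no regular-expression metacharacters.

```r
# Hypothetical helper (not part of tm): remove whole words from each string.
# Assumes `words` contains no regex metacharacters.
remove_words_sketch <- function(x, words) {
  # \\b matches at word boundaries, including the start and end of the string
  pattern <- sprintf("\\b(%s)\\b", paste(words, collapse = "|"))
  x <- gsub(pattern, "", x, perl = TRUE)
  # collapse the whitespace left behind, then trim both ends
  x <- gsub("\\s+", " ", x, perl = TRUE)
  gsub("^ | $", "", x)
}

docs <- c("the rain in Spain", "falls mainly on the plain")
cleaned <- remove_words_sketch(docs, c("the", "in", "on"))
print(cleaned)
```

Because the match is anchored on word boundaries, "the" is dropped at the start of the first string while the "in" inside "rain", "Spain" and "mainly" is left alone.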

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.