Use the tm package: build a corpus from the text, create a term-document matrix (TDM) from that corpus, and then apply as.matrix() to the TDM to display how often each term occurs. The tm package is documented on CRAN.

________________________________
From: R-help <r-help-boun...@r-project.org> on behalf of Boris Steipe <boris.ste...@utoronto.ca>
Sent: Thursday, August 3, 2017 6:40:09 PM
To: Riaan Van Der Walt
Cc: R lists
Subject: Re: [R] find similar words in text

Please keep messages on the list so others can pitch in.

_Which_ words do you want to consider identical for the purpose of a frequency count? _What_ do you want to plot?

B.

> On Aug 3, 2017, at 4:36 PM, Riaan Van Der Walt <riaan.vanderw...@nwu.ac.za> wrote:
>
> Hallo Boris,
> I've loaded the Rstem, Snowball.
> But I am clueless how to get a list eg. whal* (whale, whales, whaling,
> whaler, whalers, whaleman, whalemen, whale-ship, whale-boat, whale's)
> in the book Moby Dick and the frequency of each of the different words.
> I am using this script:
>
> whales1.v <- grep("^whal.*", moby.word.v)
> whales1.v
>
> The total occurrence for whal* is 1699.
> But I can't display it or plot it.
>
> I am new to R and the learning curve is steep!!
>
> Thx!
> Riaan
>
>
> Riaan van der Walt
> Tel / Phone / Mogala : 27+72+2172429
> Email / Epos / Emeile: riaan.vanderw...@nwu.ac.za
> Url: http://www.nwu.ac.za/
>
> >>> Boris Steipe <boris.ste...@utoronto.ca> 31 Jul 2017 23:37 >>>
> You need a stemming algorithm. See here:
> https://cran.r-project.org/web/views/NaturalLanguageProcessing.html
>
> Myself, I've had good experience with Rstem.
>
> B.
>
> > On Jul 31, 2017, at 4:47 PM, Riaan Van Der Walt
> > <riaan.vanderw...@nwu.ac.za> wrote:
> >
> > I am new to R.
> > Busy with Text Analysis.
> >
> > Need a script to find e.g
> >
> > whale, whales, whale's, whaler, whalers, whaling,... in Moby Dick
> >
> > Riaan
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
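
For the display and plotting problem in the quoted script, a minimal base-R sketch may help. By default grep() returns the positions of the matches rather than the words, which is probably why the result could not be tabulated. The small moby.word.v below is a hypothetical stand-in for the real word vector, and the names whale.words.v, whale.freqs.t and total.whal are made up for illustration.

```r
# Hypothetical stand-in for moby.word.v: a character vector holding one
# lower-cased word per element (the real vector would come from the novel).
moby.word.v <- c("call", "me", "ishmael", "whale", "whales", "whaling",
                 "whale", "whaler", "whale's", "ship", "whale-ship", "whales")

# grep() returns positions by default; value = TRUE returns the words themselves.
whale.words.v <- grep("^whal", moby.word.v, value = TRUE)

# table() counts each distinct form; sort() puts the most frequent first.
whale.freqs.t <- sort(table(whale.words.v), decreasing = TRUE)
print(whale.freqs.t)

# Total number of tokens that match the pattern.
total.whal <- sum(whale.freqs.t)

# Bar plot of the counts; las = 2 rotates the labels so longer words fit.
barplot(whale.freqs.t, las = 2, ylab = "Occurrences",
        main = "Words matching ^whal")
```

This counts every distinct spelling separately. Whether forms such as "whale" and "whales" should be merged is exactly the question asked above, and merging them would need a stemmer or a hand-made mapping.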
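
The tm route suggested at the top of the thread could look roughly like the sketch below. It assumes the tm package is installed; the two short strings in texts are a hypothetical stand-in for the novel, and the variable names are illustrative only.

```r
library(tm)

# Hypothetical stand-in for the novel's text: two short "documents".
texts <- c("The whale and the whalers went whaling.",
           "A whale is not a fish, said the whaler.")

# Build a corpus from the character vector.
corpus <- VCorpus(VectorSource(texts))

# Term-document matrix: one row per term, one column per document.
# removePunctuation strips trailing commas and full stops from the terms.
tdm <- TermDocumentMatrix(corpus, control = list(removePunctuation = TRUE))

# Convert to an ordinary matrix and sum across documents to get term totals.
m <- as.matrix(tdm)
term.freqs <- rowSums(m)

# Keep only the terms that start with "whal", most frequent first.
whale.freqs <- sort(term.freqs[grep("^whal", names(term.freqs))],
                    decreasing = TRUE)
print(whale.freqs)
```

Note that removePunctuation also strips apostrophes and, by default, hyphens, so "whale's" would be counted as "whales" and "whale-ship" as "whaleship". If those forms need to stay distinct, the punctuation handling would have to be adjusted.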