Given three TermDocumentMatrix objects, tdm1, tdm2 and tdm3, I'd like to
calculate the word frequency for each of them as a data frame and then rbind
all the data frames. These three are only a sample; in reality I have
hundreds, so I need to wrap this in a function.

It's easy to calculate word freq for one TDM:

    apply(x, 1, sum)

or

    rowSums(as.matrix(x))
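
To make the per-word sum concrete, here is a minimal sketch in base R. The small matrix `m` is a made-up stand-in for `as.matrix(x)` (rows are terms, columns are documents); it is not built from a real TDM.

```r
# Hypothetical stand-in for as.matrix(x): rows are terms, columns are documents.
m <- matrix(c(1, 0, 2,
              0, 1, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(Terms = c("apple", "cool"),
                            Docs  = c("1", "2", "3")))

# Both calls sum across documents within each row, i.e. per term.
freq_apply   <- apply(m, 1, sum)
freq_rowsums <- rowSums(m)

print(freq_rowsums)
```

Either call returns a named numeric vector with one entry per term, which is the shape needed for each TDM.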

I want to make a list of TDMs:

    tdm_list <- Filter(function(x) is(x, "TermDocumentMatrix"), mget(ls()))

and calculate word freq for each and put it in a data frame:

    # This is wrong: it simply sums the frequency of all words
    # instead of the frequency of each word.
    data.frame(lapply(tdm_list, sum))

and then rbind it all:

    do.call(rbind, df_list)

I can't figure out how to use lapply on a TDM to calculate word frequency.

Here is some sample data to play around with:

    require(tm)
    text1 <- c("apple", "love", "crazy", "peaches", "cool", "coke",
               "batman", "joker")
    text2 <- c("omg", "#rstats", "crazy", "cool", "bananas", "functions",
               "apple")
    text3 <- c("Playing", "rstats", "football", "data", "coke", "caffeine",
               "peaches", "cool")

    tdm1 <- TermDocumentMatrix(Corpus(VectorSource(text1)))
    tdm2 <- TermDocumentMatrix(Corpus(VectorSource(text2)))
    tdm3 <- TermDocumentMatrix(Corpus(VectorSource(text3)))
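
One possible way to get the per-word frequencies for every TDM and stack them, as a minimal sketch rather than a definitive solution. It rebuilds the sample TDMs in a named list so the block runs on its own; `word_freq_df` and the column names `source`, `word` and `freq` are hypothetical names chosen for illustration.

```r
library(tm)

# Rebuild the sample TDMs as a named list so this block is self-contained.
texts <- list(
  tdm1 = c("apple", "love", "crazy", "peaches", "cool", "coke", "batman", "joker"),
  tdm2 = c("omg", "#rstats", "crazy", "cool", "bananas", "functions", "apple"),
  tdm3 = c("Playing", "rstats", "football", "data", "coke", "caffeine", "peaches", "cool")
)
tdm_list <- lapply(texts, function(txt) TermDocumentMatrix(Corpus(VectorSource(txt))))

# Hypothetical helper: one row per term, tagged with the name of its TDM.
word_freq_df <- function(tdm, source) {
  freq <- rowSums(as.matrix(tdm))  # per-term totals across documents
  data.frame(source = source,
             word   = names(freq),
             freq   = unname(freq),
             stringsAsFactors = FALSE)
}

# Apply the helper to every TDM, then stack the results.
df_list  <- Map(word_freq_df, tdm_list, names(tdm_list))
all_freq <- do.call(rbind, df_list)
rownames(all_freq) <- NULL

head(all_freq)
```

The key difference from `lapply(tdm_list, sum)` is that the function passed in computes row sums, so each list element is a vector of per-term counts instead of one grand total. `as.matrix()` creates a dense copy of each TDM; with hundreds of large matrices, `slam::row_sums()` may be worth trying instead, though I have not compared the two here.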

Thanks.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.