Re: [R] counting words that are contained in a list

arun Sat, 15 Feb 2014 19:37:23 -0800

Hi,

Maybe this helps:


# Words to look for
vec1 <- c("victory", "happiness", "medal", "war", "service", "ribbon", "dates")

# Sentences to search
vec2 <- c(
  "The World War II Victory Medal was first issued as a service ribbon referred to as the Victory Ribbon.",
  "By 1946, a full medal had been established which was referred to as the World War II Victory Medal.",
  "The medal commemorates military service during World War II and is awarded to any member of the United States military, including members of the armed forces of the Government of the Philippine Islands, who served on active duty, or as a reservist, between December 7, 1941 and December 31, 1946",
  "This is awarded for service between 7 December 1941 and 31 December 1946, both dates inclusive"
)

# Match any of the words (case-insensitively), then tabulate with vec1 as the
# factor levels so that words without a match are reported as 0
pat <- paste(vec1, collapse = "|")
hits <- regmatches(tolower(vec2), gregexpr(pat, vec2, ignore.case = TRUE))
res <- sort(table(factor(unlist(hits), levels = vec1)), decreasing = TRUE)
res
 #     war     medal   victory   service    ribbon     dates happiness 
 #       5         4         3         3         2         1         0 
# Note: the pattern matches substrings, so "war" is also counted inside "awarded"
res[1:5]  # the five most frequent words
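
The count of 5 for "war" includes the two occurrences of "awarded", because the pattern matches substrings. If only whole words should count, a minimal sketch along the same lines could wrap the alternatives in word boundaries. The helper name count_whole_words and the shortened example sentences are made up for illustration, and the sketch assumes that none of the words contains regex special characters:

```r
# Hypothetical helper: count whole-word, case-insensitive matches of `words`
# in `sentences`, keeping words with zero matches in the result.
count_whole_words <- function(words, sentences) {
  # \\b on both sides stops "war" from matching inside "awarded"
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  hits <- regmatches(sentences,
                     gregexpr(pattern, sentences, ignore.case = TRUE, perl = TRUE))
  counts <- table(factor(tolower(unlist(hits)), levels = words))
  sort(counts, decreasing = TRUE)
}

# Shortened, made-up sentences for illustration only
words <- c("victory", "happiness", "medal", "war")
sentences <- c("The World War II Victory Medal was awarded for service.",
               "The medal is awarded to any member of the military.")

res_whole <- count_whole_words(words, sentences)
res_whole
head(res_whole, 5)  # at most the five most frequent words
```

Here "war" is counted once (from "World War II") rather than three times, since "awarded" no longer matches.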


A.K.



Hi guys! 

I have a vector with a list of words, e.g. c("victory","happiness"). 

I have a vector of sentences, e.g. "In WWII the victory was achieved by Allied 
forces". 

As the word victory is in my list, victory has a frequency of 1 and happiness 0. 

In the end I would like to get the 5 most frequent words from my list that 
appear in the sentences. 

Can you help me? 

Uros

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
