Hi,

Maybe this helps:

vec <- "this is a nice text with nice characters"
library(stringr)
# extract the words in order of appearance
vec2 <- unlist(str_match_all(vec, "\\w+"))
# or
# vec2 <- str_split(vec, " ")[[1]]

# for each word, collect every position at which it occurs;
# unique() drops the duplicate entries produced by repeated words
res <- unique(lapply(vec2, function(x) which(!is.na(match(vec2, x)))))
names(res) <- unique(vec2)
res
#$this
#[1] 1
#
#$is
#[1] 2
#
#$a
#[1] 3
#
#$nice
#[1] 4 7
#
#$text
#[1] 5
#
#$with
#[1] 6
#
#$characters
#[1] 8

A.K.

>Hi,
>I have tried several packages in order to build an R program which takes a
>text file as input and produces a list of the words in that file. Each word
>should have a vector with all the positions at which that word occurs in the
>file.
>As an example, if the text file has the string:
>
>"this is a nice text with nice characters"
>
>The output should be something like:
>$this
>[1] 1
>$is
>[1] 2
>$a
>[1] 3
>$nice
>[1] 4 7
>$text
>[1] 5
>$with
>[1] 6
>$characters
>[1] 8
>A useful post which I came across was
>http://r.789695.n4.nabble.com/Memory-usage-in-R-grows-considerably-while-calculating-word-frequencies-td4644053.html
>. However, it doesn't include the position of each word.
>A similar function which I found in the documentation is, I guess,
>"str_locate"; however, I want to count "words" and not "characters".
>Any guidance on what packages / techniques to use for this would be really
>appreciated.
>Thank you.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
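
The question asks for the positions of each word, and the same result can probably be obtained in base R without stringr. The following is only a sketch under assumptions: word_positions is a hypothetical helper name that does not appear in the thread, words are taken to be runs of word characters, and matching is case-sensitive.

```r
# Hypothetical helper (not from the original post):
# map each distinct word to the positions at which it occurs.
word_positions <- function(text) {
  # split on runs of non-word characters
  words <- unlist(strsplit(text, "\\W+"))
  # drop empty strings, e.g. from leading punctuation
  words <- words[nzchar(words)]
  # group the indices 1..n by word; fixing the factor levels to
  # unique(words) keeps the words in order of first appearance
  split(seq_along(words), factor(words, levels = unique(words)))
}

res <- word_positions("this is a nice text with nice characters")
res$nice
#[1] 4 7

# For a text file, assuming a hypothetical path "input.txt":
# txt <- paste(readLines("input.txt"), collapse = " ")
# word_positions(txt)
```

Here split() does the grouping in a single pass instead of calling match() once per word, which may matter for long texts. Wrapping the input in tolower() would treat "Nice" and "nice" as the same word, if that is wanted.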