Re: [R] Memory usage in R grows considerably while calculating word frequencies

2012-09-26 Thread Rainer M Krug
On 25/09/12 01:29, mcelis wrote: > I am working with some large text files (up to 16 GBytes). I am interested > in extracting the words and counting each time each word appears in the text. I have written a > very simple R program by following

Re: [R] Memory usage in R grows considerably while calculating word frequencies

2012-09-25 Thread arun
.txt,sep="\t")  cat("Word\tFREQ",words.txt,file="frequencies",sep="\n") }) #Read 4 items   # user  system elapsed  # 0.148   0.000   0.150 There is improvement in the speed.  Output also looked similar.  This code may be still improved. A.K.   

Re: [R] Memory usage in R grows considerably while calculating word frequencies

2012-09-25 Thread Milan Bouchet-Valat
On Monday 24 September 2012 at 16:29 -0700, mcelis wrote: > I am working with some large text files (up to 16 GBytes). I am interested > in extracting the words and counting each time each word appears in the > text. I have written a very simple R program by following some suggestions > and examp

Re: [R] Memory usage in R grows considerably while calculating word frequencies

2012-09-25 Thread Martin Maechler
elapsed >  # 0.036   0.008   0.043 > A.K. Well, dear A.K., your definition of "word" is really different, and in my view clearly much too simplistic, compared to what the OP (= original poster) asked for. E.g., from the above paragraph, your method will get words such as "
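
The difference between the two definitions of "word" can be made concrete with a small sketch. This is only an illustration, not code from the thread: the sample sentence and the variable names `ws` and `al` are assumptions, and the letters-only pattern is one possible stricter definition, not necessarily the one the original poster used.

```r
# Illustrative sample text (assumption, not taken from the poster's data)
txt <- "Well, dear A.K., your definition of \"word\" is really different."

# Definition 1: split on whitespace only; punctuation stays attached
ws <- unlist(strsplit(tolower(txt), "\\s+"))

# Definition 2: treat every run of non-letters as a separator
al <- unlist(strsplit(tolower(txt), "[^[:alpha:]]+"))
al <- al[nzchar(al)]  # drop empty strings produced by leading separators

print(ws)
print(al)
```

With the whitespace split, tokens keep their commas, periods and quotation marks, so "well," and "well" would be counted as different words. The letters-only split removes the punctuation but also breaks abbreviations such as "A.K." into single letters, so neither definition is free of trade-offs.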

Re: [R] Memory usage in R grows considerably while calculating word frequencies

2012-09-24 Thread arun
ower(txt1),"\\s"))),decreasing=TRUE)  words.txt<-paste(names(words.txt),words.txt,sep="\t")  cat("Word\tFREQ",words.txt,file="frequencies",sep="\n") }) #Read 4 items  #user  system elapsed  # 0.036   0.008   0.043 A.K. - Original M

Re: [R] Memory usage in R grows considerably while calculating word frequencies

2012-09-24 Thread arun
;)),decreasing=TRUE)  words.txt<-paste(names(words.txt),words.txt,sep="\t")  cat("Word\tFREQ",words.txt,file="frequencies",sep="\n") }) #  user  system elapsed  # 0.016   0.000   0.014  A.K. - Original Message - From: mcelis To: r-hel

[R] Memory usage in R grows considerably while calculating word frequencies

2012-09-24 Thread mcelis
I am working with some large text files (up to 16 GBytes). I am interested in extracting the words and counting each time each word appears in the text. I have written a very simple R program by following some suggestions and examples I found online. If my input file is 1 GByte, I see that R us
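
For readers with the same problem, one way to keep memory bounded is to read the file in chunks and accumulate counts as you go, instead of loading all the text at once. The following is a minimal sketch under stated assumptions, not the poster's program: the function name `count_words_chunked`, the `chunk_lines` parameter and the letters-only word definition are all hypothetical choices made for illustration.

```r
# Hypothetical helper: count word frequencies while holding only
# `chunk_lines` lines of the file in memory at any time.
count_words_chunked <- function(path, chunk_lines = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  counts <- new.env(hash = TRUE)  # maps word -> running count

  repeat {
    lines <- readLines(con, n = chunk_lines, warn = FALSE)
    if (length(lines) == 0L) break  # end of file

    # Assumed word definition: maximal runs of letters, lower-cased
    words <- unlist(strsplit(tolower(lines), "[^[:alpha:]]+"))
    words <- words[nzchar(words)]
    if (length(words) == 0L) next

    # Count within the chunk, then merge into the running totals
    tab <- table(words)
    for (w in names(tab)) {
      old <- counts[[w]]
      # Doubles are used so very large counts do not overflow integers
      counts[[w]] <- (if (is.null(old)) 0 else old) + tab[[w]]
    }
  }

  keys <- ls(counts, all.names = TRUE)
  freq <- vapply(keys, function(k) counts[[k]], numeric(1))
  sort(freq, decreasing = TRUE)
}
```

A call such as `freq <- count_words_chunked("input.txt")` (the file name is a placeholder) returns a named numeric vector sorted by decreasing frequency. Memory then grows with the number of distinct words and the chunk size rather than with the size of the file, at the cost of a per-chunk merge loop that may be slow for very large vocabularies.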