> On Jan 6, 2015, at 3:29 PM, Monnand <monn...@gmail.com> wrote:
>
> Thank you, all! Your replies are very useful, especially Don's explanation!
>
> One complaint I have is: the function name (table) is really not very
> informative.
Why not? You used the word 'table' in your original post, except as Don noted, you were overthinking the problem.

The basic concept is a tabulation of discrete values in a vector, which is a basic analytic method.

Using commands like:

  ??table
  ??frequency

would have led you to the table() function, as well as others.

Believe it or not, taking a few minutes to have read/searched "An Introduction to R", which is the basic R manual, would have led you to the same solution:

  http://cran.r-project.org/doc/manuals/r-release/R-intro.html#Frequency-tables-from-factors

Regards,

Marc Schwartz

>
> On Sun Jan 04 2015 at 5:03:47 PM MacQueen, Don <macque...@llnl.gov> wrote:
>
>> This seems to me to be a case where thinking in terms of computer
>> programming concepts is getting in the way a bit. Approach it as a data
>> analysis task; the S language (upon which R is based) is designed in part
>> for data analysis so there is a function that does most of the job for you.
>>
>> (I changed your vector of strings to make the result more easily
>> interpreted)
>>
>> > x = c("1", "1", "2", "1", "5", "2",'3','5','5','2','2')
>> > tmp <- table(x) ## counts the number of appearances of each element
>> > tmp[tmp==max(tmp)] ## finds which one occurs most often
>> 2
>> 4
>>
>> Meaning that the element '2' appears 4 times. The table() function should
>> be fast even with long vectors. Here's an example with a vector of length
>> 1 million:
>>
>> foo <- table( sample(letters, 1e6, replace=TRUE) )
>>
>>
>> One of the seminal books on the S language is John M Chambers' Programming
>> with Data -- and I would emphasize the "with Data" part of that title.
>>
>> --
>>
>> Don MacQueen
>>
>> Lawrence Livermore National Laboratory
>> 7000 East Ave., L-627
>> Livermore, CA 94550
>> 925-423-1062
>>
>>
>> On 1/4/15, 1:02 AM, "Monnand" <monn...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I thought this was a very naive problem but I have not found any solution
>>> which is idiomatic to R.
>>>
>>> The problem is like this:
>>>
>>> Assuming we have a vector of strings:
>>> x = c("1", "1", "2", "1", "5", "2")
>>>
>>> We want to count the number of appearances of each string, i.e. in vector x,
>>> string "1" appears 3 times, "2" appears twice and "5" appears once. Then I
>>> want to know which string is the majority. In this case, it is "1".
>>>
>>> For imperative languages like C, C++, Java and Python, I would use a hash
>>> table to count the strings, where the keys are the strings and the values
>>> are the number of appearances. For functional languages like Clojure, there
>>> are higher-order functions like group-by.
>>>
>>> However, for R, I can hardly find a good solution to this simple problem.
>>> I found a hash package, which implements a hash table. However, installing
>>> a package simply for a hash table is really annoying for me. I did find
>>> aggregate and other functions which operate on data frames. But in my
>>> case, it is a simple vector. Converting it to a data frame may not be
>>> desirable. (Or is it?)
>>>
>>> Could anyone suggest an idiomatic way of doing such a job in R? I would
>>> appreciate your help!
>>>
>>> -Monnand

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
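For anyone reading this thread later, here is a minimal sketch of the table() approach described above, written out in full. The variable names (counts, modes, first_mode) are illustrative and do not come from the original posts. The which.max() variant is an addition that was not suggested in the thread; it returns only the first maximum, so it may hide ties.

```r
# Don's example vector of strings
x <- c("1", "1", "2", "1", "5", "2", "3", "5", "5", "2", "2")

# table() tabulates how often each distinct value occurs
counts <- table(x)

# Keep every value whose count equals the maximum (reports ties, if any)
modes <- names(counts[counts == max(counts)])

# Alternative (assumption: only one winner is wanted):
# which.max() returns the first maximum only
first_mode <- names(which.max(counts))

print(counts)
print(modes)
```

If the full ranking is of interest rather than just the winner, sort(counts, decreasing = TRUE) should give the counts ordered from most to least frequent.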