But your original approach was more concise. Just apply unique() to both the input vector (neither its length nor its order matters) and the probability weights:

a <- c(1,1,1,1,2,5)
set.seed(11235813)
out <- sample(unique(a), 100, replace=TRUE, prob = unique(a))
out
  [1] 5 5 5 2 5 5 1 1 5 2 5 2 5 5 1 5 1 2 ...

table(out)/length(out)   # observed relative frequencies
out
   1    2    5
0.13 0.25 0.62

unique(a)/sum(unique(a))   # expected probabilities
[1] 0.125 0.250 0.625


Cheers,
B.


On 2014-04-10, at 4:34 PM, Rui Barradas wrote:

> Hello,
>
> Inline.
>
> On 10-04-2014 21:04, Nordlund, Dan (DSHS/RDA) wrote:
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
>>> On Behalf Of Simone Gabbriellini
>>> Sent: Thursday, April 10, 2014 11:59 AM
>>> To: Rui Barradas
>>> Cc: r-help@r-project.org
>>> Subject: Re: [R] how to select an element from a vector based on a
>>> probability
>>>
>>> Hello, Rui,
>>>
>>> it does, indeed!
>>>
>>> thanks,
>>> Simone
>>>
>>> 2014-04-10 20:55 GMT+02:00 Rui Barradas <ruipbarra...@sapo.pt>:
>>>> Hello,
>>>>
>>>> Use ?sample.
>>>>
>>>> sample(x, 1, prob = x)
>>>>
>>
>> Just be aware that, in using this method, the probability of selection of a
>> particular value will also be a function of how frequent the value is. For
>> example,
>>
>> set.seed(7632)
>> x <- c(2,2,6,2,1,1,1,3)
>> table(sample(x, 10000, prob=x, replace=TRUE))
>>
>>    1    2    3    6
>> 1664 3340 1696 3300
>>
>>
>> The probability that a vector position with a value of 1 will be selected is
>> 1/18 (in this particular example). However, the probability that a value of
>> 1 will be selected is 1/6 since there are three 1's. The probability of
>> selecting the position with a value of 3 is 3/18. But since there is only
>> one position with a value of 3, the probability of getting the value 1 on
>> any given sample is equal to the probability of getting the value 3.
>
> You're right, I didn't notice that. One way of avoiding that problem is the
> following.
>
> prob <- merge(x, data.frame(x=unique(x), prob=unique(x)/sum(unique(x))))$prob
> sample(x, 1, prob = prob)
>
> Rui Barradas
>
>>
>>>> Hope this helps,
>>>>
>>>> Rui Barradas
>>>>
>>>> On 10-04-2014 19:49, Simone Gabbriellini wrote:
>>>>
>>>>> Hello List,
>>>>>
>>>>> I have an array like:
>>>>>
>>>>> c(4, 3, 5, 4, 2, 2, 2, 4, 2, 6, 6, 7, 5, 5, 5, 10, 10, 11, 10,
>>>>> 12, 10, 11, 9, 12, 10, 36, 35, 36, 36, 36, 35, 35, 36, 37, 35,
>>>>> 35, 38, 35, 38, 36, 37, 36, 36, 37, 36, 35, 35, 36, 36, 35, 35,
>>>>> 36, 35, 38, 35, 35, 35, 36, 35, 35, 35, 6, 5, 8, 6, 6, 7, 1,
>>>>> 7, 7, 8, 9, 7, 8, 7, 7, 13, 13, 13, 14, 13, 13, 13, 14, 14, 15,
>>>>> 15, 14, 13, 14, 39, 39, 39, 39, 39, 39, 41, 40, 39, 39, 39, 39,
>>>>> 40, 39, 39, 41, 41, 40, 39, 40, 41, 40, 41, 40, 40, 40, 39, 41,
>>>>> 39, 39, 39, 39, 40, 39, 39, 40, 40, 39, 39, 39, 1, 4, 3, 4)
>>>>>
>>>>> I would like to pick up an element with a probability proportional to
>>>>> the element value, thus higher values should be picked up more often
>>>>> than small values (i.e., picking up 38 should be more probable than
>>>>> picking up 3)
>>>>>
>>>>> Do you have any idea on how to code such a rich-get-richer mechanism?
>>>>>
>>>>> Best regards,
>>>>> Simone
>>>>>
>>
>> Dan
>>
>> Daniel J. Nordlund, PhD
>> Research and Data Analysis Division
>> Services & Enterprise Support Administration
>> Washington State Department of Social and Health Services
>>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
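
For reference, a minimal sketch of the distinction discussed in the quoted messages: weighting every position by its value versus weighting each distinct value once. It reuses Dan's example vector x. The names counts, vals, p_position, p_value and pick are illustrative only and do not appear in the thread.

```r
# Sketch only: compare the two weighting schemes on Dan's example vector.
x <- c(2, 2, 6, 2, 1, 1, 1, 3)

counts <- table(x)                  # occurrences of each distinct value
vals   <- as.numeric(names(counts)) # distinct values, in sorted order

# sample(x, ..., prob = x): every position is weighted by its value, so a
# value's total probability is value * count / sum(x).
p_position <- vals * as.vector(counts) / sum(x)

# sample(unique(x), ..., prob = unique(x)): each distinct value is weighted
# once, so its probability is value / sum(distinct values).
p_value <- vals / sum(vals)

# Side-by-side view of the two sets of probabilities.
data.frame(value = vals, p_position = p_position, p_value = p_value)

# One draw with probability proportional to the value alone.
# (Call set.seed() first if the draw needs to be reproducible.)
pick <- sample(vals, 1, prob = vals)
```

Under the first scheme the values 1 and 3 are equally likely, as Dan points out. Under the second, 3 is three times as likely as 1. Note that sample() treats a single number n as 1:n, so this sketch assumes at least two distinct values.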