[R] Allocation of data points to groups based on membership probabilities

Michael Haenlein Thu, 15 Sep 2011 05:55:24 -0700

Dear all,

I have a matrix that provides, for a series of data points, the probability
that each of these points belongs to a certain group.
Take the following example, which represents 20 data points and their
membership probabilities for five groups (A-E):


set.seed(1)
# 20 data points (rows) x 5 groups (columns); entries are membership probabilities
probs <- matrix(runif(100), nrow = 20,
                dimnames = list(NULL, c("A", "B", "C", "D", "E")))

In addition, I know how large each group should be.
Assume, for example, that the group sizes in the example above are
5, 4, 1, 6, 4 for A, B, C, D and E respectively.
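
For concreteness, the target sizes could be stored as a named vector and checked against the matrix. This is only a sketch; the name `sizes` is an assumption introduced here and is reused in no other part of the original message.

```r
set.seed(1)
probs <- matrix(runif(100), nrow = 20,
                dimnames = list(NULL, c("A", "B", "C", "D", "E")))

# Hypothetical name: target size of each group, in column order
sizes <- c(A = 5, B = 4, C = 1, D = 6, E = 4)

# The sizes must line up with the columns and account for every data point
stopifnot(identical(names(sizes), colnames(probs)),
          sum(sizes) == nrow(probs))
```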

I would like to allocate the data points to the groups so that
(a) each group has the size it is supposed to have and
(b) each data point ends up in a group to which it has a high probability
of belonging.
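
One possible way to make (a) and (b) precise, offered as an assumption rather than the only reading: write p_{ig} for the entry of probs in row i and column g, n_g for the required size of group g, and let x_{ig} be 1 if point i is placed in group g and 0 otherwise. Maximizing the summed probabilities is one choice of objective; summed log-probabilities would be another.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{i=1}^{20} \sum_{g \in \{A,\dots,E\}} p_{ig}\, x_{ig} \\
\text{s.t.}\quad & \sum_{g} x_{ig} = 1 \quad \text{for every data point } i, \\
& \sum_{i} x_{ig} = n_g \quad \text{for every group } g, \\
& x_{ig} \in \{0, 1\}.
\end{aligned}
```

The first constraint places each point in exactly one group; the second is requirement (a).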

For some data points this allocation is straightforward, because one group
membership probability is much larger than the others.
But for others, two or more probabilities are very similar, which means that
the data point could be allocated to either group.
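
A rough way to see which points are ambiguous is to look at the gap between the largest and second-largest probability in each row. The sketch below assumes an arbitrary cutoff of 0.1; the names `margin` and `ambiguous` are introduced here for illustration only.

```r
set.seed(1)
probs <- matrix(runif(100), nrow = 20,
                dimnames = list(NULL, c("A", "B", "C", "D", "E")))

# Column index of the most probable group for each data point
best <- max.col(probs, ties.method = "first")

# Sort each row in decreasing order (apply returns groups x points, so transpose)
sorted <- t(apply(probs, 1, sort, decreasing = TRUE))

# Gap between the best and the second-best probability
margin <- sorted[, 1] - sorted[, 2]

# Points whose two top groups are nearly tied (0.1 is an arbitrary cutoff)
ambiguous <- which(margin < 0.1)
```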

I guess it should be possible to write some iterative code or an
optimization routine that can do what I would like to do, but I do not know
how.
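
One iterative scheme that would satisfy (a) is a greedy pass: repeatedly take the largest remaining probability, assign that point to that group, and close the group once it is full. The sketch below is only that, a heuristic under stated assumptions; `greedy_allocate` and `sizes` are hypothetical names, and the result is not guaranteed to maximize the total probability.

```r
set.seed(1)
probs <- matrix(runif(100), nrow = 20,
                dimnames = list(NULL, c("A", "B", "C", "D", "E")))
sizes <- c(A = 5, B = 4, C = 1, D = 6, E = 4)  # hypothetical name

greedy_allocate <- function(probs, sizes) {
  stopifnot(length(sizes) == ncol(probs), sum(sizes) == nrow(probs))
  p <- probs                 # working copy; used-up cells are set to -Inf
  remaining <- sizes         # free slots left in each group
  group <- integer(nrow(probs))
  for (k in seq_len(nrow(probs))) {
    # Position (row, column) of the largest probability still available
    pos <- arrayInd(which.max(p), dim(p))
    i <- pos[1, 1]
    j <- pos[1, 2]
    group[i] <- j
    p[i, ] <- -Inf           # point i is now assigned
    remaining[j] <- remaining[j] - 1
    if (remaining[j] == 0) p[, j] <- -Inf  # group j is full
  }
  colnames(probs)[group]
}

alloc <- greedy_allocate(probs, sizes)
table(factor(alloc, levels = colnames(probs)))
```

Because early picks can block better combinations later, an exact answer would need a proper optimization routine; a transportation-problem solver such as `lp.transport` in the lpSolve package is one candidate.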

Does anyone have an idea how this could be done?

Thanks very much in advance,

Michael Haenlein
Associate Professor of Marketing
ESCP Europe
Paris, France
