Re: [EM] Monroe's Clustering Method (was PR methods and Quotas)

2011-07-28 Thread Kristofer Munsterhjelm

Andy Jennings wrote:

On Tue, Jul 26, 2011 at 12:03 AM, Kristofer Munsterhjelm wrote:

Andy Jennings wrote:


If you want a clustering PR method, then I would highly
recommend Monroe.  In Monroe, the score of each slate is equal
to the sum of each voter's score of his assigned candidate.  I
think that, for a given slate, finding the optimal way of
assigning the voters to the winners is polynomial.  (See
Warren's page for more information:
http://rangevoting.org/__MonroeMW.html
http://rangevoting.org/MonroeMW.html)

I think this is the method I called CFC-Range, for continuous
forced clustering, Range.


Thanks for the clarification.  I remember looking through your list and 
not having time to understand all of the methods you simulated.  I even 
wondered if some of them were methods I knew by a different name.


Most likely :-) If there are any you wonder what means, just ask.


This method assigns voters, or fractional voters (so we can handle
weighted votes) to one of a number of clusters, where each cluster
is assigned to a candidate, so as to maximize the score that is the
sum of scores to the assigned candidates in each cluster. This is
done subject to the constraint that each cluster must have the same
weight (sum of votes' weights), so one doesn't end up with one
cluster representing 99% of the people and others less than 1%. Then
it tries each possible slate to find the one that has a max sum, and
that one wins.


Another option, instead of using fractional voters, is to make sure each 
winner is assigned either floor(N/W) voters or ceil(N/W) voters, where N 
is the number of voters and W is the number of winners.


It seems reasonable to me that if you multiply the number of voters (or 
the weight of every ballot) by some positive value p, the outcome should 
not change. That is the case for a fractional system, but not for an 
integer one, since when you multiply the weights, the ballots can then 
be divided more finely.


Of course, others disagree. If I remember correctly, the whole thing 
that makes Young (not Kemeny, but the remove as few as possible to make 
a CW method) is the integrality constraint.



I do that assignment per slate by using linear programming, but I
guess it could be done faster by more specialized algorithms, as
Warren states. It might also be the case that Monroe's original
integer programming formulation will relax easily to linear
programming and so in practice be solvable quickly, similarly to how
I usually can find the Kemeny and Dodgson winners quickly.


I think linear programming is fine for the fractional solution, but 
Warren prefers the integral solution, so he prefers using the 
constrained-degree-subgraph problem and solutions, though you can do the 
same thing with integer programming.


I did some experiments at finding the optimal slate, for both the 
fractional case and the integral case, using a general integer 
programming package.  One thing I noticed was that the solution to the 
fractional case only divided up as many voters as was necessary, no 
more.  Almost all of the voters were assigned in whole to one of the 
candidates.  Did you notice anything similar?  This makes it seem like 
the solution to the fractional problem might be an excellent starting 
point for finding the integral solution.


I haven't checked the two, but I know that for the Young and Kemeny 
methods, the linear programming solution is often exact. The same thing 
has been seen with other problems as well: there are easy instances of 
NP-complete problems. I think some integer programming libraries solve 
the linear programming relaxation first, and then uses this as a basis 
for branch-and-bound/etc.



I also made a Kemeny version of this - actually, the Kemeny version
was the one I tried first. Here, one assigns orderings (rankings of
candidates) to each cluster, and then allocates voters to the
different clusters to maximize the Kemeny score of the assigned
ordering wrt the cluster in question. The orderings are constrained
so that there is a different first place candidate for each
cluster's ordering, and then the candidates at first place of the
orderings whose Kemeny scores' sums are maximum, win.
However, this is *very* slow, because now one doesn't only have to
find the right slate, but the right combination of orderings. Thus,
it's impractical, though it seems to work to some degree as a
multiwinner method (were it not so slow...).
Therefore, I've been interested in Condorcet methods that give
reasonable results under a variant of margins where the X against Y
score is always (number of votes where X is above Y) - (number of
voters where Y is above X), not either this or 0, whichever is more.
If I could find such a method, it might be incorporable into linear

Re: [EM] Monroe's Clustering Method (was PR methods and Quotas)

2011-07-26 Thread Kristofer Munsterhjelm

Andy Jennings wrote:

Kristofer Munsterhjelm wrote:

Andy Jennings wrote:

Like Jameson and Toby, I have spent some time thinking about how
to make a median-based PR system.

The system I came up with is similar to Jameson's, but simpler,
and uses the Hare quota!


How about clustering logic? Say you have an electorate of n voters,
and you want k seats. The method would be combinatorial: you'd check
a prospective slate. Say the slate is {ABC...}. Then that means you
make a group of n/k voters and assign A to this gorup, another group
of n/k other voters and assign B to that group, and so on.
The score of each slate is equal to the sum of the median scores for
each assigned candidate, when considering only the voters in the
assigned candidate's group.


Kristofer,

If you want a clustering PR method, then I would highly recommend 
Monroe.  In Monroe, the score of each slate is equal to the sum of each 
voter's score of his assigned candidate.  I think that, for a given 
slate, finding the optimal way of assigning the voters to the winners is 
polynomial.  (See Warren's page for more 
information: http://rangevoting.org/MonroeMW.html)


I consider it to be a very promising method.


I think this is the method I called CFC-Range, for continuous forced 
clustering, Range. This method assigns voters, or fractional voters (so 
we can handle weighted votes) to one of a number of clusters, where each 
cluster is assigned to a candidate, so as to maximize the score that is 
the sum of scores to the assigned candidates in each cluster. This is 
done subject to the constraint that each cluster must have the same 
weight (sum of votes' weights), so one doesn't end up with one cluster 
representing 99% of the people and others less than 1%. Then it tries 
each possible slate to find the one that has a max sum, and that one wins.


As an example, say that there are three clusters, and they're assigned 
to A, B, and C, respectively. Then the method assigns voters (or 
fractional voters) to cluster #1, #2, and #3 so as to maximize [the 
score for A in the first cluster plus the score for B in the second 
cluster plus the score for C in the third cluster].


I do that assignment per slate by using linear programming, but I guess 
it could be done faster by more specialized algorithms, as Warren 
states. It might also be the case that Monroe's original integer 
programming formulation will relax easily to linear programming and so 
in practice be solvable quickly, similarly to how I usually can find the 
Kemeny and Dodgson winners quickly.


I also made a Kemeny version of this - actually, the Kemeny version was 
the one I tried first. Here, one assigns orderings (rankings of 
candidates) to each cluster, and then allocates voters to the different 
clusters to maximize the Kemeny score of the assigned ordering wrt the 
cluster in question. The orderings are constrained so that there is a 
different first place candidate for each cluster's ordering, and then 
the candidates at first place of the orderings whose Kemeny scores' sums 
are maximum, win.
However, this is *very* slow, because now one doesn't only have to find 
the right slate, but the right combination of orderings. Thus, it's 
impractical, though it seems to work to some degree as a multiwinner 
method (were it not so slow...).
Therefore, I've been interested in Condorcet methods that give 
reasonable results under a variant of margins where the X against Y 
score is always (number of votes where X is above Y) - (number of voters 
where Y is above X), not either this or 0, whichever is more. If I could 
find such a method, it might be incorporable into linear programming and 
then I could make a clustering Condorcet method that would only be 
per-slate combinatorial instead of worse.


If I recall, when you did your PR simulations and graphed them on two 
axes (total satisfaction vs. proportionality), you said you weren't 
completely happy with your metric for proportionality.  I think you 
should consider using Monroe's metric.  That is, no matter how a given 
method chooses it's slate, you run the algorithm on that slate to 
optimally assign the voters to that slate and then you compute the sum 
of the voters' grades of their assigned candidates.


Yes, I could do that. I could do that for any combinatorial method, 
actually. Just like different Condorcet methods can be phrased as find 
the winner while minimizing the influence of noise type X (where X 
depends on the method), the combinatorial methods could also be 
considered find the best slate according to logic Y. There is a side 
effect, though, and that is that doing so will turn the combinatorial 
optimization method perfect (because it optimizes according to that 
logic). Because I don't want to give any method an automatic advantage, 
I would prefer the metric to be indirect - to measure something related 
to proportionality rather 

Re: [EM] Monroe's Clustering Method (was PR methods and Quotas)

2011-07-26 Thread Andy Jennings
On Tue, Jul 26, 2011 at 12:03 AM, Kristofer Munsterhjelm wrote:

 Andy Jennings wrote:


 If you want a clustering PR method, then I would highly recommend Monroe.
  In Monroe, the score of each slate is equal to the sum of each voter's
 score of his assigned candidate.  I think that, for a given slate, finding
 the optimal way of assigning the voters to the winners is polynomial.  (See
 Warren's page for more information: http://rangevoting.org/**
 MonroeMW.html http://rangevoting.org/MonroeMW.html)

 I think this is the method I called CFC-Range, for continuous forced
 clustering, Range.


Thanks for the clarification.  I remember looking through your list and not
having time to understand all of the methods you simulated.  I even wondered
if some of them were methods I knew by a different name.


 This method assigns voters, or fractional voters (so we can handle weighted
 votes) to one of a number of clusters, where each cluster is assigned to a
 candidate, so as to maximize the score that is the sum of scores to the
 assigned candidates in each cluster. This is done subject to the constraint
 that each cluster must have the same weight (sum of votes' weights), so one
 doesn't end up with one cluster representing 99% of the people and others
 less than 1%. Then it tries each possible slate to find the one that has a
 max sum, and that one wins.


Another option, instead of using fractional voters, is to make sure each
winner is assigned either floor(N/W) voters or ceil(N/W) voters, where N is
the number of voters and W is the number of winners.


 I do that assignment per slate by using linear programming, but I guess it
 could be done faster by more specialized algorithms, as Warren states. It
 might also be the case that Monroe's original integer programming
 formulation will relax easily to linear programming and so in practice be
 solvable quickly, similarly to how I usually can find the Kemeny and Dodgson
 winners quickly.


I think linear programming is fine for the fractional solution, but Warren
prefers the integral solution, so he prefers using the
constrained-degree-subgraph problem and solutions, though you can do the
same thing with integer programming.

I did some experiments at finding the optimal slate, for both the fractional
case and the integral case, using a general integer programming package.
 One thing I noticed was that the solution to the fractional case only
divided up as many voters as was necessary, no more.  Almost all of the
voters were assigned in whole to one of the candidates.  Did you notice
anything similar?  This makes it seem like the solution to the fractional
problem might be an excellent starting point for finding the integral
solution.


 I also made a Kemeny version of this - actually, the Kemeny version was the
 one I tried first. Here, one assigns orderings (rankings of candidates) to
 each cluster, and then allocates voters to the different clusters to
 maximize the Kemeny score of the assigned ordering wrt the cluster in
 question. The orderings are constrained so that there is a different first
 place candidate for each cluster's ordering, and then the candidates at
 first place of the orderings whose Kemeny scores' sums are maximum, win.
 However, this is *very* slow, because now one doesn't only have to find the
 right slate, but the right combination of orderings. Thus, it's impractical,
 though it seems to work to some degree as a multiwinner method (were it not
 so slow...).
 Therefore, I've been interested in Condorcet methods that give reasonable
 results under a variant of margins where the X against Y score is always
 (number of votes where X is above Y) - (number of voters where Y is above
 X), not either this or 0, whichever is more. If I could find such a method,
 it might be incorporable into linear programming and then I could make a
 clustering Condorcet method that would only be per-slate combinatorial
 instead of worse.


Is there some way in which the CFC-Kemeny method is better than
CFC-Range/Monroe?  It doesn't seem that useful to use the Kemeny scores that
the voters give to the ranking for the cluster they appear in only to
discard the entire ranking except for the first candidate.

- Andy

Election-Methods mailing list - see http://electorama.com/em for list info