[EM] Fast Condorcet-Kemeny calculation times, clarification of NP-hardness issue

2012-03-04 Thread Richard Fobes
Finally, after reading the articles cited by Warren Smith (listed at the 
bottom of this reply) plus some related articles, I can reply to his 
insistence that Condorcet-Kemeny calculations take too long to 
calculate.  Also, this reply addresses the same claim that appears in 
Wikipedia both in the "Kemeny-Young method" article and in the 
comparison table within the Wikipedia "Voting systems" article (in the 
"polynomial time" column that Markus Schulze added).


One source of confusion is that Warren, and perhaps others, regard the 
Condorcet-Kemeny problem as a "decision problem" that only has a "yes" 
or "no" answer.  This view is suggested by Warren's reference (below and 
in other messages) to the problem as being NP-complete, which only 
applies to decision problems.  Although it is possible to formulate a 
decision problem based on one or more specified characteristics of the 
Condorcet-Kemeny method, that is a different problem than the 
Condorcet-Kemeny problem.


In the real world of elections, the Condorcet-Kemeny problem is to 
calculate a ranking of all choices (e.g. candidates) that maximizes the 
sequence score (or minimizes the "Kemeny score").


Clearly the Condorcet-Kemeny problem is an optimization problem, not a 
decision problem (and not a search problem).  It is an optimization 
problem because we have a way to measure how closely the solution 
reaches its goal.


(For contrast, consider the NP-hard "subset sum problem" in which the 
goal is to determine whether a specified list of integers contains a 
subset that can be added and/or subtracted to yield zero.  Any subset 
either sums to zero or it doesn't sum to zero.  This makes it easy to 
formulate the related decision (yes/no) problem that asks whether such a 
subset exists for a given set of numbers.)


Because the Condorcet-Kemeny problem is an optimization problem, the 
solution to the Condorcet-Kemeny problem can be an approximation.  If 
this approach is used, it becomes relevant to ask how closely the 
approximation reaches the ranking that has the highest sequence score. 
Yet even this question -- of "how close?" -- is not a decision problem 
(because it goes beyond a yes or no answer).


Keeping in mind that VoteFair popularity ranking calculations are 
mathematically equivalent to the Condorcet-Kemeny method, my claim is 
that VoteFair popularity ranking calculations yield, at the least, the 
same top-ranked choice, and the same few top-ranked choices, as the 
solution produced by examining every sequence score -- except (and this 
is the important part) in cases where the voter preferences are so 
convoluted that any top-ranked choice and any few top-ranked choices 
would be controversial.  As one academic paper elegantly put it: 
"garbage in, garbage out".


More specifically, here is a set of claims that more rigorously state 
the above ambiguous claim.


Claim 1: For _some_ _instances_, a polynomial-time calculation can 
identify the full ranking that produces the highest Condorcet-Kemeny 
sequence score.


Claim 2: For _some_ _instances_, a polynomial-time calculation can rank 
the top most-popular candidates/choices and this partial ranking will be 
the same as the top portion of the full ranking as determined by 
identifying the highest Condorcet-Kemeny sequence score.


Claim 3: For the _remaining_ _instances_ (not covered in Claims 1 and 
2), an approximation of the full Condorcet-Kemeny ranking can be 
calculated in polynomial time.


Claim 4: For any cases in which the top-ranked candidate/choice 
according to the VoteFair popularity ranking algorithm differs from the 
top-ranked candidate/choice according to a full calculation of all 
sequence scores, the outcome of a runoff election between the two 
candidates/choices would be difficult to predict.


As done in the academic literature, I am excluding the cases in which 
more than one sequence has the same highest sequence score.


To help clarify the validity of these claims, I'll use an analogy.

Consider a special case of the rigorously studied Traveling Salesman 
Problem (TSP), which is NP-hard to solve.  (The TSP also can be 
expressed as a decision problem, in which case the decision problem is 
NP-complete, but that variation is not the problem discussed here.)


The special case -- which I will refer to as the non-returning Traveling 
Salesman Problem -- is that we want to know which city the salesman 
visits first, and we want to know, with successively less interest, 
which city the salesman visits second, third, and so on.  Additionally, 
for this special case, we specify that the cities to be visited are 
roughly located between a beginning point "B" and and ending point "E".


To make this special case mathematically equivalent to the normal 
Traveling Salesman Problem in which the salesman returns to the starting 
city, we create a path of closely spaced cities (labeled "+" below) that 
lead back to the starting city "B".


Here is a diagram of

Re: [EM] Fast Condorcet-Kemeny calculation times, clarification of NP-hardness issue

2012-03-04 Thread Warren Smith
On Sun, Mar 4, 2012 at 3:44 PM, Richard Fobes
 wrote:
> Finally, after reading the articles cited by Warren Smith (listed at the
> bottom of this reply) plus some related articles, I can reply to his
> insistence that Condorcet-Kemeny calculations take too long to calculate.
>  Also, this reply addresses the same claim that appears in Wikipedia both in
> the "Kemeny-Young method" article and in the comparison table within the
> Wikipedia "Voting systems" article (in the "polynomial time" column that
> Markus Schulze added).
>
> One source of confusion is that Warren, and perhaps others, regard the
> Condorcet-Kemeny problem as a "decision problem" that only has a "yes" or
> "no" answer.  This view is suggested by Warren's reference (below and in
> other messages) to the problem as being NP-complete, which only applies to
> decision problems.  Although it is possible to formulate a decision problem
> based on one or more specified characteristics of the Condorcet-Kemeny
> method, that is a different problem than the Condorcet-Kemeny problem.

--the optimization problem is at least as hard as the decision
problem.You are erroneously creating the impression I somehow
was unaware of this, or that you somehow have here got some new
insight.  Neither is true.



> In the real world of elections, the Condorcet-Kemeny problem is to calculate
> a ranking of all choices (e.g. candidates) that maximizes the sequence score
> (or minimizes the "Kemeny score").
>
> Clearly the Condorcet-Kemeny problem is an optimization problem, not a
> decision problem (and not a search problem).  It is an optimization problem
> because we have a way to measure how closely the solution reaches its goal.
>
> (For contrast, consider the NP-hard "subset sum problem" in which the goal
> is to determine whether a specified list of integers contains a subset that
> can be added and/or subtracted to yield zero.  Any subset either sums to
> zero or it doesn't sum to zero.  This makes it easy to formulate the related
> decision (yes/no) problem that asks whether such a subset exists for a given
> set of numbers.)




> Because the Condorcet-Kemeny problem is an optimization problem, the
> solution to the Condorcet-Kemeny problem can be an approximation.  If this
> approach is used, it becomes relevant to ask how closely the approximation
> reaches the ranking that has the highest sequence score. Yet even this
> question -- of "how close?" -- is not a decision problem (because it goes
> beyond a yes or no answer).
>
> Keeping in mind that VoteFair popularity ranking calculations are
> mathematically equivalent to the Condorcet-Kemeny method, my claim is that
> VoteFair popularity ranking calculations yield, at the least, the same
> top-ranked choice, and the same few top-ranked choices, as the solution
> produced by examining every sequence score -- except (and this is the
> important part) in cases where the voter preferences are so convoluted that
> any top-ranked choice and any few top-ranked choices would be controversial.
>  As one academic paper elegantly put it: "garbage in, garbage out".
>
> More specifically, here is a set of claims that more rigorously state the
> above ambiguous claim.
>
> Claim 1: For _some_ _instances_, a polynomial-time calculation can identify
> the full ranking that produces the highest Condorcet-Kemeny sequence score.

--oh whoo-whee.   Here's another claim: for SOME planets, I can
readily find a million dollars in gold piled up right next to me.

> Claim 2: For _some_ _instances_, a polynomial-time calculation can rank the
> top most-popular candidates/choices and this partial ranking will be the
> same as the top portion of the full ranking as determined by identifying the
> highest Condorcet-Kemeny sequence score.
>
> Claim 3: For the _remaining_ _instances_ (not covered in Claims 1 and 2), an
> approximation of the full Condorcet-Kemeny ranking can be calculated in
> polynomial time.

--what kind of "approximation"?  I can find an "approximation" to
a million dollars in gold, namely, 1 penny.

> Claim 4: For any cases in which the top-ranked candidate/choice according to
> the VoteFair popularity ranking algorithm differs from the top-ranked
> candidate/choice according to a full calculation of all sequence scores, the
> outcome of a runoff election between the two candidates/choices would be
> difficult to predict.
>
> As done in the academic literature, I am excluding the cases in which more
> than one sequence has the same highest sequence score.

--I'm not sure what that meant, but it sounds like garbage too.

> To help clarify the validity of these claims, I'll use an analogy.
>
> Consider a special case of the rigorously studied Traveling Salesman Problem
> (TSP), which is NP-hard to solve.  (The TSP also can be expressed as a
> decision problem, in which case the decision problem is NP-complete, but
> that variation is not the problem discussed here.)
>
> The special case -- which I will refer to as t

Re: [EM] Fast Condorcet-Kemeny calculation times, clarification of NP-hardness issue

2012-03-04 Thread Andrew Myers

On 3/4/12 5:44 PM, Warren Smith wrote:

On Sun, Mar 4, 2012 at 3:44 PM, Richard Fobes
  wrote:

Finally, after reading the articles cited by Warren Smith (listed at the
bottom of this reply) plus some related articles, I can reply to his
insistence that Condorcet-Kemeny calculations take too long to calculate.
  Also, this reply addresses the same claim that appears in Wikipedia both in
the "Kemeny-Young method" article and in the comparison table within the
Wikipedia "Voting systems" article (in the "polynomial time" column that
Markus Schulze added).

One source of confusion is that Warren, and perhaps others, regard the
Condorcet-Kemeny problem as a "decision problem" that only has a "yes" or
"no" answer.  This view is suggested by Warren's reference (below and in
other messages) to the problem as being NP-complete, which only applies to
decision problems.  Although it is possible to formulate a decision problem
based on one or more specified characteristics of the Condorcet-Kemeny
method, that is a different problem than the Condorcet-Kemeny problem.

--the optimization problem is at least as hard as the decision
problem.You are erroneously creating the impression I somehow
was unaware of this, or that you somehow have here got some new
insight.  Neither is true.
I might try to say all this in a more friendly way than Warren does, but 
he is 100% right about all the technical issues here. This is basic 
computer science. Nothing fancy and no judgment calls are involved.


-- Andrew

<>
Election-Methods mailing list - see http://electorama.com/em for list info


Re: [EM] Fast Condorcet-Kemeny calculation times, clarification of NP-hardness issue

2012-03-04 Thread Jameson Quinn
Remember that K-Y is Condorcet compliant, and I believe consistent as well
(that is, you can generalize from Condorcet to Smith and even ignore the
effects of the candidates outside the Smith set). So the set of cases where
the problem is easy is at least somewhat well-defined in that sense, and
probably includes (for instance) all the elections I've ever voted in. And
so I believe that Richard's "my algorithm usually gets it right" is not
totally content-free.

But Warren is also very right that, when things start to go wrong, it's
impossible to know whether you have the right answer, the almost-right
answer, or something totally wrong. And if some algorithm said "Jameson's
favorite is probably not the winner, though I can't prove it", I don't
think I'd be very inclined to accept that "probably"on Richard's say-so.
Part of the whole point of democracy is to guarantee that there will be a
clear winner with some legitimacy, since a civil war is usually a lot worse
than the second-best option.

Jameson

2012/3/4 Andrew Myers 

> On 3/4/12 5:44 PM, Warren Smith wrote:
>
>> On Sun, Mar 4, 2012 at 3:44 PM, Richard Fobes
>>   wrote:
>>
>>> Finally, after reading the articles cited by Warren Smith (listed at the
>>> bottom of this reply) plus some related articles, I can reply to his
>>> insistence that Condorcet-Kemeny calculations take too long to calculate.
>>>  Also, this reply addresses the same claim that appears in Wikipedia
>>> both in
>>> the "Kemeny-Young method" article and in the comparison table within the
>>> Wikipedia "Voting systems" article (in the "polynomial time" column that
>>> Markus Schulze added).
>>>
>>> One source of confusion is that Warren, and perhaps others, regard the
>>> Condorcet-Kemeny problem as a "decision problem" that only has a "yes" or
>>> "no" answer.  This view is suggested by Warren's reference (below and in
>>> other messages) to the problem as being NP-complete, which only applies
>>> to
>>> decision problems.  Although it is possible to formulate a decision
>>> problem
>>> based on one or more specified characteristics of the Condorcet-Kemeny
>>> method, that is a different problem than the Condorcet-Kemeny problem.
>>>
>> --the optimization problem is at least as hard as the decision
>> problem.You are erroneously creating the impression I somehow
>> was unaware of this, or that you somehow have here got some new
>> insight.  Neither is true.
>>
> I might try to say all this in a more friendly way than Warren does, but
> he is 100% right about all the technical issues here. This is basic
> computer science. Nothing fancy and no judgment calls are involved.
>
> -- Andrew
>
>
> 
> Election-Methods mailing list - see http://electorama.com/em for list info
>
>

Election-Methods mailing list - see http://electorama.com/em for list info


Re: [EM] Fast Condorcet-Kemeny calculation times, clarification of NP-hardness issue

2012-03-04 Thread Kristofer Munsterhjelm

On 03/05/2012 02:49 AM, Jameson Quinn wrote:

Remember that K-Y is Condorcet compliant, and I believe consistent as
well (that is, you can generalize from Condorcet to Smith and even
ignore the effects of the candidates outside the Smith set). So the set
of cases where the problem is easy is at least somewhat well-defined in
that sense, and probably includes (for instance) all the elections I've
ever voted in. And so I believe that Richard's "my algorithm usually
gets it right" is not totally content-free.

But Warren is also very right that, when things start to go wrong, it's
impossible to know whether you have the right answer, the almost-right
answer, or something totally wrong. And if some algorithm said
"Jameson's favorite is probably not the winner, though I can't prove
it", I don't think I'd be very inclined to accept that "probably"on
Richard's say-so. Part of the whole point of democracy is to guarantee
that there will be a clear winner with some legitimacy, since a civil
war is usually a lot worse than the second-best option.


It seems a very simple way exists to clear all of this up. Namely: take 
Fobes's algorithm (if it's been implemented somewhere). Also take an 
algorithm known to give the Kemeny ordering, if NP-hard -- e.g. 
something like the integer linear programming formulation my voting 
simulator uses. Then run them with millions of random ballot set 
instances. If there's at least one ballot set where the "known good" 
algorithm gives an ordering with a better Kemeny score (sum of number of 
voters agreeing with the pairwise contests), then we're done, that 
proves that Fobes's algorithm is not Kemeny. If there are none, then 
that's weird -- it would give a statement of "Fobes's algorithm usually 
agrees with Kemeny when there are less than n candidates and v voters" 
-- and if n and v are small enough, it can even be exhaustively checked.



Election-Methods mailing list - see http://electorama.com/em for list info