On 03/17/2013 06:32 PM, Richard Fobes wrote:
> On 3/15/2013 2:12 AM, Kristofer Munsterhjelm wrote:
>> On 03/14/2013 06:45 PM, robert bristow-johnson wrote:
>>> IRV will prevent a true spoiler (that is, a candidate with no viable
>>> chance of winning, but whose presence in the race changes who the
>>> winner is) from spoiling the election, but if the "spoiler" and the
>>> two leaders are all roughly equal going into the election, IRV can
>>> fail and *has* failed (and Burlington 2009 is that example).
>> If you think about it, even Plurality is immune to spoilers... if the
>> spoilers are small enough. More specifically, if the "spoilers" have
>> less support in total than the difference in support between party
>> number one and two, Plurality is immune to them.
>>
>> So instead of saying method X resists spoilers and Y doesn't, it seems
>> better to say that X resists larger spoilers than Y. And that raises
>> the question of how much spoiler-resistance you need. Plurality's
>> result is independent of very small spoilers. IRV's is independent of
>> somewhat larger spoilers, and Condorcet's larger still (through mutual
>> majority or independence of Smith-dominated alternatives, depending
>> on the method).
> This is a good example of the need to _quantify_ the failure rate for
> each election method for each "fairness" criterion.
>
> Just a yes-or-no checkmark -- which is the approach in the comparison
> table in the Wikipedia "Voting systems" article -- is not sufficient
> for a full comparison.
Spoiler resistance is to some degree already quantified. If a method
passes the majority criterion, then it's resistant to spoilers when a
party or candidate has a majority. A method that passes mutual majority
is resistant to spoilers outside a group that's ranked first by a
majority. Independence from Smith-dominated alternatives (ISDA) gives
resistance to spoilers not in the Smith set, and so on.
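
To make that hierarchy concrete, here's a minimal Python sketch of
computing the Smith set from ranked ballots. The ballot format and the
function name are my own invention, and it assumes complete strict
rankings:

    def smith_set(ballots, candidates):
        # wins[a][b] = number of voters ranking a above b;
        # each ballot is a complete ranking, best first.
        wins = {a: {b: 0 for b in candidates} for a in candidates}
        for ballot in ballots:
            for i, a in enumerate(ballot):
                for b in ballot[i + 1:]:
                    wins[a][b] += 1
        # a "reaches" b directly if a beats or ties b head-to-head
        reach = {a: {b: wins[a][b] >= wins[b][a] for b in candidates}
                 for a in candidates}
        # Transitive closure (Floyd-Warshall). The Smith set is exactly
        # the set of candidates that reach every other candidate.
        for k in candidates:
            for a in candidates:
                for b in candidates:
                    reach[a][b] = reach[a][b] or (reach[a][k] and reach[k][b])
        return {a for a in candidates
                if all(reach[a][b] for b in candidates if b != a)}

    # An A>B>C>A cycle with D pairwise-beaten by everybody:
    ballots = ([("A", "B", "C", "D")] * 4 + [("B", "C", "A", "D")] * 3
               + [("C", "A", "B", "D")] * 2)
    print(sorted(smith_set(ballots, "ABCD")))
    # ['A', 'B', 'C'] -- D can't spoil the result under an ISDA method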
But you have a point. In the practical view, these are only interesting
insofar as they cover enough to make the method resistant to spoilers
in general. That is, if an oracle told us that to get a multiparty
democracy where people don't think spoilers get in the way, all you
need is ISDA and everything else is icing on the cake, then we wouldn't
need to bother with anything more than ISDA. At least not unless the
voters would find it unfair *on principle* to have something that
didn't pass, say, independence from covered alternatives.
That's the division into three I've mentioned before: performance under
honesty; things that deter strategy or make it unnecessary, so that we
get to honesty in the first place; and consistency with itself (or,
more broadly, compliance with what the voters think must hold for the
method to be fair).
In all three cases, we have approximations.
Bayesian regret is an approximation to performance under honesty. It
holds if you assume certain things about what performance actually
means: how to do interpersonal comparisons, and utilitarianism[2].
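
For concreteness, here's a Monte Carlo sketch of that approximation.
The i.i.d. uniform utilities, the honest-plurality stand-in, and the
plain utilitarian sum are exactly the kind of assumptions I mean:

    import random

    def bayesian_regret(method, n_voters=99, n_cands=5, trials=2000):
        # Expected utility gap between the utilitarian-best candidate
        # and the method's winner, with honest voters and i.i.d.
        # uniform utilities.
        total = 0.0
        for _ in range(trials):
            # utils[v][c] = voter v's utility for candidate c
            utils = [[random.random() for _ in range(n_cands)]
                     for _ in range(n_voters)]
            social = [sum(u[c] for u in utils) for c in range(n_cands)]
            total += max(social) - social[method(utils)]
        return total / trials

    def honest_plurality(utils):
        # Every voter votes for their single highest-utility candidate.
        counts = [0] * len(utils[0])
        for u in utils:
            counts[u.index(max(u))] += 1
        return counts.index(max(counts))

    print(bayesian_regret(honest_plurality))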
Criteria are approximations to the other two. The good thing about
criteria is that they provide a bound. If I prove that a method passes
ISDA, then it passes ISDA outright. You don't have to worry that the
method only passes ISDA in the cases that are irrelevant to a real
election. If it passes ISDA, it passes ISDA *everywhere*[1]. And I
think that's why I try to make methods that pass many criteria: if they
pass some criterion X, then I can say "done" and move on without having
to quantify *where* they pass it. This saves a lot of detective work
determining whether the areas where they pass are the areas we care
about.
But beyond that, you're right. The approximations are not the real
things. They're proxies we use because they're easier to investigate.
And a method might seem to have contradictions when you look at every
possible ballot set, yet show none of them in the real world. For
instance, if people voted exclusively on a left-right scale, there
would always be a CW: with single-peaked preferences, the candidate
closest to the median voter beats every other candidate head-to-head.
So in those cases Condorcet passes later-no-harm, later-no-help, IIA,
and so on. We could even use Borda IRV there if that's what the people
would prefer; the various monotonicity failures wouldn't be a problem
because we'd never enter that part of the domain. And if we had some
way of knowing what level of spoiler resistance is enough (or
conversely, what isn't), then we could exclude a lot of methods for
either being too complex or for falling short of the mark.
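
The left-right claim is easy to check numerically. In a sketch like
the following (positions and counts picked arbitrarily), the
brute-force CW always turns out to be the candidate nearest the median
voter:

    import random

    def condorcet_winner(voters, cands):
        # Brute-force CW check for 1-D spatial preferences: voter v
        # prefers candidate a to b when a's position is closer to v's.
        prefers = lambda v, a, b: abs(v - cands[a]) < abs(v - cands[b])
        for a in range(len(cands)):
            if all(a == b or
                   2 * sum(prefers(v, a, b) for v in voters) > len(voters)
                   for b in range(len(cands))):
                return a
        return None

    random.seed(1)
    voters = [random.gauss(0, 1) for _ in range(101)]  # odd count, so
    cands = [random.uniform(-2, 2) for _ in range(6)]  # no exact ties
    median = sorted(voters)[len(voters) // 2]
    nearest = min(range(len(cands)), key=lambda c: abs(cands[c] - median))
    print(condorcet_winner(voters, cands) == nearest)
    # True -- the median voter theorem in action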
>> It's like reinforcing a bridge that would collapse when a cat walks
>> across it, so that it no longer does so, but it still collapses when
>> a person walks across it. Cat resistance is not enough :-)
> Great analogy. We need to start assessing _how_ _resistant_ each
> method is on each "fairness" criterion.
Yes, and these fairness criteria might not even be the same sort as the
traditional criteria. They might be more vague, like "spoiler
resistance", which a method fails when the voters complain as they did
in Burlington, and which would really be a meta-category covering
things like ISDA, mutual majority, and so on.
>> It would be really useful to know what level of resistance is enough,
>> but that data is going to be hard to gather.[...]

> Indeed, that is difficult.
Perhaps one could run mock elections in some way, or a game where
candidates distribute benefits to certain groups of voters.
Polls are reasonably good at showing behavior under honesty, I think.
But one may object that they don't show adaptation to the system in
question. Both MO and David Wetzell have used arguments of this sort,
and I think there's something to it. Consider the Range polls on
rangevoting's site: they show broad engagement and real variety, with
voters using rating values besides min and max. On the other hand,
consider YouTube, which moved from Range-style voting to
Approval-style. They presumably did so because people voted only min
and max, although I don't know that for sure. If they did, it shows
that the YouTubers adapted to Range and started voting min and max.
A game or series of mock elections would have the advantage of
including that adaptation element. However, the pressure might not be
right. It could induce too much strategy (if the game is set up so
candidates can only distribute power after each election, making it
maximally patronage-like). It could also induce too little. More
generally, we wouldn't know if we hit the realistic level. There would
be no oracle we could ask that would say "yes, with these rules, the
people will engage in just as much strategy as they would in a real
election". Still, it would be better than nothing, and we might be able
to derive bounds from it. (If, in the most patronage-based, most
zero-sum variant, people still don't massively bury, then we know they
won't in a real election, since the real election will be more "kind".
Conversely, if the voters engaged in massive burial even in the most
cooperative scenario, then we know burial will be a problem in real
elections too.)
>> And beyond that we have even harder questions of how much resistance
>> is needed to get a democratic system that works well. It seems
>> reasonable to me that advanced Condorcet will do, but praxeology
>> can only go so far. If only we had actual experimental data!
> My VoteFair site collects lots of data. I have used it to verify that
> VoteFair ranking accomplishes what it was designed to do. Not only
> has such testing been useful for refining the code for the
> single-winner portion (VoteFair popularity ranking, which is
> equivalent to the Condorcet-Kemeny method), but such testing has
> revealed that VoteFair representation ranking (which can be thought
> of as a two-seats-at-a-time PR method) also works as intended.
>
> As for praxeology ("the study of human conduct"), I also watch to see
> how people try to vote strategically. The attempts are interesting,
> but ineffective.
>
> I agree that using better ballots and better vote-counting methods in
> real situations -- using real data -- is essential for making real
> progress.
Could we use the polling data to get some information about, say,
candidate variety? I think we could, at least to some extent. We could
ask something like "how many elections with more than 20 voters have no
CW?". I think you published stats like that once, but I don't remember
what the values were.
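
Given the ballots as complete rankings, the statistic itself is
straightforward to compute. This sketch assumes a list of
(ballots, candidates) pairs, which is surely not VoteFair's actual
storage format:

    def has_cw(ballots, candidates):
        # True if some candidate beats every other head-to-head;
        # ballots are complete rankings, best first.
        def beats(a, b):
            margin = sum(1 if ballot.index(a) < ballot.index(b) else -1
                         for ballot in ballots)
            return margin > 0
        return any(all(a == b or beats(a, b) for b in candidates)
                   for a in candidates)

    def cw_failure_rate(elections, min_voters=20):
        # Share of sufficiently large elections that lack a CW.
        big = [(b, c) for b, c in elections if len(b) > min_voters]
        return sum(not has_cw(b, c) for b, c in big) / len(big)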
Perhaps you could also ask the voters some time later if they were
satisfied with the choice. That kind of "later polling" could uncover
Burlington-type breakdowns if there were any. If they could rank the
options in retrospect, it would also be possible to determine whether
they would have been satisfied with, say, IRV; but I imagine that's too
much to ask.
-
[1] There are still assumptions about the input, of course. If
everybody goes on a burial spree, the Smith set may not mean what we
think it means anymore, and then ISDA would no longer hold. Same thing
with Majority Judgement and IIA: if people vote in a comparative manner
rather than grading each candidate on an absolute scale, IIA no longer
holds for that method.
[2] I have also suggested another approximation for performance, but I
haven't made code to implement it. It's the "games AI" approximation:
you take a bunch of different game AIs (say, chess programs) and run
their suggestions through a voting method, creating a "collective AI".
The better the collective AI performs, the better the method. This
could even be done in a "world champion vs. the world" type match,
where the individual "AIs" are human players. That would be a better
metric than just using AIs, since the various advisors could then make
suggestions and influence the vote in ways that might improve play.
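
The harness itself could be very simple. In this sketch, Borda is just
a placeholder for whichever method is under test, and the engines and
move lists are made up:

    from collections import defaultdict

    def collective_move(rankings):
        # Combine several engines' ranked move suggestions with a
        # Borda count; swap in the method under test here.
        scores = defaultdict(int)
        for ranking in rankings:  # each: list of moves, best first
            for place, move in enumerate(ranking):
                scores[move] += len(ranking) - 1 - place
        return max(scores, key=scores.get)

    # Three hypothetical engines ranking the same three moves:
    print(collective_move([["e4", "d4", "c4"],
                           ["d4", "e4", "c4"],
                           ["e4", "c4", "d4"]]))  # e4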