There's been some talk lately of publishing good reference implementations of 
all the election methods we talk about in various forms of description. This 
inspired me to write up my thoughts on programming election methods. It's in 
the body of the message below, and also posted as a Google Doc
https://docs.google.com/document/d/1QZqBNVYJuNIIaFxE7IUB4nPZhodFHJu2JOPNy7oBiF8/edit?hl=en_US

--

Some thoughts I’ve had over the years of writing election code for simulations 
or counting of real votes.

I assume an object or context of some sort. Usually this works out to the two 
minimum basic functions:
ElectionObject.vote( data )
ElectionObject.getResults()

The vote() Function


voteSN( map of strings to numbers )
voteNN( map of numbers to numbers )
voteA( array of numbers )

The simplest and most unambiguous vote function API is to take a mapping from 
strings (choice names) to numbers (rankings or ratings). The API should be 
unambiguous about whether the numbers are rankings (1 best, 2 ok, 3 lesser, 
etc) or ratings (high better). I usually name functions “voteRatings” or 
“voteRankings” to reinforce this.

For efficiency of storage this often internally calls a voteNN function after 
mapping choice names to choice index numbers.

Another benefit of this API is that partial ballots completion is very easy to 
represent. A choice not marked on the ballot gets no membership in the mapping 
passed to the vote() function. There might be A through F on the ballot, but my 
vote might only be {A: 1, C: 2, E: 3}

Voting a string map also makes write-ins automatically represented. On the down 
side, I have had to rearrange some election method algorithms to be able to use 
this API and transparently accept write-in choices. Curious readers should read 
my Condorcet implementation which grows the NxN tally array as choices are 
encountered.

Often the simplest vote() function to implement is one that simply takes an 
array of numbers. The fixed length array corresponds to a fixed list of 
choices. I have used this form of implementation in C code for simulation of 
thousands of elections per second, counting hundreds of thousands or millions 
of votes per second.




Getting Results


getWinner()
getResults()
textResults()
htmlResults()
htmlExplain()

If code needs to do something based on the winner, then there should be code to 
return the winner(s). Mostly I’ve needed this in simulations where I need to 
count up statistics on who won and their characteristics, or to fill in a pixel 
in a Yee Election Space diagram.
Displaying results for people to look at is simpler. Just present the available 
data in a reasonable way. There is usually a simple table of candidate names 
and a score or count of votes they got. This can be returned programattically 
as a map from names to numbers and rendered by other code. For my uses I’ve 
found it useful to have each election method know how to render its own output 
to ascii-art tables or HTML tables. Furthermore an ‘explain’ method is a good 
idea. Multi-round methods such as IRV should show the internal state on each 
round. Condorcet can show the NxN tally table.

The getResults() function might do the final calculation on whatever votes have 
been submitted by vote() up to that point. I write it so that it caches the 
results inside the context object. Calls to vote() clear that cache. Repeated 
calls to getResults() and related methods use the cached version if available.

Adavced Methods: Summability

It’s possible to have a ‘summable’ method where votes could be added up into 
many groups and those group results further added together to get the final 
result. Condorcet, Approval, Plurality and others are this way. They don’t need 
to have the full set of ballot data in the end, just a summation will do. The 
method to achieve this would look like this:
ElectionObject.addSubElection(ElectionObject sub)

On the down side, it’s hard to implement this and get it right for some 
methods, and it’s not really needed. Any modern computer could hold all the 
votes for every person on earth. There is no election too big. So, it’s a neat 
trick, and mostly interesting to election algorithm nerds, but I probably won’t 
be implementing this more than I have so far … until the next time it would 
just be too cool and needs doing.

Data Formats

To go with my preferred representation of a vote as a mapping between choice 
names and numbers, there are a couple good ways of storing that kind of data.
My favorite by far is URL encoded. nameA=number1&nameB=number2&nameC=number3

On the up side, it is unambiguous and very hard to misinterpret. (The only 
possible misinterpretation is whether the numbers are ratings or rankings.) On 
the down side, it repeats the choice name a great deal and wastes some space. 
This can be mitigated by any common compression software such as gzip/zlib, 
bzip2 or zip. They all do a very good job of compressing frequently repeated 
substrings. Another option for compressing is to include an index of names and 
votes lookup into that index.
index line: NameA=1&nameB=2&nameC=3
vote line: 1=1&2=2&3=3

URL encoding is well understood and there are high quality libraries for it in 
every modern language.

Another easy format is comma separated value, or CSV. This is a common 
import/export format of spreadsheets and databases. There is one header row of 
choice names "name a,name b,name c", and then each vote row is just the 
ranking/rating values accordingly, "1,2,3". Non-voting for a choice has to be 
represented by an empty cell, ",,1"

I know Python has a good CSV library and most other languages should have 
something too.

There are some formats exported by actually deployed election systems used by 
places like the cities of San Francisco, CA and Burlington, VT. They are 
horrendously inefficient and inconvenient to parse, and I wind up writing a 
Python script to import from them and export URL encoded votes for my own 
software to use.
----
Election-Methods mailing list - see http://electorama.com/em for list info

Reply via email to