2012/7/6 Matthew Somerville <[email protected]>:
>
> I imagine Wikipedia will have accurate information for MP party membership.
> If you collated the information (preferably indexed by the person ID that we
> use on the site for ease of matching), then we can get it added easily
> enough; email it to the address given on the contact page.
>
If it's any help: Wikipedia doesn't have a single category "MPs", but it does have a category for each Parliament of the UK. You can see a list of the categories at:

http://dbpedia.org/page/Category:MPs_of_the_United_Kingdom_House_of_Commons,_by_Parliament

You could query over all the categories, but working Parliament by Parliament is probably safer.

DBpedia is very easy to use. For example, the following SPARQL query gets you all the MP names (as listed in Hansard) and their party affiliations for the 2001-2005 Parliament:

    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
    PREFIX dbpprop: <http://dbpedia.org/property/>

    SELECT ?hansard ?party WHERE {
        ?mp dcterms:subject <http://dbpedia.org/resource/Category:UK_MPs_2001%E2%80%932005> .
        ?mp dbpedia-owl:party ?party .
        ?mp dbpprop:hansard ?hansard .
    }

(I'm a SPARQL novice, so I don't know off-hand how to make the category name look nicer than that.)

There are zillions of issues that would then need to be sorted out, e.g.:

- whether the hansard property is the right one;
- what to do about multiple matches (in the above query you get one line for each distinct (MP, party) pair, so Bob Russell comes up three times, of course);
- how to link these with MP IDs from theyworkforyou (which has a nice JSON API, so it shouldn't be too hard);

and so on. There's also bound to be lots of missing data. A few OPTIONALs might be needed for older MPs. I leave that as an exercise for the reader :-).

Attached is a Python 3 program that I used to test the above query. For Python 2.7 you'd need to mess around with the urllib calls, and all versions of Python (even post 3) seem to disagree about whether things should return unicode or not, but suitable encode/decode calls liberally thrown around should sort that out.

--
Francis Davey
Attachment: mp.py (binary data)
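
The attached mp.py is not reproduced in this archive. What follows is only a rough Python 3 sketch of the kind of program the message describes, not the original attachment. It assumes the public DBpedia SPARQL endpoint at http://dbpedia.org/sparql and that the endpoint accepts a `format` parameter asking for JSON results; both are assumptions, as are the hypothetical function names `build_url`, `group_parties` and `fetch_parties`. The grouping step is one possible way to handle the "multiple matches" issue, collapsing the one-row-per-(MP, party) output into a single entry per MP.

```python
import json
import urllib.parse
import urllib.request
from collections import defaultdict

# Assumption: the public DBpedia SPARQL endpoint.
ENDPOINT = "http://dbpedia.org/sparql"

# Category URI taken from the query in the message.
CATEGORY_2001_2005 = (
    "http://dbpedia.org/resource/Category:UK_MPs_2001%E2%80%932005"
)

# Same query as in the message; braces are doubled for str.format().
QUERY_TEMPLATE = """
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>
SELECT ?hansard ?party WHERE {{
    ?mp dcterms:subject <{category}> .
    ?mp dbpedia-owl:party ?party .
    ?mp dbpprop:hansard ?hansard .
}}
"""


def build_url(category_uri, endpoint=ENDPOINT):
    """Return a GET URL that runs the query for one Parliament's category."""
    query = QUERY_TEMPLATE.format(category=category_uri)
    params = urllib.parse.urlencode({
        "query": query,
        # Assumption: the endpoint honours this parameter for JSON output.
        "format": "application/sparql-results+json",
    })
    return endpoint + "?" + params


def group_parties(results):
    """Collapse SPARQL JSON rows into {hansard name: sorted list of parties}."""
    parties = defaultdict(set)
    for row in results["results"]["bindings"]:
        parties[row["hansard"]["value"]].add(row["party"]["value"])
    return {name: sorted(values) for name, values in parties.items()}


def fetch_parties(category_uri):
    """Run the query over the network and return the grouped result."""
    with urllib.request.urlopen(build_url(category_uri)) as response:
        # Decode explicitly so the result is text regardless of defaults.
        data = json.loads(response.read().decode("utf-8"))
    return group_parties(data)
```

Calling `fetch_parties(CATEGORY_2001_2005)` would make the live request. An MP who changed party, or whose page lists several party values, then appears once with a list of parties rather than on several lines. Matching the hansard names to theyworkforyou IDs is still left open here.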
