Aidan Skinner wrote:
>
> Hi,
>
> I had a stab at fixing the Scottish Parliament Parser a while back but
> ran out of time / Scottish Parliament staff didn't seem particularly
> interested in making it not super unfriendly. Since then the only
> activity on https://github.com/mysociety/theyworkforyou/issues/89 has
> been the "this is broken" banner, but about a year ago there was a new
> script committed to parlparse that attempts to guess the daily reports
> based on their ID number.
>
> Is the remaining work basically to work out a more reliable way of
> ensuring the --daily option gets the complete data?
>
> - Aidan

Hi Aidan,

Thanks for raising this. The new scraper should have been running
with --daily but apparently that hasn't been working; I'll look into
that now.

The bulk of the work left to be done is finishing a new parser that
takes the scraped HTML and turns it into parlparse XML. I did quite a
bit of work on a new parser since committing the scraper, but it's not
complete yet, I'm afraid. Unfortunately we're very short on developer
time to allocate to this at the moment, so it's been mostly an
out-of-hours effort for me in odd moments.

I'll make sure that I get everything I've done tidied up and committed
to parlparse over the weekend so that it's in a state where other
people, such as yourself, would be able to work on it.

Best regards,
Mark

P.S. There's a TWFY Scotland mailing list, which hasn't had any
traffic since 2008, but which is probably the most appropriate place
for discussion specific to that project:

https://secure.mysociety.org/admin/lists/mailman/listinfo/theyworkforyou-scotland

_______________________________________________
developers-public mailing list
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
