Re: [mySociety:public] Fixing The Scottish Parliament Parser

Mark Longair Thu, 27 Jun 2013 09:30:08 -0700

Aidan Skinner <[email protected]> wrote:
> 
> Hi,
>
> I had a stab at fixing the Scottish Parliament Parser a while back but
> ran out of time / Scottish Parliament staff didn't seem particularly
> interested in making it not super unfriendly. Since then the only
> activity on https://github.com/mysociety/theyworkforyou/issues/89 has
> been the "this is broken" banner, but about a year ago there was a new
> script committed to parlparse that attempts to guess the daily reports
> based on their ID number.
>
> Is the remaining work basically to work out a more reliable way of
> ensuring the --daily option gets the complete data?
>
> - Aidan


Hi Aidan,

Thanks for raising this.  The new scraper should have been
running with --daily but apparently that hasn't been working;
I'll look into that now.

The bulk of the work left to be done is finishing a new parser
that takes the scraped HTML and turns it into parlparse XML.  I've
done quite a bit of work on that parser since committing the
scraper, but it's not complete yet, I'm afraid.  Unfortunately
we're very short on developer time to allocate to this at the
moment, so it's been mostly an out-of-hours effort for me in odd
moments.

I'll make sure that I get everything I've done tidied up and
committed to parlparse over the weekend so that it's in a state
where other people, such as yourself, would be able to work on
it.

Best regards,
Mark

P.S. There's a TWFY Scotland mailing list, which hasn't had any
traffic since 2008, but which is probably the most appropriate
place for discussion specific to that project:

  https://secure.mysociety.org/admin/lists/mailman/listinfo/theyworkforyou-scotland

