Re: [mySociety:public] diffing the DEbill

stef Mon, 26 Apr 2010 14:51:45 -0700

On Mon, Apr 26, 2010 at 10:43:26PM +0100, Richard Boulton wrote:
> [...]


RE: ocr-ing, we transcribed it in a couple of hours back then. ;)

> Once you have the two documents, I assume you'd want, for each piece
> of text marked by a footnote in the official document, to find similar
> sections in the leaked document, and the footnotes from those pieces
> of text.  This could probably be done with a fair degree of success by a
> standard text similarity search algorithm (and then checked manually).

we have something like this ready. not for public use, but useful in
trained hands. we're working on going public with this soon.
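
for illustration only, a minimal sketch of the kind of similarity matching
described in the quoted paragraph: for each footnoted passage in the
official document, pick the closest section of the leaked one. this is not
the tool mentioned above; the function name, the 0.5 threshold, and the
use of python's difflib are all assumptions, and the matches would still
need checking by hand.

```python
from difflib import SequenceMatcher


def best_matches(marked_passages, leaked_sections, threshold=0.5):
    """For each marked passage, find the most similar leaked section.

    Returns one entry per passage: (section_index, score), or None when
    no section reaches the (assumed, hypothetical) threshold.
    """
    results = []
    for passage in marked_passages:
        best_index, best_score = None, 0.0
        for i, section in enumerate(leaked_sections):
            # ratio() is a character-level similarity between 0.0 and 1.0
            score = SequenceMatcher(None, passage.lower(), section.lower()).ratio()
            if score > best_score:
                best_index, best_score = i, score
        if best_score >= threshold:
            results.append((best_index, best_score))
        else:
            results.append(None)  # nothing similar enough; leave for manual review
    return results
```

each (section_index, score) pair points at the leaked section whose
footnotes are worth reading next to the official passage; a None entry
means nothing comparable was found.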

-- 
gpg: https://www.ctrlc.hu/~stef/stef.gpg
gpg fp: F617 AC77 6E86 5830 08B8  BB96 E7A4 C6CF A84A 7140

_______________________________________________
Mailing list [email protected]
Archive, settings, or unsubscribe:
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
