On Mon, Apr 26, 2010 at 10:43:26PM +0100, Richard Boulton wrote:
> [...]
Re: OCR-ing, we transcribed it in a couple of hours back then. ;)

> Once you have the two documents, I assume you'd want, for each piece
> of text marked by a footnote in the official document, to find similar
> sections in the leaked document, and the footnotes from those pieces
> of text. This could probably be done with a fair degree of success by a
> standard text similarity search algorithm (and then checked manually).

we have something like this ready. it's not for public use yet, but it is
useful in trained hands. we're working on going public with it soon.

-- 
gpg: https://www.ctrlc.hu/~stef/stef.gpg
gpg fp: F617 AC77 6E86 5830 08B8 BB96 E7A4 C6CF A84A 7140
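
for illustration only, the matching step Richard describes could be sketched roughly like this. it is not the tool mentioned above, just a minimal example built on Python's standard difflib; the function name best_matches, the 0.6 threshold and the idea of passing passages as plain lists of strings are all assumptions made up for the sketch.

```python
from difflib import SequenceMatcher


def best_matches(official_passages, leaked_sections, threshold=0.6):
    """For each official passage, find the most similar leaked section.

    Returns one entry per official passage: a (section_index, score) tuple,
    or None when no leaked section reaches the (assumed) threshold.
    """
    results = []
    for passage in official_passages:
        best_idx, best_score = None, 0.0
        for idx, section in enumerate(leaked_sections):
            # ratio() is a similarity score in [0, 1]; compare case-insensitively
            score = SequenceMatcher(None, passage.lower(), section.lower()).ratio()
            if score > best_score:
                best_idx, best_score = idx, score
        if best_score >= threshold:
            results.append((best_idx, best_score))
        else:
            results.append(None)
    return results
```

each official passage is compared against every leaked section, so this is quadratic and only sensible for documents of modest size. the candidates it returns would still need checking by hand, as Richard says.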
