-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

I've been working on tools to do exactly this: to make it easier for journalists to rapidly analyze documents and combine different docs and datasets (http://transparencytoolkit.org/). These are mostly tools for collecting data (uploading docs and converting them to a standard format, scraping pages, pulling data from APIs), filtering through docs (search/browsing tools, entity extraction, combining and cross-referencing, keyword extraction), and visualizing info (in maps, timelines, network graphs).
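
To make the filtering and cross-referencing step concrete, here is a minimal sketch in Python. It is not code from Transparency Toolkit: the function names, the watchlist approach (plain case-insensitive substring matching rather than real entity extraction), and the placeholder documents and terms are all assumptions made up for illustration. It builds an index of which documents mention which terms, then derives the document pairs that share a term, which is the kind of edge list a network graph could be drawn from.

```python
from collections import defaultdict
from itertools import combinations


def extract_terms(text, watchlist):
    """Return the watchlist terms found in text (case-insensitive substring match)."""
    lowered = text.lower()
    return {term for term in watchlist if term.lower() in lowered}


def cross_reference(docs, watchlist):
    """Map each matched term to the sorted list of document ids that mention it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in extract_terms(text, watchlist):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def shared_term_edges(index):
    """Return {(doc_a, doc_b): [terms]} for every pair of documents sharing a term."""
    edges = defaultdict(set)
    for term, ids in index.items():
        for a, b in combinations(ids, 2):
            edges[(a, b)].add(term)
    return {pair: sorted(terms) for pair, terms in edges.items()}


# Placeholder documents and terms, invented purely for this example.
docs = {
    "profile_a": "Worked on program ALPHA and tool BETA.",
    "profile_b": "Analyst supporting ALPHA deployments.",
    "profile_c": "Maintained BETA and GAMMA pipelines.",
}
watchlist = ["ALPHA", "BETA", "GAMMA"]

index = cross_reference(docs, watchlist)
edges = shared_term_edges(index)
```

Substring matching is crude (it will match a term inside a longer word), so a real pipeline would likely swap extract_terms for a proper entity extractor while keeping the same index and edge structure.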
Where possible, I've been basing these on existing open source software, but I also frequently build and heavily modify tools. I'd love to hear people's suggestions for tools to build or use cases to support.

More info:

Demo: http://demo.transparencytoolkit.org
Analysis Platform: https://github.com/TransparencyToolkit/Transparency-Toolkit
All Tools: https://github.com/transparencytoolkit
Network graph generated with TT from LinkedIn profiles mentioning NSA surveillance programs: http://transparencytoolkit.org/nsanetwork.html
Article about the above: http://america.aljazeera.com/articles/2014/5/29/nsa-contractors-linkedinprofiles.html
Thoughts on how to use tools like this effectively: https://www.theengineroom.org/how-to-find-and-mash-online-info-for-anticorruption/

On 07/08/2014 03:27 PM, grarpamp wrote:
> On Tue, Jul 8, 2014 at 4:11 PM, coderman <[email protected]> wrote:
>> On Tue, Jul 8, 2014 at 1:05 PM, Griffin Boyce <[email protected]> wrote:
>>> One approach is to take the existing public data, make some assumptions (educated guesses) and do additional research on top of that. It's what I'm doing right now. It's also what led to the original cointelpro revelations. Before the follow-up research, it was a meaningless acronym.
>>>
>>> Find, extrapolate, expand.
>>
>> this is the type of effort i was hoping to see undertaken.
>>
>> when you say "additional research", is this organic or structured? tool assisted or old skewl?
>>
>> i too have been building up some terms and technologies, but yet to put it into any structured format with context, as part of my post is to see how others are handling the vast complexity and extensive compartmentalization embodied in the leaks to date.
>>
>> i also would like to pursue this research anonymously, on hidden services rather than public sites or email.
>
> To do any of this you will need to collect all the releases of docs and images to date, in their original format (not AP newsspeak), in one place. Then dedicate much time to normalizing, convert to one format and import into tagged document store, etc. Yes, this could be hosted on the darknet.
>

- -- 
M. C. McGrath
Transparency Toolkit | http://transparencytoolkit.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Icedove - http://www.enigmail.net/

iF4EAREIAAYFAlO8oJ0ACgkQHKENpovrR8UKmAEAhY06O24ReM52Us56SBSJZDu+
JKIjm0Juw+lG43vsxAQA/2lIAIipDU9BfYyA7+G9Uv0pwTzxhC9Ubnc7Yyd4H715
=uM9l
-----END PGP SIGNATURE-----
