I worked on a text mining project last semester where I had a bunch of magazines (from IA) with totally unstructured text. I would have really liked to know how to work entity matching into such a project. Are there text mining projects out there that demonstrate how to do this?
On Fri, Apr 8, 2016 at 11:08 AM, diego ferreyra <temat...@r020.com.ar> wrote:
> I think controlled vocabularies can be used to improve the text-mining
> process, for entity recognition (persons, institutions and critical
> concepts) ... I think that's... but I'm not neutral about this.... because
> I am the developer of a controlled vocabularies tool :)
>
> Sorry about my English :/
>
> 2016-04-08 3:24 GMT-03:00 Eric Lease Morgan <emor...@nd.edu>:
>
> > On Apr 7, 2016, at 4:24 PM, Gregory Markus <gmar...@beeldengeluid.nl>
> > wrote:
> >
> > >> from one of the New York Times stories on the Panama Papers: "The
> > >> ICIJ made a number of powerful research tools available to the
> > >> consortium that the group had developed for previous leak
> > >> investigations. Those included a secure, Facebook-type forum
> > >> where reporters could post the fruits of their research, as well
> > >> as database search program called “Blacklight” that allowed the
> > >> teams to hunt for specific names, countries or sources.”
> > >>
> > > http://www.nytimes.com/2016/04/06/business/media/how-a-cryptic-message-interested-in-data-led-to-the-panama-papers.html
> > >
> > > https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
> > >
> > Based on my VERY quick read of the articles linked above, a group of
> > people created a collaborative system for collecting, indexing, searching,
> > and analyzing data/information. In the end, they facilitated the creation
> > of knowledge. That sure sounds like a library to me. Kudos! I believe our
> > profession has many things to learn from this example, and two of those
> > things include: 1) you need full text content, and 2) controlled
> > vocabularies are not a necessary component of the system. —ELM
> >
>
> --
> Diego Ferreyra