> What are some of the ways to best insert Linked Data endpoints into an
> XML file?... Given a name -- say, Plato or Thoreau -- how would one go about
> identifying good endpoints? What sort of query would I send to what sort
> of "database"? What might I get back? Assuming my goal is to enrich the
> text, what sort of link(s) should I insert into my XML?


Thank you for the helpful replies.

If and when I do this work, I think I will use DBpedia and its lookup
service [1]. Here's how:

  * do named-entity recognition (NER) against my documents
  * for each name, place, or organization element in the resulting XML
    o query DBpedia for URIs via its lookup service
    o add one or more of the resulting URIs as attributes
      of the name, place, or organization element
  * end for
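
A minimal sketch of the loop above, in Python, assuming the NER step has
already produced XML with name, place, and organization elements. The
endpoint URL, its query parameters, and the JSON layout ("docs" and
"resource") are my assumptions about the public lookup service and are
not taken from [1], so check them before relying on this. The "uris"
attribute name is likewise just an illustration.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: endpoint and parameters of the lookup service; verify before use.
LOOKUP_URL = "https://lookup.dbpedia.org/api/search"

# Element names assumed to come out of the NER step.
ENTITY_TAGS = ("name", "place", "organization")


def dbpedia_lookup(label, max_results=3):
    """Return candidate URIs for a label (assumed response: docs[].resource[])."""
    params = urllib.parse.urlencode(
        {"query": label, "format": "JSON", "maxResults": max_results}
    )
    with urllib.request.urlopen(f"{LOOKUP_URL}?{params}", timeout=10) as response:
        data = json.load(response)
    uris = []
    for doc in data.get("docs", []):
        uris.extend(doc.get("resource", []))
    return uris[:max_results]


def enrich(xml_text, lookup):
    """Add a space-separated 'uris' attribute to each entity element."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag not in ENTITY_TAGS:
            continue
        label = "".join(element.itertext()).strip()
        if not label:
            continue
        uris = lookup(label)
        if uris:
            # 'uris' is a hypothetical attribute name chosen for this sketch.
            element.set("uris", " ".join(uris))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A stand-in lookup so the example runs without touching the network.
    def stub_lookup(label):
        return ["http://dbpedia.org/resource/" + label]

    sample = "<p><name>Plato</name> taught in <place>Athens</place>.</p>"
    print(enrich(sample, stub_lookup))
```

Passing the lookup function in as an argument keeps the XML handling
separate from the network call, so enrich(sample, dbpedia_lookup) would
use the live service while a stub is enough for trying things out.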

Once done, I could use the enhanced XML file as the raw source for providing
cool (and "kewl") services against the text -- word clouds, definitions,
geo-locations, images, abstracts, find similar, purchase, print, do concordance
against, etc.
    
In the meantime, if I want to disambiguate, I could go any number of routes. For
example, I could crowdsource the XML file, allowing people to select the
"correct" URI from each attribute listing. Alternatively, I could probably look
for relationships between all the URIs in all the attributes and somehow
statistically select the "correct" one. Whatever.
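
One naive way to picture the "relationships" route, as a sketch only:
score each candidate URI by how many of its related URIs also appear as
candidates for the other entities in the same document, and keep the
highest scorer. The function name, the shape of the inputs, and the
example.org data are all hypothetical; where the "related URIs" would
come from (a SPARQL query, say) is left open here.

```python
def pick_by_overlap(candidates, links):
    """Choose one URI per mention by overlap with the other mentions' candidates.

    candidates: {mention: [uri, ...]} in lookup order
    links:      {uri: set of URIs it is related to} (hypothetical input)
    """
    chosen = {}
    for mention, uris in candidates.items():
        # Every URI proposed for any *other* mention in the document.
        context = {
            uri
            for other, other_uris in candidates.items()
            if other != mention
            for uri in other_uris
        }

        def score(uri):
            return len(links.get(uri, set()) & context)

        # max() keeps the first candidate on ties, i.e. the lookup order.
        chosen[mention] = max(uris, key=score) if uris else None
    return chosen


if __name__ == "__main__":
    # Made-up data for illustration; not real DBpedia content.
    candidates = {
        "Plato": [
            "http://example.org/Plato_(comic_poet)",
            "http://example.org/Plato",
        ],
        "Athens": ["http://example.org/Athens"],
    }
    links = {
        "http://example.org/Plato": {"http://example.org/Athens"},
        "http://example.org/Plato_(comic_poet)": set(),
    }
    print(pick_by_overlap(candidates, links))
```

With the made-up data above, the philosopher wins over the comic poet
because only his entry is related to a URI proposed for another entity
in the same text.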

So much of library work is spent "cataloging" things and trying to make them 
findable. I sincerely believe most people don't think this is a very relevant 
service these days. And I don't know about you, but I certainly don't feel 
starved for information. Instead, I think people want to make better use of the 
content they have, and enriching texts in the way outlined above may be one way 
of going about it.

[1] lookup service - http://bit.ly/jbg0I6

-- 
Eric Lease Morgan
University of Notre Dame
