Hello!

On Sat, Feb 7, 2009 at 2:31 PM, Hugh Glaser <h...@ecs.soton.ac.uk> wrote:
> Hi Yves,
> Thank you for the response.
> Yes, you are right - when we have taken over the world, there will be
> powerful systems to help us do this, and I can be a happy little data
> provider, while others provide my search and linkage.
> But when we try to tell people that we have this wonderful resource called
> Musicbrainz, which is part of the amazing LOD cloud, (I think I saw evidence
> of such a talk recently), what experience do the excited listeners get when
> they go away and try to join?
> After quite a lot of work they will have concluded, at best, that this is
> system infrastructure for gurus, and so they can do a bit of browsing a bit
> like wikipedia but not as pleasant, and it is not relevant to them.
> I have just failed to find Telemann on Musicbrainz, I'm afraid,
> (musicbrainz.org or Sindice) although I only spent a few minutes - but why so
> hard?

Just typing "telemann musicbrainz" in Google led me directly to:

http://musicbrainz.org/artist/8f831f50-e409-47c3-8598-71a61bc8cfb3

I don't consider that particularly hard!

> Perhaps all I wanted to do was use his URI to identify him unambiguously,
> using a little tool that lets me say I (dis)like his music, but it is just so
> hard.
> OK, maybe my sort of use case is not what the community cares about - so be
> it, but I think I should be able to do it, and do it now.
> These sort of links are really valuable - there might not be so many of them,
> but they can carry a lot of information.
> I can tell you we have over 1M links to the dblp world from rkbexplorer, but
> since the data is substantially the same, I don't consider them as valuable.
> On the other hand, we have 174 links from nsf to cordis and 183 the other way
> - now that is value. How did we create them? By a lot of work, and the
> ability to search.
>
> So I agree in principle with your view of separating out these things.
> But I don't think we have the time, and while we fail to deliver this,
> possible recruits are turning away.
> Is all this publishing work to founder because the Sindice team is not big
> enough to cope, or no-one seems to be building the linkage systems, all
> because the data providers do not want to offer a simple search facility?
>

On a side note, there are at least three interlinkage systems I know of
(Georgi's, LinkedMDB's and mine). Most datasets provide a SPARQL end-point,
which makes such specific-dataset-to-specific-dataset linkage easy enough.
Having a SPARQL interface makes interlinking *much* more reliable, because
you know exactly what happens. If you provide me with a simple text search,
I won't have any clue how your inner searching process works (are you
retrieving all resources whose label matches the search term? are you
building an index on neighboring literals?), and I won't be able to draw
satisfying interlinking conclusions.
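
To illustrate what "you know exactly what happens" means, here is a minimal
sketch in Python. It only builds the query text and does not contact any
end-point. The function name `build_label_match_query` is hypothetical, and
the use of `foaf:name` as the label property is an assumption about how a
dataset might describe its artists, not a statement about any particular
end-point.

```python
def build_label_match_query(name: str, limit: int = 10) -> str:
    """Return a SPARQL query selecting resources whose foaf:name equals `name`.

    Assumption: the target dataset exposes names through foaf:name.
    """
    # Escape backslashes and double quotes so the name stays a valid
    # SPARQL string literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT DISTINCT ?artist WHERE {\n"
        f'  ?artist foaf:name "{escaped}" .\n'
        "}\n"
        f"LIMIT {limit}"
    )


query = build_label_match_query("Georg Philipp Telemann")
print(query)
```

The matching rule is visible in the query itself: only an exact, plain-literal
match on one named property counts. Nothing is matched through a hidden index
on neighboring literals, so whoever generates links from the results can state
precisely what a match implies.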

Best,
y

> Best
> Hugh
>
> By the way, I am not suggesting that any identifiers such as GUIDs or PIDs
> should be read by humans - more the opposite. My agent should be able to find
> them easily and then ask me if that was what I meant, using words.
>
> On 07/02/2009 13:39, "Yves Raimond" <yves.raim...@gmail.com> wrote:
>
>
> I think this is a really dangerous idea. Most "web-scale" identifiers,
> eg Musicbrainz GUIDs and BBC PIDs are not human readable (for a lot of
> reasons, and mainly because human-readable identifiers are not unique
> enough!!), but both provide really easy-to-use lookup service.
> Such lookups, for other sites, can be provided by semantic web search
> engines. It is exactly as in the document web: web identifiers are
> mostly opaque, but search engines are here to provide the help needed.
>
> So my proposal is: let's not confuse everything. Some people's job is
> to make datasets available out there and as linked as possible to
> others. Some other people make lookup services (eg Sindice), and I
> think this separation of concerns works quite well.
>
> Best,
> y