Quoting Lars Aronsson <[email protected]>:

> Were you at the OpenKnowledge conference Saturday April 24?
> I was not there, but apparently, this was the topic of some
> presentations there.

No, I wasn't.

> I got introduced to the OKFN bibliographic project by
> Tatiana de la O ten days earlier, at a Wikimedia meetup
> in Berlin. I got the impression that publicdomainworks.net
> was just one of several facets, and that the whole database
> behind that was very similar to OpenLibrary. I could be
> wrong about this. I didn't take notes, and can't remember
> what the other domain names were that I was shown.

It doesn't look very similar to OL to me, although they both have
authors and titles. OL doesn't currently identify public domain works,
but it does link to many digitized public domain works that are open
access. In that sense, a link between the two projects would bring
users closer to finding the works they are looking for.

Note that the Book Rights Registry that Google is creating in support
of Book Search has a lot of overlap with OKFN and OL, in that it will
identify the copyright status of books (which is not the same as the
copyright status of works, and which I think is going to be a source
of great confusion until we come up with shared terminology and shared
definitions). OKFN seems to be the only one, however, looking at
copyright status of resources other than books.

> We don't need multiple projects with exchange of
> data and a never-ending circulation of errors.
> We need one centralized project, with a focus on
> quality improvement.

Actually, I disagree about a centralized project. I think those days
are past. We should now be able to interlink projects, which will
allow more freedom and innovation, and will let different folks try
out different approaches. By sharing data we save time and can help
each other with quality issues. It would definitely be good to have a
place where all of us working with bibliographic data can hash out
issues, but I don't think that has developed yet.

> On www.openlibrary.org the first thing I see is the
> number of 24 million "books". You got to stop counting
> all these duplicate records.
> You must start to focus on
> quality instead of quantity. There aren't 24 million books.
> Maybe half of these are duplicate records. Have you
> got any idea how much junk you are carrying around?
>
> On the new "upstream.openlibrary.org" complete beginners
> are encouraged to add books, as if adding more books
> was needed. No, it's not. Removing duplicate records
> is what's needed. Adding birth years and other
> information to author records is also needed. Things
> that add quality, not quantity. What percentage of
> author records have anything more than the name?
> How do we increase that?

The author names come primarily from library catalogs, and there's a
practice in libraries that makes sense to librarians but to no one
else, AFAIK. The birth and death dates in library catalogs are used
only when they are necessary to distinguish between two authors with
the same name. So for every "Smith, John, 1906-" there is a "Smith,
John" who was the first one entered into the catalog (and therefore no
distinguishing date was needed). (However, I can find exceptions to
this as well, so it is very confusing.) I presume that library users
haven't understood this (and why should they? it's not very logical
from a user point of view), and probably figure that some names are
without dates because the librarians didn't know them. This is just
one of the things that divides libraries from their users.

Once the new version of OL is available, the next step is to make it
possible to merge author names, works, and editions. What merging has
been done already is based on algorithms, and it appears that some
data loads didn't get merged properly. Solving the quality issues is
very much on the task list.

In terms of folks adding more books, it has been interesting to see
what books have been added by individuals.
I am hoping that there will be a way to identify those at some point
-- because many are being added by authors outside of the US whose
books often do not get much attention. Authors, of course, are highly
motivated to make the existence of their books visible and willing to
put in the effort. This particular aspect of the library is something
I find both fascinating and encouraging.

kc

> --
> Lars Aronsson ([email protected])
> Aronsson Datateknik - http://aronsson.se
>
> _______________________________________________
> Ol-discuss mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
> To unsubscribe from this mailing list, send email to
> [email protected]

--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
