Re: [ol-discuss] Daisy Not Working
I tried and got the message: DAISY creation failed Is that the message that you are getting? Can you post the exact link that gets your error? kc On 12/18/17 5:49 AM, Roger Loran Bailey wrote: > Right now every time I try to download a protected Daisy book I am > getting an error page telling me that the page cannot be found. > ___ > Ol-discuss mailing list - Ol-discuss@archive.org > http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss > Archives: http://www.mail-archive.com/ol-discuss@archive.org/ > To unsubscribe from this mailing list, send email to > ol-discuss-unsubscr...@archive.org -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 (Signal) skype: kcoylenet/+1-510-984-3600 ___ Ol-discuss mailing list - Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss Archives: http://www.mail-archive.com/ol-discuss@archive.org/ To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org
Re: [ol-discuss] Ol-discuss Digest, Vol 88, Issue 1
On 7/2/15 4:35 AM, Charles Horn wrote:
> I'm worried now that getting updates released might be a much harder goal than getting code merged if there aren't any IA devs to oversee the release process, or to provide support if something should go wrong. Merging code on github is one thing; getting it released sounds like it could be close to impossible if there isn't a currently functioning pipeline. There's a `production` branch in github that is very far behind the current master (last update 2011!). I'm not sure exactly what code is in production as of now, but I thought it had been updated since 2011?

One of the barriers that I'm aware of is that Anand had the only test version of OL, and possibly the only test suite. I haven't looked at the github repo, but if there isn't a test suite, moving anything into production is pretty risky. That said, I have no idea what the IA culture is for testing.

kc
Re: [ol-discuss] OpenLibrary cleanup : question about True Names by Vernor Vinge
Hi. The messed up titles from Amazon are usually from third-party sellers who take a lot of shortcuts, putting all of the information in the title. Please edit those! Thanks. Presumably that will bring the books together (but I don't know if that merging software is currently running).

As for what is a different edition -- this is not an easy question. Texts get reprinted often, especially those in the public domain, but each new act on the part of a publisher is considered a new edition, because you don't know what might have changed (new preface, added illustrations...). For modern books (after about 1970) the shortcut answer is: if it has a different ISBN, it is a different edition.

kc

On 12/6/14 8:47 PM, earthfu...@yahoo.ca wrote:
> Request cleanup of Vernor Vinge listings
> https://openlibrary.org/search?title=names&author=Vernor+Vinge
>
> 1) True Names: And the Opening of the Cyberspace Frontier by Vernor Vinge 4 editions - first published in 1984
> 2) True Names...and Other Dangers by Vernor Vinge 1 edition - first published in 1987
>
> *** Should the works/editions above be made into editions of the True Names book? (one edition of the plain True Names book below)
> http://www.amazon.co.uk/True-Names-Opening-Cyberspace-Frontier/dp/031296/
> 384 pages
>
> *** regarding title of: 1) True Names: And the Opening of the Cyberspace Frontier by Vernor Vinge 4 editions - first published in 1984
> True Names: And the Opening of the Cyberspace Frontier was first published in 2001 by Tor and contained True Names by Vinge PLUS essays by other people writing about Vinge's story.
> http://www.amazon.com/True-Names-Opening-Cyberspace-Frontier/dp/B001PO6ANG/
>
> *** regarding title of: 2) True Names...and Other Dangers by Vernor Vinge 1 edition - first published in 1987
> Amazon says 275 pages, so it seems to be a straightforward reprint of True Names
> http://www.amazon.com/Names-Other-Dangers-Vernor-Vinge/dp/0671653636/
>
> truly, Julian
Re: [ol-discuss] Some PDFs no longer available from Google
I do get to a downloadable PDF. This is probably a question of jurisdiction, as the rules for public domain are different in different countries. I'm in the US -- where are you located?

kc

On 10/14/14, 6:12 AM, Laurence Penney wrote:
> Hi,
> Any idea what’s going on with this book? (I’ve seen it with others too.)
> https://archive.org/details/londoninillustr00frygoog
> The PDF link takes me to the book’s page on Google Books, but there I see “No eBook available”. I wonder if the PDFs were archived separately, and if so, whether the Internet Archive might choose to make them available again independently of Google.
> - L
Re: [ol-discuss] Some PDFs no longer available from Google
On 10/14/14, 6:52 AM, Luc Gauvreau wrote:
> I'm in Canada. I know it's one of the reasons, but I search a lot in Google Books, IArchives All, and it's really not clear why digitised copies of a book are still available in one place or another.

Because those different sites are making different decisions. Each site (IA, HathiTrust, Google) makes different decisions based on what they perceive as their risk. Some of the decisions may be wrong for some books, but it isn't feasible to make decisions on a book-by-book basis. Note that HathiTrust is working to develop a true analysis of whether books are in the public domain or not. Many books published in the US between 1923 and about 1960 are actually in the public domain if they were not renewed with the US copyright office.

The best 'algorithm' for public domain (which assumes that you have all of the pertinent information -- which is hardly ever the case) is this chart by Peter Hirtle (US law only): https://copyright.cornell.edu/resources/publicdomain.cfm
This gives you an idea of how complex it is.

> In DPLA, for a book outside the US, the limit of the public domain is 1873!? 150 years! I don't know why and how they chose this date, but it's far, far away from 2014-2015...

The 150 years is a presumed life (actually, death date) of the author plus 70 years, which is the Berne Convention rule. The US did not sign the Berne convention until the 1980's, so our rules for items published prior to that time are different.

kc

> Luc Gauvreau (Montréal)
Re: [ol-discuss] finding if a book has an ebook (API)
On 9/30/14, 8:26 PM, Charles Horn wrote:
> I don't know about the subject 'in library' -- it doesn't sound like a reliable way to determine if a book has an e-book. Clicking on the subject gives a list of 241,107 works / 239,752 ebooks so it doesn't look like a valid conclusion to draw. The subject field should be the subject of the work; these look like bad / misplaced data to me. Maybe someone else knows what this is about.

Unfortunately, a previous director of the OL project was an advocate of throwing everything into the subject field, so there are things there like Protected DAISY which tells you the access status of the ebook. That it is an ebook is determined, I believe, as you do below, which is to use the link back to the Archive's stored copy.

kc

>> However, when I go to the actual book https://openlibrary.org/books/OL1737246M/Testing_computer_software I don't see the 'subjects' field. When I use the API I get the same behavior, i.e., for the OL1940521W https://openlibrary.org/works/OL1940521W/Testing_computer_software record I see 'subjects', but not for OL1737246M. Can you help me make sense of this?
>
> I have had a look at the book in question and how it shows up in the ruby gem. What I think you want to see is the 'ebooks' field as documented in the books API: https://openlibrary.org/dev/docs/api/books
>
> The API request for the book you mentioned is
> https://openlibrary.org/api/books?bibkeys=OLID:OL1737246M&jscmd=data
> which has the ebooks field there:
>
>     "ebooks": [{"formats": {"djvu": {"url": "https://archive.org/download/testingcomputers00kane/testingcomputers00kane.djvu", "permission": "restricted"}}, "preview_url": "https://archive.org/details/testingcomputers00kane", "availability": "restricted"}]
>
> I was about to tell you the ruby Gem doesn't support this field, but that's not true... You won't be able to see the ebooks if you use the Rest Client, which is what I assume you tried first. You'll need to use the Books Client / Openlibrary::Data:
>
>     data = Openlibrary::Data
>     book = data.find_by_olid('OL1737246M')
>     book.ebooks
>     # => [{"formats"=>{"djvu"=>{"url"=>"https://archive.org/download/testingcomputers00kane/testingcomputers00kane.djvu", "permission"=>"restricted"}}, "preview_url"=>"https://archive.org/details/testingcomputers00kane", "availability"=>"restricted"}]
>
> Hope that answers your question!
> Regards, Charles.
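For anyone not using the gem, the same check can be sketched with Ruby's standard library alone. This is only a rough illustration: it assumes the Books API accepts `format=json` next to `jscmd=data` and keys its response by the bibkey, as the API documentation page linked in the message describes. The helper names `books_api_url` and `ebook_availability` are made up for this example, and the sample response is copied from the `ebooks` field quoted in the message rather than fetched live.

```ruby
require "json"
require "uri"

# Build a Books API URL for one Open Library edition ID (OLID).
# Hypothetical helper; parameter names follow the Books API docs page.
def books_api_url(olid)
  query = URI.encode_www_form(bibkeys: "OLID:#{olid}", jscmd: "data", format: "json")
  "https://openlibrary.org/api/books?#{query}"
end

# Return the availability strings found in the "ebooks" field,
# or an empty array when the edition has no ebook entry.
def ebook_availability(response_json, olid)
  record = JSON.parse(response_json)["OLID:#{olid}"] || {}
  (record["ebooks"] || []).map { |ebook| ebook["availability"] }
end

# Sample response shaped like the one quoted in the message (not a live call).
SAMPLE = <<~SAMPLE_JSON
  {"OLID:OL1737246M": {"ebooks": [{"formats": {"djvu": {"url": "https://archive.org/download/testingcomputers00kane/testingcomputers00kane.djvu", "permission": "restricted"}}, "preview_url": "https://archive.org/details/testingcomputers00kane", "availability": "restricted"}]}}
SAMPLE_JSON

puts books_api_url("OL1737246M")
puts ebook_availability(SAMPLE, "OL1737246M").inspect
```

An empty result would mean the edition record carries no `ebooks` entry at all, which is a more direct signal than looking for an access-status string in the subject field.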
Re: [ol-discuss] why apparently scanned books are not always available as ebooks?
A book from 1968 is still considered in copyright. The out-of-copyright cut-off for the US is 1923. Determining ACTUAL copyright (not just looking at the date) is very complex. Here's the chart on how to determine it: http://copyright.cornell.edu/resources/publicdomain.cfm

Anything published after 1964 with a copyright notice is in copyright for 95 years. Come back in 2064 :-)

I don't know about the later editions being available as ebooks. They may have been scanned, but unless the copyright holder has released them, they shouldn't be openly available.

kc

On 5/5/14, 5:10 AM, Andre Robatino wrote:
> As an example, I am interested in an ebook for https://openlibrary.org/books/OL5613510M/The_economic_problem (specifically the 1st edition from 1968, not the later ones). Strangely, there are ebooks for a few later editions, but not the first. If it was due to copyright issues, I'd expect the opposite.
>
> If I click on the Internet Archive link https://archive.org/details/economicproblem00heil and then take the URL of the animated GIF https://ia600506.us.archive.org/16/items/economicproblem00heil/economicproblem00heil.gif?cnt=0 and go up a level https://ia600506.us.archive.org/16/items/economicproblem00heil/ there is a listing of what looks like a complete scan of the whole book. (I can't be sure since most of the files can't be downloaded. Is it a security issue that I'm even allowed to list the files?)
>
> So my question is, if the whole book has in fact been scanned, why is it only available as a protected DAISY, and not an ebook, although a few later editions are also available as ebooks? More generally, is there a huge amount of scanned content that's unavailable due to some kind of programming issue, and not copyright?
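The date rules in the reply can be written down as a small worked example. This is a rough sketch, not a substitute for the Cornell chart: it encodes only the 1923 cut-off and the 95-year term for works published with notice, and it assumes (from the chart, not from the message) that the 95-year rule applies to publication years 1964 through 1977. The function name `us_status` is invented for illustration.

```ruby
# Classify a US publication year using only the two rules discussed above.
# Hypothetical helper; anything the two rules cannot settle is left undetermined.
def us_status(publication_year)
  if publication_year < 1923
    "public domain"                                   # before the 1923 cut-off
  elsif (1964..1977).cover?(publication_year)
    # Assumption: published with a copyright notice, so the term is 95 years.
    "in copyright through #{publication_year + 95}"
  else
    "undetermined: consult the chart"                 # renewal, notice, etc. matter
  end
end

puts us_status(1968)
puts us_status(1950)
```

For the 1968 edition asked about, the term runs 1968 + 95 = 2063, so the book would become freely available the following year.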
[ol-discuss] Ebooks to borrow, was Re: why apparently scanned books are not always available as ebooks?
Andre, I looked at the OL page, and the later editions of the books there that are listed with the link ebook are available for borrowing, not for download to keep. The Archive has a library-related ebook borrowing program, so you may be able to borrow the book, but not download it to keep.

kc
Re: [ol-discuss] OL cover puzzle
Wow. That was quick! :-) So, at least we have a start. Absolutely nothing to do with Giotto, AFAIK, so the link remains an intriguing mystery. Thanks, Charles!

kc

On 12/22/13, 3:14 PM, Charles Horn wrote:
> This looks to be the original source of the Penguin cartoon: http://www.jgoode.com/big-mouth-strikes-again-this-time-its-penguins/
> I don't know what this designer has to do with Giotto though.
> Charles.
>
> On 23 December 2013 12:02, Karen Coyle kco...@kcoyle.net wrote:
>> This book in OL: https://openlibrary.org/books/OL22985410M/Giotto.
>> Is this book in Amazon: http://www.amazon.com/s/ref=nb_sb_noss/186-9380498-5837622?url=search-alias%3Daps&field-keywords=0789448513
>> Can anyone figure out why the cover art is a penguin cartoon? (If we had a little bit of prize money, we could make a real game out of these kinds of anomalies)
>> kc
Re: [ol-discuss] When are new scanned books added?
Andre, there was a lengthy bit of time recently (close to a year, I believe) when new additions to OL were happening irregularly because of some background work taking place. There was also a flood of works that came in as a large package (I believe around 250K) not long ago. My understanding is that books are added as they are scanned and the metadata is made available. These two recent events could have given a different impression.

kc

On 12/6/13, 10:12 PM, Andre Robatino wrote:
> I discovered this site in early 2011, but was only able to find an ebook for one of the books I searched for. Early this year, I found that many more of them were available. Since then, I have not been able to find any new ebooks. Do ebooks get added in large batches once every year or so? If so, when does this usually happen? And did the recent fire delay the process?
>
> I just joined this list. These questions probably get asked often here, but unfortunately message archival for this list has been disabled, so there's no way to check.
Re: [ol-discuss] Author merges incomplete?
Sarah, I wish I could help, but I think this is a known bug that we haven't figured out how to fix.

kc

On 12/5/13 7:38 AM, Sarah Breau wrote:
> Question: I want to edit Beam of Malice by Alex Hamilton. When I search authors for Alex Hamilton, the results tell me he has *13 books* about Short stories, Horror tales, Description and travel, Buildings, structures, Bibliography, including /Beam of malice./ When I click on the author, there are only 7 works there, and Beam of Malice is not one of them. When I search titles for Beam of Malice, no results are shown. Does anyone know how I can get into the work I want to edit?
> Sarah
Re: [ol-discuss] Author merges incomplete?
Great hint! Jessamyn, do you think this would be appropriate for the FAQ?

kc

On 12/5/13 8:29 AM, Samuel Henderson wrote:
> In terms of simply getting to the work page, a Google site search will often work where the internal search fails:
> https://www.google.com/search?q=beam+of+malice+site%3Aopenlibrary.org
> Cheers, Sam
> Samuel Henderson Translations http://www.samhenderson.net
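If this does go into the FAQ, the workaround is easy to generate for any title. The sketch below uses only Ruby's standard library; the helper name `site_search_url` is made up here, and the URL shape simply follows the link Sam posted.

```ruby
require "uri"

# Build a Google search URL restricted to one site (openlibrary.org by default).
# Hypothetical helper mirroring the link given in the message.
def site_search_url(terms, site = "openlibrary.org")
  query = URI.encode_www_form(q: "#{terms} site:#{site}")
  "https://www.google.com/search?#{query}"
end

puts site_search_url("beam of malice")
# prints https://www.google.com/search?q=beam+of+malice+site%3Aopenlibrary.org
```

`URI.encode_www_form` takes care of the escaping (spaces become `+`, the colon becomes `%3A`), so titles with punctuation do not need special handling.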
Re: [ol-discuss] Wikipedia links, was: Re: Grants, was: Re: Author merges incomplete?
On 11/22/13 1:58 AM, Ben Companjen wrote:
> Having links to Wikipedia instead of copying descriptions from Wikipedia could help clear up one licencing issue, that of CC-BY-SA content in what we/some of us would like to be CC0. Mashups (or GreaseMonkey scripts) could retrieve the Wikipedia abstract on the fly (then you'd always have the latest version too). Maybe some caching could be allowed to save bandwidth.

Ben, I think the copying is fine, but I thought that there was a "from Wikipedia" statement, which, however, I do not see. Of course, since anyone can edit... it would have to be non-editable. BBC does this, with a link back to Wikipedia so you can edit it there. The cc-by-sa does allow copying, it just asks for attribution.

> Links to Wikipedia appear in the records as 'normal links' and URIs in 'wikipedia' fields (this is the old way). Using this information it would be cool to create an overview of coverage of OL authors in Wikipedia, and vice versa. Wikipedia is free to retrieve lists of works from OL (although with the current data quality I'd wait a bit more before doing so).
>
> FYI, work on coverage of scientists in Wikipedia (haven't read it yet):
> Samoilenko, Anna, and Taha Yasseri. “The Distorted Mirror of Wikipedia: A Quantitative Analysis of Wikipedia Coverage of Academics.” arXiv Preprint arXiv:1310.8508 (2013). http://arxiv.org/abs/1310.8508.

Fascinating article. I'm not surprised that Wikipedia and Academia vary, nor that scientists are miffed about that :-). The obvious conclusion is that if academics want their view of the world reflected in Wikipedia they need to start editing -- which reminds me that there is a project being undertaken by medical schools to make sure that medical information in Wikipedia is correct. Rather than complain that it's wrong, they realized that there is a real social consequence to wrong info in Wikipedia, so they are correcting it. THAT, to me, is the approach to take.

kc

> Ben
>
> On 22 November 2013 02:04, Karen Coyle kco...@kcoyle.net wrote:
>> Fabian, I, too, Wikipede (-;)) and the ready-made WP templates on OL are not well enough known among Wikipedians. I LOVE THEM! So we should definitely try to make that more visible.
>>
>> I would love to see more linking between OL and Wikipedia, in general. As you probably know there is a lot of work going on to bring WP and libraries together, including the Wikipedia loves libraries campaign. This may be another area where we can find energy/reasons to make the necessary improvements to OL. Maybe an OL loves Wikipedia project? Already there are many Wikipedia links in OL, but more would only be better.
>>
>> Thanks for writing.
>> kc
>>
>> On 11/21/13 3:51 PM, fab...@unpopular.org.uk wrote:
>>> I kind of wandered into OL from Wikipedia and what I thought was great was:
>>> * Generating a Wikipedia citation template
>>> * Being able to put the OCLC reference in as well, which means that a Wikimedian can add a reference to an article that includes a link to World Cat. Any reader can use this to find the nearest participating library that has a copy of the book.
>>>
>>> So now when I am adding info from a book, I track down the book on OL, make sure it has the OCLC info and cut and paste the reference. Does this take me less time? Not at first, but if I come back on another day, yes it's easier.
>>>
>>> One drawback is that OL is not very user friendly. Lots of books are duplicated, and authors often appear under a variety of names. Perhaps I do a little bit, but last time I mentioned that a book was duplicated all that happened is that I got an e-mail back, giving some reason why nothing could be done about it (as you can see I kind of lost interest, and there was no easy way I could backtrack to the relevant books).
>>>
>>> One thing I have noticed amongst London Wikimedians is that not many know about OL and its readymade citations. There is a discussion going on about some joint work between Wikimedia UK and Thurrock Libraries (a public library network in a small town about twenty miles away from London). Now I think it would be really neat if people with research interests could click through Wikipedia to World Cat to find the nearest library copy of a specific book to them. And I see OL could play a significant role in this . . . but it does need to be easier to use.
>>>
>>> One thing I am not clear about is, to what extent do the sort of library staff I meet in my local library know about OL, World Cat etc. Last time I went in my local library to ask about some ICT training that was offered, I had to show them the page online where it was offered, and they admitted they knew nothing about it (I tried more than one library in my borough). It seemed to me that there was someone a bit removed from front-line service delivery trying to get some things done, but without the front-line staff being effectively put in the picture.
>>>
>>> I would be interested if this sort of synergy makes sense to people
Re: [ol-discuss] Wikipedia links, was: Re: Grants, was: Re: Author merges incomplete?
On 11/22/13 6:44 AM, Nicolás Tamargo de Eguren wrote:
> For what it's worth, that was also our choice in MusicBrainz: store links, and show Wikipedia extracts with a read more link (we cache them for a while IIRC though, not always re-request them).

AHA! I bet that's where I saw it, because I've been digging around in MusicBrainz lately.

> It sometimes gives meh results when Wikipedia articles don't have a good abstract, but then that's something people can go and fix :)
>
> Also FWIW, our likely plan for the future is to move towards storing Wikidata IDs rather than Wikipedia links directly, since those join together every Wikipedia's page for the artist, album or whatever (authors and works for you I guess?) and make it easier to display the right language depending on interface language if you translate the site.

I realize this is an OL list, but anyone who does Wikipedia editing should also be aware of: http://www.wikidata.org/, especially if you can help combine data from different language wikis. Multi-lingual editors are needed.

kc

> Cheers, Nicolás
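The point about Wikidata IDs joining every language's page can be illustrated with a small sketch. It assumes the `sitelinks` structure of a Wikidata entity (keys such as `enwiki` and `frwiki`, each with a `title`); the sample data and the helper name `wikipedia_title` are hypothetical and stand in for a real entity.

```ruby
# Pick the Wikipedia page title for an interface language from the
# "sitelinks" part of a Wikidata entity, falling back to English.
# Hypothetical helper; returns nil when neither language has a page.
def wikipedia_title(sitelinks, language, fallback = "en")
  link = sitelinks["#{language}wiki"] || sitelinks["#{fallback}wiki"]
  link && link["title"]
end

# Illustrative sitelinks for one author entity (values are made up).
SITELINKS = {
  "enwiki" => { "site" => "enwiki", "title" => "Example Author" },
  "frwiki" => { "site" => "frwiki", "title" => "Auteur exemple" }
}

puts wikipedia_title(SITELINKS, "fr")   # French page exists
puts wikipedia_title(SITELINKS, "de")   # no German page, falls back to English
```

Storing one ID per author or work and resolving the page at display time is what makes the language switch cheap; storing a single Wikipedia URL would fix the language in the record.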
Re: [ol-discuss] Grants, was: Re: Author merges incomplete?
doesn't respect volunteers' time enough to provide a minimally functional system.

Blaming is not going to help. Let's try to identify the big issues that need to be fixed and see if you can help in some way to solve them. I think important issues are:
* fixing search engine
* importing modern books

I've tried to fix some of the issues of the search engine, but some of the old edits which didn't get into the search engine are still missing. Please let me know if there are any issues that you think are important.

Anand
Re: [ol-discuss] MARC21 Records - Municipal Library of Prague
Vojtech - I can look at the MARC records. And I will bring your question about loading digitized books to someone who can answer that. kc On 11/12/13 2:42 AM, Vojtěch Vojtíšek wrote: Hello, I was wondering if there is going to be an option to load records into OL using the OAI-PMH protocol? I uploaded a set of records that have been through the complete cataloguing process to the Archive: https://archive.org/details/Mlp-records-marc21. Could you please check whether those records (the file is in XML, the records are in MARC21, and there should be over 200 000 of them) are compatible with OL and could be used in it? Also, we have over 4 000 digitized books (records of which could be exported separately) and about 370 e-books (in pdf, epub, prc and html) published by us under CC. Is there a way to list them in OL for online reading, too? Thanks for your response, Vojtech Vojtisek Municipal Library of Prague vojti...@mlp.cz | www.mlp.cz/en
Re: [ol-discuss] Making OL search default to ebooks and not all records. Thoughts?
Jessamyn - Actually, that was done once a while back, and didn't work because there was no way to turn it off. So all you could get were the ebooks records, none of the others. (Every time you hit search it turned it back on, even if you'd just unchecked the box.) Clearly, it has to be done a bit more carefully. Ideally it could be set for a session or by user, but I assume that is much more work. Here's another issue: ebooks are at the edition level, and search and display is at the work level. So you will get all works with one or more ebook editions -- but your list, when you get down to editions, will include non-ebook editions. Since the ebook editions appear at the top, that probably won't be a problem. However, I don't know if you are suggesting that only the ebooks should display in the edition display. My concern then would be how to let people know that they are only searching on ebooks, and that there are other records in the database. I would see nothing wrong with having a separate database that is ebooks only - or at least making it seem that way, virtually, and billing it as a database of ebooks. However, the API uses of OL are generally against the whole database. It's like we've got two different sets of functionality. Hopefully, we can satisfy both needs with a single database. kc On 11/2/13 8:44 AM, todd.d.robb...@gmail.com wrote: I think that's a great idea even though I primarily use the site for research purposes. It should increase use of the collection as well. On Saturday, November 2, 2013, jessamyn c. west wrote: There's been some discussion at the Archive of ways to make Open Library a bit more usable to folks who want to use it for book reading and borrowing. One of the ideas was to make the site search default to ebooks instead of records (i.e. 
having the ebooks box checked by default, which is not how it is currently) so that the results people get from a default search will be things they can access immediately, the idea being that a lot of the people coming to OL are looking for books to read, not just library records. I offered to toss a note up to this list to gather feedback for people. What do people think about this as a possibility? Jessamyn West, volunteer support, Open Library -- Tod Robbins Digital Asset Manager, MLIS todrobbins.com http://todrobbins.com/ | @todrobbins http://www.twitter.com/#!/todrobbins
Re: [ol-discuss] Is it legal for an edition to be linked to multiple works?
Not in OL, I wouldn't think. There is the whole anthologies issue that has FRBR-thinkers tying their brains in knots - where a book is published with multiple works inside, and each of those works gets a work record. But OL simply treats the book itself, regardless of how many theoretical works it may contain, as representing a single work. I'm worried that many of the links in OL, whether between works and editions, or authors and works, or variant forms of author names, seem to be in error. And this is a difficult kind of error to correct. Or, with my most optimistic hat on, is it not? I'd love to hear such good news. kc On 10/9/13 11:37 AM, Tom Morris wrote: This edition http://openlibrary.org/books/OL23070031M.json is linked to two separate works: http://openlibrary.org/works/OL5750935W/Cardcaptor_Sakura_Volume_5 http://openlibrary.org/works/OL5750773W/Cardcaptor_Sakura_Volume_5 Is this ever legal? On the surface it doesn't seem like it should be and it's probably also a sign that the two works need to be merged. Tom
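A rough sketch of how one might flag editions like the one Tom found. It assumes the edition record has already been loaded as a dict and carries a `works` list of `{"key": "/works/..."}` entries; the sample record is hand-written from the two work keys in Tom's message, not fetched from the site, and the function names are made up.

```python
def work_keys(edition):
    """Return the distinct work keys an edition record points at, in order."""
    seen = []
    for ref in edition.get("works", []):
        key = ref.get("key")
        if key and key not in seen:
            seen.append(key)
    return seen


def has_multiple_works(edition):
    """True when an edition is linked to more than one distinct work."""
    return len(work_keys(edition)) > 1


# Hand-written sample modeled on the edition Tom mentions.
edition = {
    "key": "/books/OL23070031M",
    "works": [{"key": "/works/OL5750935W"}, {"key": "/works/OL5750773W"}],
}
if has_multiple_works(edition):
    print("candidate works to merge:", work_keys(edition))
```

Run over a data dump, a check like this would produce a list of work pairs that probably need merging, which seems easier than finding them one page at a time.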
Re: [ol-discuss] Hello, I'm a new member on an iPhone.
Ann, I'm not an expert on either accessibility or the Open Library reading software, but I did just try to use the reader on my iPhone. On my computer, when I open the reader, there is a sound icon in the upper menu bar that turns on the 'read aloud' feature. On the iPhone in Safari that upper bar is not present, therefore I do not see how to turn on 'read aloud.' Do we have anyone on the list who knows details of the reader software? Ann, have you used 'read aloud' on the computer, and if so is that what you are looking for on the iPhone? kc On 9/25/13 8:48 AM, Ann wrote: Hello, I just joined your mailing list. I'm using an iPhone 4S. I am completely blind, and was wondering how I would go about using your library books and either Safari or a special application that I need to download. Thank you in advance for any tips, tricks or advice you may offer.
Re: [ol-discuss] Hello, I'm a new member on an iPhone.
Richard - The OL book reader presents an image, not selectable text. There are other formats, but access to them is less direct, and therefore less accessible, most likely. kc On 9/25/13 12:15 PM, Richard Cox wrote: This should help: 1. Open Settings. 2. Select the General tab. 3. Scroll down and select the Accessibility tab. 4. Select the Speak Selection option (it should be set to off, currently). 5. Select the toggle switch to turn it on. You still have to highlight the text, but then you can select Speak and that should work. Richard Cox Digital Technology Consultant Electronic Resources Information Technology University Libraries, UNC Greensboro http://library.uncg.edu/ On Wed, Sep 25, 2013 at 2:57 PM, Ann cradlingarm...@gmail.com wrote: Hello, my iPhone is my only computer. I don't have a regular desktop computer. On Sep 25, 2013, at 1:51 PM, Karen Coyle kco...@kcoyle.net wrote: Ann, I'm not an expert on either accessibility or the Open Library reading software, but I did just try to use the reader on my iPhone. On my computer, when I open the reader, there is a sound icon in the upper menu bar that turns on the 'read aloud' feature. On the iPhone in Safari that upper bar is not present, therefore I do not see how to turn on 'read aloud.' Do we have anyone on the list who knows details of the reader software? Ann, have you used 'read aloud' on the computer, and if so is that what you are looking for on the iPhone? kc On 9/25/13 8:48 AM, Ann wrote: Hello, I just joined your mailing list. I'm using an iPhone 4S. I am completely blind, and was wondering how I would go about using your library books and either Safari or a special application that I need to download. Thank you in advance for any tips, tricks or advice you may offer.
Re: [ol-discuss] Edit/author issue
Thanks, Jessamyn - On 9/11/13 1:40 PM, jessamyn c. west wrote: I think one of the things we need to get from Anand is an idea of what maintenance functions are active on OL, and if there is a way that we can monitor them. I have a loose idea about some of that. Author merge (and unmerge, which I can do; I'm not sure if other people can) is active, though it can time out. Edition merge is non-functional and always has been, to the best of my knowledge. Here are some others I am wondering about: - If you add a new edition, does it go through the edition merge algorithm? And does it get added to a work, if one is appropriate? - Is there a search index update process that runs always or regularly? Any idea about these? kc There is theoretically a back-end way to send a quick re-index request to the internal search/index on a per-page basis, but I have not had a lot of luck getting it to work myself. There are a lot of out-of-date things appearing as a result of searches and on some static (e.g. author) pages. I'd be really happy to be part of an email or even IRC discussion about this. I've added some things to the wiki's Ideas page based on some non-functional things that affect users and my ability to support users. http://openlibrary.org/community/ideas Jessamyn
Re: [ol-discuss] all translated editions on the same page as the original edition?/2
On 9/6/13 3:42 PM, Colby Russell wrote: On 09/06/2013 06:08 AM, Karen Coyle wrote: The edit screen does not appear to allow one to input additional titles, although I don't think that the metadata format would have a problem with that. I will add this as a suggestion: add a field for additional titles -- this would be useful for many books whose titles can have variations. I know this from some edits I made yesterday: there are two fields in Librarian Mode, one with the text Is it known by any other titles? (Perhaps in another language?), and another What's the original book?, if you select Yes, it's a translation. I had no idea that was there. We definitely need more documentation! The whole thing is confusing to me, though. These fields are only available on the edition's page, and not the work's page. For Dead Souls http://openlibrary.org/books/OL7084842M/Dead_Souls, I ended up putting the transliteration of the alternate title in both fields. I suspect that this is an artifact of the order of development, since initially OL only had edition pages; the gathering of editions into works was done afterward. I think this book is a particularly good use case for measuring whether OL has sufficiently ironed out the issues here. There's enough here to raise a few questions. Quite a few, actually :-). Some of them are OL questions, though, and some are where should we put things? questions. In standard library practice there is a place for everything, and everything in its place. There are also collective decisions based on rules, so everyone knows what the work title should be, or at least they know where to look it up (here in the US that's the LC catalog). There was great reluctance in the development of OL to establish rules for data input. I can understand that, since OL needs to be editable by anyone. However, it seems a shame to make each of the more conscientious editors have to struggle through the whole process of thinking through what the rules ought to be. 
That said, I'm not sure where to begin to develop a set of answers that are more oriented to the general public than the highly complex library rules that require training to use. If anyone has a brilliant idea about this, I would love to hear it. * The original title is variously translated as The Wandering of Chichikov, The Adventures of Tchitchikov/Chichikov, or Chichikov's Journey. * Похожденія Чичикова […] is what appears on the title page to the original 1842 edition (and at least 1846, as well). The use of і seems to be particular to Ukrainian, though. and today the work seems to be more commonly referred to as Похождения […] compared to Похожденія […]. * Today, some letters in мертвые души may be replaced with their accented forms http://ru.wikipedia.org/wiki/%D0%9C%D1%91%D1%80%D1%82%D0%B2%D1%8B%D0%B5_%D0%B4%D1%83%D1%88%D0%B8. * There are several ways to transliterate the title. One of them appears in the handwritten text in the IA scan of that copy's title page. Here's an example of how that is handled in current US library practice: http://lccn.loc.gov/95038157 There are standard transliteration forms (I don't know anything about the one for Cyrillic) that are used, regardless of what is found on the book. In fact, an alternate transliteration, if found on the book, would become one of those 'other title's. Also note the existence of the Virtual International Authority File, that brings together these kinds of decisions from libraries around the world. Take a look at: http://viaf.org/viaf/183487532/ Unfortunately, VIAF does not supply a single display form. Fortunately, if we could make the connection, it does seek to provide a single identifier for each person and each work. Unfortunately, it is new and the identifier hasn't been propagated out to bibliographic data. The free form text field for the title in the Yes, this is a translation area in Librarian Mode should probably go away. 
Specifying that it's a translation and the source language should be sufficient and non-destructive, given that the works page is meant to list all editions... If it's known that the translator worked from a *specific* edition (or editions) during the translation process, it would probably be best to include an optional field for that, which can be filled by a widget listing the editions in that language. But not a free form text field. I doubt if we should go down the rabbit hole of translation from specific editions. We have enough problems as it is. Librarian Mode's alternate title field in its current form should probably go away, too. The help text asks Is it known by any other titles? (Perhaps in another language?). This is mistargeted, since if it is, it's likely because there's an edition available in that language, which should be listed on the work's page, anyway. It seems that the text there should be changed. It may be a holdover from
Re: [ol-discuss] all translated editions on the same page as the original edition?
Tim, Thanks for your interest and work on OL. I can try to explain the principle behind the treatment of translations. First, it comes from formal library practices. Second, not all of the inputs to OL follow those practices. So there is a theory that is not always used in the practice. The principle is that the Work record in OL should represent the ORIGINAL work, with the title in the language of that original. The original and all translations should be linked to this Work record. To help bring translations together, you need to put the title of the original in the upper Title box in the editing page. The title of the translation then goes into the Title box under This edition. I did this change on this record: http://openlibrary.org/works/OL1260869W/Rivage_des_Syrtes You can do a compare between the previous version and this version to see what I mean. Unfortunately, I do not know how long it takes before the newly edited record merges with the correct Work record. It does not happen instantly. I would like to see more information added to the editing pages, perhaps as a small ? icon near each field that can provide this kind of information. Meanwhile, I will add this explanation to the help page if I can. kc On 9/4/13 3:24 PM, Timothée Flutre wrote: Hello, I just discovered the existence of openlibrary.org http://openlibrary.org and like the project very much. So I started to contribute but don't know what to do with translations (see an example at the end of the email) and didn't find any info in the help section of the website. 1) Should there be one page per book, with all editions, in the original language or not, listed on the same page? I think this is preferable as it is more practical when searching the library. 2) But then should the title of the work always be in English, even if originally it was not written in this language? 
Here again I am in favour of the title of the work being in English, but obviously the title of each edition should be in its original language. 3) Finally, should the title of each edition then also be written in its original script (for instance in Cyrillic)? Does the current code allow that? Here I don't have any strong opinion. Best, Tim ps: see for instance the page Rivage des Syrtes (http://openlibrary.org/works/OL1260908W/Rivage_des_Syrtes) which only lists two English translations whereas there exists another page, Le rivage des syrtes, which only lists four French editions.
Re: [ol-discuss] Zero works
On 9/2/13 3:30 AM, Patrick Conley wrote: Zero works means that there is an edition linked with this author but not a work. A long time ago work pages were created for most of the editions, but not for all editions. BTW: Imho it would be helpful if OL would distinguish between a person (an individual author) and a name (VIAF used the term undifferentiated). The two kind of pages could be marked by using colors: - a white author page (for example) would imply: name, disambiguation, unverified information - a light blue page: person (with year of birth, occupation, external links etc.), verified information I empathize with this, Patrick, but there are many individual authors without year of birth -- the cataloging rule in the Anglo-American community has been that the first author with that name (John Smith) does not need additional information, and subsequent authors with the same name are distinguished using year of the birth or other information. I cannot defend the logic of this, but there it is. The only way to know which is undifferentiated would be to have the names under authority control. However, the author strings in OL have been modified from the original input (e.g. FROM: Tolkien, J. R. R. (John Ronald Reuel) TO: J. R. R. Tolkien). I wish that the original library authoritative name had been stored somewhere in the record, but it was not. Having that would allow us to link to VIAF, and therefore we could know which names were settled in terms of identity. It still would be possible to retrieve names from the original MARC records, but that seems to me to be a bit more work. If we had the links to VIAF then 1) we could use the form with the link to VIAF as the one to merge with 2) we could concentrate efforts on the names with no such link. kc patrick Am 02.09.2013 01:55, schrieb Karen Coyle: Thanks, Tom. I'm idly working through starting near the bottom. 
I notice that some of the authors have zero works -- I'm not sure how this happens but I think I've seen this before. I assume it was a merge or clean-up that didn't fully clean up after itself. I'm merging these anyway. kc
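To make the point about modified author strings concrete, here is a small sketch of the kind of transformation described above (FROM: Tolkien, J. R. R. (John Ronald Reuel) TO: J. R. R. Tolkien), written so that the original heading is kept alongside the display form. This is not the code OL actually ran; the function and field names are made up for illustration.

```python
import re


def display_name(heading):
    """Turn an inverted heading ("Surname, Forenames (Fuller form)")
    into natural order, dropping any parenthetical qualifier."""
    cleaned = re.sub(r"\s*\([^)]*\)", "", heading).strip()
    if "," not in cleaned:
        return cleaned  # already in natural order, or a single name
    surname, rest = cleaned.split(",", 1)
    return f"{rest.strip()} {surname.strip()}".strip()


def author_record(heading):
    """Keep the authoritative heading next to the derived display form,
    so the original string stays available for later matching."""
    return {"name": display_name(heading), "original_heading": heading}


print(author_record("Tolkien, J. R. R. (John Ronald Reuel)"))
```

The conversion is lossy (the fuller form of the name is discarded), which is why storing the original string would have mattered for linking to an authority file later.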
Re: [ol-discuss] Merging authors
the same person. I have discovered the de-duplication magic wand, and have done a few by hand. However, I am rather puzzled. For example, the last person I looked at was A. Hamon (1860-1939). In my Modes data I have two records for him, both with dates: http://openlibrary.org/authors/OL5218117A and http://openlibrary.org/authors/OL5358432A Both of these URLs dereference to an actual page, with associated works. However, in the de-duplication listing only the first of these identifiers is present (though I did find another A. Hamon entry to merge). So, two questions: 1. Is there a format in which I can express a set of instructions to merge authors programmatically, to avoid having to do this by hand? The excitement of doing this manually has already worn off, but Modes could easily tell me where authors have the same name and same DoB/DoD and help me to generate a list of identifiers to merge. 2. Why don't all the potential mergees appear in the merge listing, despite the fact that loads of clearly irrelevant entries do appear there? 
Thanks, Richard [1] http://modes.org.uk -- *Richard Light*
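On question 1, a sketch of how a candidate list could be generated outside OL: group author records that share a name and both dates, then emit one merge link per group for a human to confirm. The record layout (`key`, `name`, `birth`, `death`) is hypothetical, and the merge URL pattern with repeated `key` parameters is an assumption that should be checked against the live site before relying on it.

```python
from collections import defaultdict
from urllib.parse import urlencode


def candidate_groups(authors):
    """Group author keys that share a normalized name plus birth and death year.

    Records missing either date are skipped, since a name alone is
    too weak a signal to propose a merge.
    """
    groups = defaultdict(list)
    for author in authors:
        if not (author.get("birth") and author.get("death")):
            continue
        name = " ".join(author["name"].lower().split())
        groups[(name, author["birth"], author["death"])].append(author["key"])
    return [keys for keys in groups.values() if len(keys) > 1]


def merge_url(keys):
    """Build a merge-page link (URL pattern is an assumption, see above)."""
    return "https://openlibrary.org/authors/merge?" + urlencode(
        [("key", key) for key in keys]
    )


# Sample rows built from the two identifiers in the message above.
authors = [
    {"key": "OL5218117A", "name": "A. Hamon", "birth": "1860", "death": "1939"},
    {"key": "OL5358432A", "name": "A.  Hamon", "birth": "1860", "death": "1939"},
    {"key": "OL0000000A", "name": "A. Hamon", "birth": None, "death": None},
]
for keys in candidate_groups(authors):
    print(merge_url(keys))
```

The links still need a person to click through and confirm each merge, so this only removes the searching, not the checking.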
Re: [ol-discuss] Volunteers needed to help dedupe author records
Thanks, Tom. I'm idly working through starting near the bottom. I notice that some of the authors have zero works -- I'm not sure how this happens but I think I've seen this before . I assume it was a merge or clean-up that didn't fully clean up after itself. I'm merging these anyway. kc On 9/1/13 10:13 PM, Tom Morris wrote: Richard Light has generated a list of candidate author record duplicates that we need your help with verify. I've uploaded the list to a Google Docs spreadsheet where everyone can help work on it: https://docs.google.com/spreadsheet/ccc?key=0Ak8syCNT8x2DdDJrb0NYaHNmSlpacnV0SkI4MTQtQXc#gid=1 There are two columns of URLs. The right hand column contains a merge URL which, when clicked, will show you the two candidate records and ask you to confirm the merge. Please select the record with the best name form and most works as the primary record, double check that they're really the same author and then click merge. Alternatively, click the search URL and see if there are other candidates that could be merged with slightly different name forms or perhaps missing a date. This will clean up more records, but requires attention to detail to make sure you don't actually merge two different people together. If you are at all unsure, something is a match, leave it out. If the winning record has the name in inverted form (ie Smith, John), you earn extra brownie points for editing the author record and cleaning it up. Please *delete the URL* from the cell for whichever option that you choose so that your co-workers know that you've already processed that entry. There are about 2,000 sets of records on the list, but I'm betting that we can make pretty short work of the cleanup. Thanks for your help! Tom p.s. If you're using the search URLs, you may occasionally find a case where the search returns no results due to search index problems. In this case, please just click on the direct merge URL. 
Re: [ol-discuss] New community page for ideas
On 8/31/13 8:48 AM, Ben Companjen wrote: Authors in OL may also be corporate entities (government bodies, corporations, NGOs), conferences or group pseudonyms (although those are more rare). There is no difference in the data yet, except for a few records. As an FYI - in the incoming MARC records any x10's were mapped to collaborator. At least, that's what I recall. The code starts around line 330 in: https://github.com/openlibrary/openlibrary/blob/master/openlibrary/catalog/marc/parse.py Unfortunately there were a lot of bad sources that put all of the authors in 100 regardless of type. Our thinking at the time was that only librarians really think of corporate bodies as authors; most people assume that author means person. Putting the corporate authors in collaborator was kind of a cop out, but we weren't up to inventing any entirely new bibliographic category. This is why I am trying to preview any new sources before we load them. There are some that were loaded that were MARC in structure but not really in content. They are the source of many of the duplicates in the database (and because the data is bad they don't merge well), and I don't want us to make that situation worse. kc the RDF provided for authors could make better use of the information currently available. For a start, it should include a list of is author of X statements to link them to their works within OL. Agreed. In the Work RDF there are links to Editions, so it's possible. I already proposed some changes to the RDF output some time ago: https://github.com/internetarchive/openlibrary/pull/136 but this was not in it. It should also include Wikipedia identifiers where these are present in the data By 'identifiers', did you mean URLs? There are Wikipedia URLs (Links) for some people. Some records include a special wikipedia field. with a little gentle encouragement, we could make the author birth and death date information usable in a machine-processing sense. 
Most dates are already useful as entered, despite the lack of guidelines. we could enable the (structured) recording of place of birth and death. There are a handful of these in the data already, crammed in on the end of the date field. A bot should try to parse the dates and put these in the records in a separate field (e.g. date_of_birth_parsed). The contents of this field can then be transformed to an xsd:date value in RDF. Author names could be looked up on dbpedia, and if there is an existing entry (a) the link can be included in the OL data and (b) details like DoB/PoB can be copied from that source into the OL data. It is debatable whether that is allowed under the CC-BY-SA licence that Wikipedia and DBpedia use, although we're not too strict on the enforcement and don't use a less strict licence on the OL data. Looking up a name in DBpedia could be challenging, but experimenting is easy when you already have both datasets downloaded in dump files. Richard Ben
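A sketch of what the date-parsing bot could do, under the assumption that most values are a bare year, an ISO date, or an English day/month/year in either order. Anything else (for example a place crammed onto the end) is returned as None so it can be left for a person to look at. The function name and the list of accepted formats are illustrative only.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}


def _iso(year, month, day):
    """Return YYYY-MM-DD, or None if the combination is not a real date."""
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return None


def parse_date(text):
    """Parse a free-text date into an ISO string, or None if unrecognized.

    Full dates come back as YYYY-MM-DD (usable as xsd:date); a bare year
    comes back as YYYY (which would map to xsd:gYear rather than xsd:date).
    """
    text = text.strip()
    m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", text)
    if m:
        return _iso(int(m[1]), int(m[2]), int(m[3]))
    m = re.fullmatch(r"(\d{1,2}) ([A-Za-z]+),? (\d{4})", text)  # 3 January 1892
    if m and m[2].lower() in MONTHS:
        return _iso(int(m[3]), MONTHS[m[2].lower()], int(m[1]))
    m = re.fullmatch(r"([A-Za-z]+) (\d{1,2}),? (\d{4})", text)  # January 3, 1892
    if m and m[1].lower() in MONTHS:
        return _iso(int(m[3]), MONTHS[m[1].lower()], int(m[2]))
    if re.fullmatch(r"\d{4}", text):
        return text
    return None


for raw in ["1860", "3 January 1892", "January 3, 1892", "1860, Paris"]:
    print(repr(raw), "->", parse_date(raw))
```

Writing the result to a separate field such as date_of_birth_parsed, as suggested above, leaves the original text untouched, so a wrong guess by the parser never destroys what the editor typed.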
[ol-discuss] Next steps for OL
I still have an analysis of the recently released records from Germany on my plate. However, I think it makes sense at this point to prioritize any updates to OL. As far as I know, there have been no updates from Amazon or Library of Congress for a very long time. That means that the database is out of date for the most current items. Since some libraries are using OL for their covers, it seems that bringing the database and coverstore up to date should be a high priority.

I'm hoping that the Amazon code is still viable and only needs to be revived. There is no longer a subscription to the Library of Congress file, but I put a note at the bottom of the community projects page [1] about how it may be possible to download the LC records RESTfully. It may be hard to catch up (another reason to do this sooner rather than later), but it should be possible to automate an ongoing process of retrieving new items from the database.

kc

[1] http://openlibrary.org/community/projects

-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] German National Library Offers Over 11 million MARC21 Records Under CC0 Open License
If someone can pull off a few records (no more than 100, but preferably from the middle of the file), I can look at them and see how they fit with the OL data elements. kc

On 8/26/13 5:40 AM, Johannes Baiter wrote: It's CC0, so yes. I already obtained the MARC files and uploaded them to the Archive. Raj has been informed as well, though I haven't heard back from him since. You can find the records here: http://archive.org/details/marc21_records_german_national_library All the best, Johannes

2013/8/26 Morten Juhl-Johansen Zölde-Fejér mj...@syntaktisk.dk I noticed this piece of news: http://blogs.ifla.org/bibliography/2013/08/06/german-national-library-offers-over-11-million-marc21-records-under-cc0-open-license/ Is this compatible/usable with OpenLibrary? Yours, Morten __ Morten Juhl-Johansen Zölde-Fejér http://syntaktisk.dk * mj...@syntaktisk.dk

-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] No archive and authority files
Fabian,

By way of explanation (but not of solution) - publisher names are not developed under authority control in the data sources that we have used. Instead, the publisher name is what is copied from the book without an attempt to normalize it. Creating authorities for publishers has many difficulties. Often what is on the book is not the publisher but the publisher's imprint -- that is, it is not the organization but the product line. Determining the actual identity of the publisher may take considerable time. For this reason, one generally simply records what is written on the book. In addition, what is written on the book is important for identifying the book, so it would not be good to lose information like:

* for Francis Grove, and are to be sold at his shop on Snow hil near the Sarazens head without New-gate

There is some discussion of adding authorities for publishers to bibliographic data, but without removing the transcription from the page. This will, however, require a new field for the publisher authority.

kc

On 8/16/13 8:28 AM, fab...@unpopular.org.uk wrote:

Hi, I have just come to this after a few years on Wikipedia. I think it is a very important project, and I love the way it generates Wikipedia citations. A couple of questions: why are the archives of this list disabled? I don't particularly want to go over old ground.

Also, I put in a book published by Francis Grove, but I discovered that this publisher is known by several different strings of letters:

* Printed by T. R. Cotes, for Francis Grove ...
* Printed by E. Alsop for Francis Grove ... and William Gilbertson ...
* Printed for Francis Grove (x4)
* Printed at London for Rich. Cotes, and are to be sold by Francis Grove
* for Francis Grove, dwelling upon Snow-hill
* for Francis Grove, and are to be sold at his shop on Snow hil near the Sarazens head without New-gate
* Printed for Francis Grove, and are to be sold by Martha Harrison
* Printed by B. Alsop, and T. F[awcet] for Francis Groves dwelling on Snow-Hill neare the Sarazens Head

Well, there may even be more. Is there any prospect of having an authority file for such publishers as Groves (active in London in the seventeenth century)?

all the best

Fabian

-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] No archive and authority files
The discussion that I am aware of has taken place around the BIBFRAME project: http://bibframe.org Look in the articles about authorities. kc On 8/16/13 10:16 AM, Samuel Klein wrote: On Thu, Aug 15, 2013 at 9:28 PM, Karen Coyle kco...@kcoyle.net wrote: There is some discussion of adding authorities for publishers to bibliographic data, but without removing the transcription from the page. This will, however, require a new field for the publisher authority. This would be really wonderful. Many people find authority files such as this a joy to contribute to. Is some of this discussion public? SJ -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet ___ Ol-discuss mailing list - Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss Archives: http://www.mail-archive.com/ol-discuss@archive.org/ To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org
[ol-discuss] IA books not in OL and w/o MARC
I know that the question of records that are in the Internet Archive but not in Open Library has come up frequently. One category of these seems to be books originally scanned by Google that were uploaded to IA but that do not have MARC records. Since most of these come from an identified library, and that library surely does have a record for the book, is there a way to undertake a project to upload the MARC records for those books? kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
[ol-discuss] DPLA hackathons Thurs and Fri
See: https://github.com/lofhm/DPLA-api-hacking/wiki/Information No idea what will transpire, but if you are in the Bay Area it's an opportunity to meet others interested in digital libraries. kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Goethe's redirect mess
Wow. This really is a mess. Goethe links to John Leonard Greenberg:

    {
        "created": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
        "last_modified": {"type": "/type/datetime", "value": "2012-10-06T06:14:51.772036"},
        "latest_revision": 23,
        "location": "/authors/OL575040A",    <-- John Leonard Greenberg
        "key": "/authors/OL13193A",          <-- Goethe
        "type": {"key": "/type/redirect"},
        "revision": 23
    }

So there's no way to see the Goethe author page or anything authored by Goethe. Someone here knows how to retrieve earlier revisions ... Could we see a couple prior to #23 for /authors/OL13193A? Maybe someone merged these manually and we can roll back the merge?

kc

On 4/7/13 6:48 AM, Nicolás Tamargo de Eguren wrote:

Hi! We've been linking to OL pages from MusicBrainz artists, and I'm quite confused about the situation with Goethe. See the results on http://openlibrary.org/search/authors?q=Johann+Wolfgang+von+Goethe Am I mistaken, or has Goethe been wrongly merged with a completely unrelated author? (twice!). If I'm right, how could we fix the mess? Can a merge be undone?

Cheers, Nicolás

-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
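For anyone checking other authors for the same problem, the record above shows what a bad merge looks like in the data: the type is /type/redirect and location points at the surviving record. A minimal sketch of following such redirects is below. It works on records already loaded into a dictionary (for example from a data dump); the function name `resolve_redirect` and the `max_hops` limit are assumptions for illustration, not part of any Open Library API.

```python
from typing import Dict, Optional


def resolve_redirect(
    key: str, records: Dict[str, dict], max_hops: int = 10
) -> Optional[str]:
    """Follow /type/redirect records until a non-redirect record is reached.

    `records` maps a key such as "/authors/OL13193A" to its record dict.
    Returns the final key, or None if a record is missing, a redirect has
    no location, the chain loops, or it is longer than `max_hops`.
    """
    seen = set()
    while key not in seen and len(seen) < max_hops:
        seen.add(key)
        record = records.get(key)
        if record is None:
            return None
        if record.get("type", {}).get("key") != "/type/redirect":
            return key  # a real record, not a redirect
        key = record.get("location")
        if key is None:
            return None
    return None  # loop detected or too many hops


# Example using the two keys from the message above.
example_records = {
    "/authors/OL13193A": {
        "key": "/authors/OL13193A",
        "type": {"key": "/type/redirect"},
        "location": "/authors/OL575040A",
    },
    "/authors/OL575040A": {
        "key": "/authors/OL575040A",
        "type": {"key": "/type/author"},
    },
}
print(resolve_redirect("/authors/OL13193A", example_records))
```

Comparing the resolved record's name with the name the redirecting record used to carry in an earlier revision would be one way to flag suspicious merges, but that needs the revision history, which this sketch does not touch.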
Re: [ol-discuss] Fundamental laws of online communities
Ben, thanks for the link and the reminder about the pages. I've added some of my bits to them. kc On 3/27/13 4:15 PM, Ben Companjen wrote: Hi 'community builders', A few days ago I saw this link in a tweet: http://www.feverbee.com/2009/10/11fundamentallawsofonlinecommunities.html - The 11 fundamental laws of online communities I don't have much experience in creating a community, but the laws seem reasonable and could be an inspiration for how a community around OL could be created. The applicability of these does IMHO depend on the relation with the Internet Archive (as these laws are written towards an organisation that wants to build a community around it/its service(s)). I guess it also depends on the goals that we have for OL as the thing we want to use/support. Some pages on the community pages are no longer empty, like the introduce yourself and the projects pages. That means there are ideas about what people want/do with OL :) http://openlibrary.org/community/introduce-yourself http://openlibrary.org/community/projects Ben http://openlibrary.org/people/bencompanjen - I'm thinking about putting something on the community pages about me as well :) ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org
[ol-discuss] community pages
Folks,

As Ben mentioned, the community pages are available. First, we need someone without any special privileges to make sure that they are editable by all. So go to http://openlibrary.org/community and see if you can edit. Thanks.

Next, we'll need a bit of organization, so I was thinking of creating a kind of upper level page with some text and links to some logical divisions. The divisions that I thought of were:

- projects using OL (where people can announce their projects and share ideas)
- improvement projects on OL (for those folks making changes to OL, with bots or manually)
- discussion area (a place for most everything else, including "what is a book" and various other interesting topics)

Other ideas for a kind of top level roadmap?

kc

-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Book (was: Re: [ol-tech] Hosting on media-wiki)
Ben, if it's any consolation, libraries basically gave up on describing "book" and instead refer to "textual material." Textual material can be monographic or serial, and of any length. The difference between a book and a pamphlet is not recognized -- although there is some assumption that pages are held together in both. Loose sheets of text are usually in the nature of manuscripts. But there are no hard rules. The e-document development really made any definition of book just about impossible. There's text, but no pages, no binding. There are spam records on OL for obvious non-books, but I would bet that some of the items for non-books have come in from Amazon. There *is*, as TomM says, a practice of adding ISBNs for just about anything you can sell. I find it interesting that folks like Google and the Archive zeroed in on books when they thought about libraries, since libraries actually have so much more. However, when you see how badly Google has handled the digitization of journals, maybe it's just as well. kc On 3/17/13 5:58 PM, Ben Companjen wrote: (continued from another discussion on ol-tech) It's not an extensive discussion of every aspect of books, but I started (and for now, finished) a use case for Open Library: how I would like to use it for cataloguing (my) books. I put it on the wiki in a subpage of my profile: http://openlibrary.org/people/bencompanjen/cataloguing (it is a wiki, although if you can edit that page, you're either me or an admin). There is a short bit in it about the boundaries among editions, but it assumes the definition of a book is understood (or perhaps: defined) by the reader. As far as I know, Open Library has never had a clear definition of book, let alone a strict enforcement of a definition. The web interface of course shows OL's expectation of what aspects of books can be described, but that hasn't stopped people from entering shoes, pills, and err, no wait... 
;) On a more serious note, I've seen a lot more than the traditional books (judging by the format): audio and video recordings, brochures, objects and artefacts etc. Not books per se, but things you do find in libraries. Every definition that crosses my mind at this moment is at best incomplete, so I won't write any here. I'd say: if you think your thing is a book, that's fine with me (N.B.: you don't need my approval :D). Also, if it doesn't perfectly fit the edit form in the way you want, explain it in the notes. For my use case a partial description is fine if you can tell one edition from another. There must have been good discussions on this topic before, but I was too lazy to search the archives [1, 2] myself. Ben [1] http://www.mail-archive.com/ol-tech@archive.org/ [2] http://www.mail-archive.com/ol-discuss@archive.org/ On 17 March 2013 14:28, Tom Morris tfmor...@gmail.com wrote: On Sun, Mar 17, 2013 at 6:32 AM, Karl Eichwalder k...@gnu.franken.de wrote: Lee Passey l...@novomail.net writes: Right now, it appears to me that Open Book Catalog is lacking a vision and a visionary. Even the platitude one web page for every book is so broad as to be essentially meaningless. That is what we already have, so what's missing? Unfortunately, that's not the truth. Series, Volumes of the Works of an author, and Monographs are often deliberately mixed. And then, there are translations of the book. And digital faksimiles and digital reprints (such as proofed book by the distributed proofreaders and gutenberg.org). We have one web page per book, but it is undefined what a book actually is. For serious work the data we have is useless. And it looks impossible to do cleanups. With every import it gets worse. The data may be incomplete, it may be unreliable, it may be unreusable for legal reasons, it may be unreusable for technical reasons, and it may not lead to any actual content, but hey, there /is/ one web page for every book! Yes, but what's a book? 
That doesn't tell us anything about how you'd like book to be defined, what type of data would be useful to you or what your use case is. Why don't you join the thread on ol-discuss and let us know what SUSE would like from OpenLibrary. Tom ___ Ol-tech mailing list ol-t...@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech To unsubscribe from this mailing list, send email to ol-tech-unsubscr...@archive.org ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
Re: [ol-discuss] Data consistency policy
On 3/13/13 10:14 AM, kltrg wrote: Thanks for you answers. Karen Coyle: The how many pages field is needed to help merge duplicate records. It contains the highest number as a numeric (since the library pagination is a text string with punctuation, etc.). Just to make that clear: it's not highest number you can see written on a particular page in the book but the total number of pages (preface numbered i, ii, ... + main text numbered 1,2,...). Sometimes the last page(s) don't have a number on them at all. They count nevertheless, no? No, those blank pages are not counted. That would require everyone to hand-count all of the pages in the book, so listing the highest numbered page is more efficient. If you are referring to the name of the place of publication you should enter the place as it is on the book. Place names are a problem in this input field but also in the one I put the places mentioned in the book in. Summing up what I've read, I'll proceed like this from now on: Publication city like written in the book. All other cities and countries in their (most common) english spelling. Are you referring to places as subjects? Right now most of the subjects are in English. However, for places it would be great to link them to an actual database of place names, such as geonames.org, and/or with the appropriate entry in dbpedia. That would create an identifier for the place, but more work would be needed to allow searching on the alternate name forms. I once volunteered to write up a short hover text for each field, but it was deemed too restrictive. However, once again I am willing to do a fairly loose definition of each field (where I can). Having such a hover text would be great, especially for newbies. It would make the whole thing a lot easier to understand. I'm strongly in favor of this. I could write it, but I don't think I can install it. I'll ask. There are advantages and disadvantages to having ingested library data since it has some very detailed aspects. 
Ideally, it would be good to find a way to allow both that detail and less detailed input from non-library sources and normal human beings. Isn't this question the whole point of having a librarian and a non-librarian mode? Would it be a good idea to have both in the frontend, too? Yes, I agree that we could do more separation of the two, and also provide a simpler non-librarian input/update form. kc Where can we write down these policies and make them available more easily than by subscribing to this list? That's the whole permissions question. At the moment, it takes admin permissions to write to any of the documentation pages. kltrg -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
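To make the "how many pages" rule from this thread concrete (the field holds the highest printed page number, taken from a pagination string that may contain punctuation and other text), here is a minimal sketch. It is an illustration only, not the code Open Library uses: the function name `highest_page_number` is hypothetical, and the sketch assumes that only Arabic numerals count, so roman-numbered preliminaries such as "xii" are ignored.

```python
import re
from typing import Optional


def highest_page_number(pagination: str) -> Optional[int]:
    """Return the largest Arabic number found in a pagination statement.

    Hypothetical helper: roman numerals are ignored, and None is returned
    when the statement contains no digits at all (e.g. "unpaged").
    """
    numbers = [int(n) for n in re.findall(r"\d+", pagination)]
    return max(numbers) if numbers else None


print(highest_page_number("xii, 345 p."))
print(highest_page_number("[8], 120, [4] p."))
```

A caveat with this simple approach: a statement such as "1 v. (unpaged)" would yield 1, which is a volume count rather than a page number, so a real bot would need to filter such cases before using the value for duplicate matching.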
[ol-discuss] Brewster is interested in our community idea
Brewster has encouraged me to encourage the idea of creating a working community around OL -- folks who can participate in making changes and fostering sharing. That's all the guidance I have, so I'm assuming that we have to do the heavy lifting to make it happen. Can we work together on a proposal? I'm willing to do research and writing. Where would you like to do this? Wiki? G-Doc? something else? Are there others that we should include? I could contact library programmers who are using the API for covers, etc. kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Search indexing
There's probably nothing you are doing wrong. The OL is undergoing a major revision to its underlying technology, and while that is ongoing there cannot be updates to the search indexes. I don't know if there is an estimated time when search will be running again. kc On 2/24/13 7:52 AM, kltrg wrote: I've recently started to edit books here and there, adding details, but also adding editions or books. I noticed something strange and frustrating: new editions and books don't appear if I search for them. They don't even appear on the author's page. I thought it might be a caching issue and waited for some days, but they still aren't there. One example: http://openlibrary.org/works/OL16799419W/L%27apiculture_selon_Samuel_Beckett Do you know about this issue? Or am I doing something wrong? -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Amazon crawler code for open library
You might find it here: https://github.com/openlibrary/openlibrary/tree/master/openlibrary/catalog/amazon kc On 2/15/13 9:02 PM, David Arken wrote: Can someone please point me to the Amazon crawler code + parser code that Open Library currently uses for integrating data from Amazon? I tried to find it using Google but was unable to. I did find data on the Internet Archive which looks like Amazon crawls, but I'm not sure how new/old it is, or how much of it has already been parsed into OL. Thanks, --David -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] What is the relationship between Open Library & Internet Archive metadata?
Ben, there was a discussion that flew by me on the OL skype chat that someone had noticed that things weren't getting updated, and I believe that someone (Anand?) has restarted that process. kc On 2/1/13 2:46 PM, Ben Companjen wrote: Good questions, Tom. It looks like the IA page only uses the original MARC record. I imagine that the same record was imported to OL if it hadn't been before and that OL interprets MARC on input (discarding full names too, I guess). From what I read on the OL mailing lists, I can only answer the last one about the works not showing up with more certainty: the Solr indexes need to be updated. They provide the lists of works for authors, subjects and perhaps also the editions for works. There has been a chronic delay since the summer, I believe. I/VacuumBot had not yet discovered bulk edits and that appeared to have slowed the indexing process down, and it never seemed to have recovered. I added a work in December that hasn't shown up yet. I have been meaning to ask for a status update on the Solr indexes, but now seems a good time: any updates on Solr? :) Ben On 1 February 2013 18:21, Tom Morris tfmor...@gmail.com wrote: Books which are on Internet Archive often have a link to their Open Library entry. What is the relationship between the two sets of data? Which is the master? How often are they synchronized? When I look at this pair of matching records from IA & OL: http://archive.org/details/acompletedictio00kerngoog http://openlibrary.org/books/OL20509867M/A_Complete_Pronouncing_Dictionary_of_the_English_and_Slovene_Languages_for_General_Use The IA record had the author's full name, which was missing from Open Library. The OL record had a different title variant than the IA record (both wrong in different ways). Now that I've cleaned up the OL record, will that migrate back to Internet Archive? 
As part of the cleanup, I merged three author records, but now when I navigate from this work up to the author record, the work isn't listed. Is there a time delay before all works show up? (I tried doing a hard refresh on the page). Tom ___ Ol-discuss mailing list Ol-discuss@archive.org mailto:Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org mailto:ol-discuss-unsubscr...@archive.org ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org
Re: [ol-discuss] Anonymous edits disabled
There was a big bunch of recent spam. - kc On 11/24/12 7:04 AM, Roger Loran Bailey wrote: Why? On 11/23/2012 11:30 PM, Anand Chitipothu wrote: Hi, We've disabled anonymous edits on Open Library. Now only logged in users can edit or add records to Open Library. Anand -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Anonymous edits disabled
Yes, in fact the same spammers seem to now be registering lots of users, and spamming with them. I'm hoping that there will be a way to reverse the spam 'automagically' since they are inserting the same string into all of the records. kc On 11/24/12 9:57 AM, Alan Millar wrote: There's been a lot of spam for a long time. This is a good change. Unfortunately there are also OpenLibrary spammers who register, so this won't eliminate the spam, just reduce it. - Alan *From:* Karen Coyle kco...@kcoyle.net *To:* Open Library -- general discussion ol-discuss@archive.org *Sent:* Saturday, November 24, 2012 7:14 AM *Subject:* Re: [ol-discuss] Anonymous edits disabled There was a big bunch of recent spam. - kc On 11/24/12 7:04 AM, Roger Loran Bailey wrote: Why? On 11/23/2012 11:30 PM, Anand Chitipothu wrote: Hi, We've disabled anonymous edits on Open Library. Now only logged in users can edit or add records to Open Library. Anand ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org
Re: [ol-discuss] Project Gutenberg editions
On 10/27/12 12:06 PM, Ben Companjen wrote: Hi all, Since I received my e-book reader a couple of weeks ago, I have been looking at out-of-copyright books to load. The few books that I downloaded as EPUB from the OL / Internet Archive contain many OCR errors. Rather than correcting these by hand just for myself (as OL/IA doesn't provide an obvious way to let me upload a more correct version), I remembered that there is a web place where people gather to improve texts for e-book readers and re-discovered Project Gutenberg [1]. Community members involved with Project Gutenberg produce e-book versions of out-of-copyright books, which can then be downloaded from the website. But whereas OL EPUBs can be linked to a specific edition, the PG EPUBs are mostly reconstructed from the text and harder to link to a paper edition. Hence my following questions: Do people agree that Project Gutenberg editions be seen as separate editions? Yes, definitely. I also think that a corrected OL edition should be stored separately from its original un-corrected OCR. The reason is that at some point it may be desirable to go back and see what was there before the correction. Ideally, there could be versioning and forking, much like software. Do people agree the release date given by the project is the publish date? The release date of the digital edition is a publish date, but I think that it isn't sufficient. If the text is derived from a physical book, then the date of the book is also needed. I also would like to see original dates where known -- that is the original publication date of the text. Otherwise, Moby Dick and Origin of Species end up being presented as 21st century texts, which really messes up the cultural and scientific context. 
Do people agree that there is some sense in PG editions' formats being something like E-book or Electronic resource They are electronic resources, but if they are plain text I have a hard time seeing them as ebooks -- to me, ebook implies something more structured than plain text. (Title pages, navigable chapters, etc.) I know not everyone sees it that way. Why are there only (19 | less than 19 | 281) of the 4+ editions [2] in OL? These 19 seem to be linked to IA items, coming from European libraries, although not all seem to be really published by PG (e.g. [3]). In the latest data dump, there are 281 editions with at least one PG identifier, but they are not listed under publisher PG. Are there people around who know about connecting or importing the PG catalogue? I believe that the PG books are not in the OL/IA workflow for a reason, although I don't recall the reason. It may have to do with the availability of bibliographic data? Note, though, that from what I understand there is no new development happening on OL at the moment and I don't know if it will be taken up again. There seems to be no staff dedicated to the project. So it's unlikely that any new data types will be added. kc Are there other known publishers named Project Gutenberg? (Feel free to answer a subset of these questions :) ) Ben [1] http://www.gutenberg.org [2] http://openlibrary.org/publishers/Project_Gutenberg [3] http://openlibrary.org/books/OL20478553M/The_Lady_of_the_Lake ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet ___ Ol-discuss mailing list Ol-discuss@archive.org http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org
Re: [ol-discuss] Can we manage editions yet?
Wow, that was unexpected. When I changed the edition that I had listed to have the correct Work title, it changed the Work title for all of them. Ben, that's probably what you meant about it saving the unchanged Work as well? I'll wait and see if it gets sorted out, but you can look at the versions here: http://openlibrary.org/works/OL72908W/Miko._It_was_me_Mom!?m=history
I looked up the Miko books in WorldCat and they all start with "Miko." followed by the edition name. http://www.worldcat.org/search?qt=worldcat_org_all&q=miko That will be because that's how the titles are on the books themselves.
kc
On 7/24/12 3:19 PM, Karen Coyle wrote:
> Ben, I'm sure they can't be moved in a kind of mechanical sense, but have you tried changing the underlying edition data? The question then is whether they get re-evaluated for Work belonging based on the new data. If there is no re-evaluation process based on updates, then my suggestion won't work. I do recall that we've talked about this or something very like this in the past, but I don't remember the outcome. kc
> p.s. I'll try a few and see if anything happens.
> On 7/24/12 12:26 PM, Ben Companjen wrote:
>> As far as I know, editions that have a work cannot be moved to another work using the edit form. If my understanding is correct, you would need a script and API access to correct it. As a side note: I noticed that (sometimes?) when you update information about the edition (on the 'edition' tab), the unchanged work is saved too. Only editions without a work will be assigned a new work when edited. And I don't think editions (or works) can be merged by normal users yet. Regards, Ben
>> On 24 July 2012 18:57, Sarah Breau smbr...@hotmail.com wrote:
>>> I recall there was some discussion here a few months ago about managing editions. I ask because I was working on a book that is in a series, and for some reason all the books in the series are grouped together as one work. I need to split them into separate works, but can't figure out how to do it. Also, can we merge duplicate editions yet? Sarah
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Series: how do people enter them?
I don't remember exactly why series wasn't included, but one issue is that there isn't a readily available Dublin Core field for series. We could use the one from BIBO: http://purl.org/ontology/bibo/Series That property appears to be pretty loose, taking a single literal, so it should accommodate the OL series info.
kc
On 6/19/12 1:50 AM, Ben Companjen wrote:
> On 18 June 2012 23:17, Alan Millar grunthos...@yahoo.com wrote:
>>> In the edit form publisher names, publish places and possibly other fields (including series) are split on semicolons, so for humans that should work. For bots using the API, they can simply be added as list values.
>> The series field does not appear to do that, at least based on the form. See for example http://openlibrary.org/books/OL7418657M/Call_To_Arms http://openlibrary.org/books/OL7418657M.json It shows a single field with the embedded semicolon.
> Hmm, that's too bad. I just tried to resave it (with some extra information, otherwise it wouldn't be saved at all (which is nice!)), but that didn't help. By the way, it was AMillarBot that string-joined the two series and put it in one list item, instead of adding one item for each series. Could you perhaps change that?
>> (I notice the rdf format doesn't show the series or a number of other things. I presume it is a strictly-defined subset of the data?)
> Ah, the RDF - the reason I joined this list ;) My first thread [1], and second [2] on this topic have not really led to what I would have liked to see [3] (still open for comments :-)). It's created by a template that reads from the database, so it could include any and all information we like it to. The hybrid HTML/Python template for Edition RDF is at [4]. One would need a suitable RDF property to add to the template so that the semantics of the field are aligned to the definition of the property. For series, we would still need a definition. (In RDF, the nicest thing to do is to have a URI to refer to. If there is a URI for a series, that is much clearer than a varying series title. It would need a change in the types schema, though. Same goes for publishers, places, formats, etc.)
>> Is there a schema to examine somewhere, which says which way the series field should go? Or is this one of those things where the API can make anything an array/list?
> http://openlibrary.org/type is the list of types in OL: /type/edition for Editions (which has an array/list field for series), /type/work for Works, etc. As far as I know, there is no explicit description of what contents should go in what field (there is only the type, mostly strings). I would like that.
>> - Alan
> Ben
> [1] http://www.mail-archive.com/ol-tech@archive.org/msg00478.html
> [2] http://www.mail-archive.com/ol-tech@archive.org/msg00556.html
> [3] https://github.com/internetarchive/openlibrary/pull/136
> [4] https://github.com/internetarchive/openlibrary/blob/master/openlibrary/plugins/openlibrary/templates/type/edition/rdf.html
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
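The mismatch discussed in this message, a single list item holding two series joined by a semicolon, could be normalised roughly as follows. This is only a sketch: it assumes the edition JSON carries a `series` list of strings, as the .json URL in the message suggests, and both the function name `split_series` and the sample values are hypothetical rather than anything taken from the Open Library code.

```python
def split_series(series_values):
    """Split semicolon-joined series strings into one list item per series."""
    result = []
    for value in series_values:
        for part in value.split(";"):
            part = part.strip()
            # Skip empty fragments and avoid adding the same series twice.
            if part and part not in result:
                result.append(part)
    return result


# Hypothetical record in the shape described in the thread:
# one list item that actually contains two series.
edition = {"series": ["Series A, 1; Series B, 2"]}
edition["series"] = split_series(edition["series"])
print(edition["series"])
```

A series title that legitimately contains a semicolon would be split incorrectly by this approach, so a bot using it would probably want a human to review the changes.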
Re: [ol-discuss] Series: how do people enter them?
Series are admittedly a mess. Part of the mess has to do with how one defines a series, part has to do with different ways of representing series in the metadata. In library data, a series is generally only recorded when it is a formal numbered series, and those are of the form:
$a Author $t Title $v Number
(most are just title and number). I think these get brought into OL as:
- Series title, number
- Great books, v. 23
Only recently have some libraries started to create series entries for things like Harry Potter and other groups of books that have common themes. These are given the same formatting as formal series. Amazon unfortunately puts the series in the title field, in parentheses, so these don't get into OL as series at all:
- Understanding Literature (Scribner literature series)
AND they mess up the titles and keep the books from merging. (AAArgh!)
If there are multiple series, it seems that they should be in different series fields. I personally would prefer that the series title and number be in separate fields (like OL has done for numbering in tables of contents) because that way it would be easier to create a page for the entire series, given that the series titles would be identical in that case, and they aren't when the number is included in the series field. It would be great to have a page for the Harry Potter series or the Discworld series or the Great Books series. Since that isn't how it is, I would probably opt for
- series title, number
with number including the designation, such as v. for volume or part if that's what is on the item. I agree that some degree of helpful suggestion would be good, and if not heavy-handed wouldn't turn off the folks who hate being told what to do.
kc
On 6/17/12 2:23 PM, Ben Companjen wrote:
> Hi, How do people enter series and (optional) part numbers? I hope there is a standard way of entering this information (you need to be in Librarian mode to be able to see the series field), but I've seen several ways:
> - The Story of Civilization, Part III is the example.
> - Series name -- number is in many Canadian microform editions.
> - Harry Potter (5) and similar is another form.
> The example is simple and does not provide guidance on what to do when there is a comma in the title or whether to add Part or stick to a number. And books may be part of several series - the /type/edition supports multiple series - but what delimiter should be used to enter multiple series+parts? I know there are no rules for this (maybe there's an exception for the delimiter), but it sometimes makes it difficult to decipher the meaning from the contents of records.
> Ben
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
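To make the three entry styles from this thread concrete, here is a rough sketch of how a script might pull a series title and number apart. The patterns and the name `parse_series` are assumptions made for illustration; nothing here reflects an existing Open Library rule, and the thread itself notes that no such rule exists.

```python
import re

# One pattern per form mentioned in the thread, tried in this order.
_PATTERNS = [
    re.compile(r"^(?P<title>.+?)\s+--\s+(?P<number>.+)$"),     # Series name -- number
    re.compile(r"^(?P<title>.+?)\s*\((?P<number>[^()]+)\)$"),  # Harry Potter (5)
    re.compile(r"^(?P<title>.+),\s*(?P<number>[^,]+)$"),       # Title, Part III
]


def parse_series(text):
    """Return (title, number); number is None when no known form matches."""
    text = text.strip()
    for pattern in _PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("title").strip(), match.group("number").strip()
    return text, None


for sample in ["The Story of Civilization, Part III",
               "Series name -- 12",
               "Harry Potter (5)",
               "Great books, v. 23",
               "Discworld"]:
    print(parse_series(sample))
```

The comma form splits on the last comma, so a series title that contains a comma but has no number would be split wrongly. That is exactly the ambiguity raised in the original question, and it is an argument for keeping title and number in separate fields.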
Re: [ol-discuss] Want to help merge %en% authors? was: Want to help merge United States authors?
Well, it's possible that you shouldn't listen to me :-) because I just looked at the code and it does indeed include the 110 as an author -- whereas my memory was that we didn't include the corporations as authors because most non-librarians wouldn't think of them as authors. So much for my memory. However, I wish we HAD moved the corporations to a field other than author.
kc
On 5/19/12 8:01 AM, Ben Companjen wrote:
> On 19 May 2012 16:11, Karen Coyle kco...@kcoyle.net wrote:
>> I took a look at some of these. Many are corporate authors, and I saw some where the author names are identical. It isn't clear to me why these didn't get auto-merged. Maybe it's worth having one of the programmers take a look at this before we hand merge them all.
> Since most duplicates were last edited by ImportBot in 2008, I get the impression that maybe ImportBot wasn't capable of checking whether the author was already in the database. Or, but this is speculation, it's a long-running April Fool's joke (seeing that many were last edited April 1st, 2008). ;)
>> It's also entirely unclear to me why we ended up with these in authors when they are coded as corporate authors:
>> 110 00 $aPUNJAB. GOVERNMENT
>> 110 1 $aWest Sussex (England).$bCounty Planning Department.
>> I thought that corporate authors were moved to collaborator rather than the author field. So I'm thinking that something went wrong with the loading. The two above are from Toronto library and Talis, so it's not just one source.
> To me, it's getting more and more obvious that that *may* never have happened. At least it seems it didn't happen during imports in 2008. And without written cataloguing rules and so many examples of corporate identities in the author fields and records, you have been the only source for me saying that they should go in the contributors. No offense to you, you're a great source :)
> On a perhaps related note: http://openlibrary.org/authors/OL2734614A/Various_Authors :)
> Ben
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
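The routing that this thread wishes had happened can be sketched as a small lookup. In MARC 21, tag 100 is a personal name main entry, 110 a corporate name and 111 a meeting name, with 700/710/711 as the matching added entries. The policy below (personal names become authors, everything else contributors) and the names `NAME_TAGS` and `target_field` are hypothetical; as the message above says, the actual import code treats 110 as an author.

```python
# MARC 21 name tags: main entries (1XX) and added entries (7XX).
NAME_TAGS = {
    "100": "personal",
    "110": "corporate",
    "111": "meeting",
    "700": "personal",
    "710": "corporate",
    "711": "meeting",
}


def target_field(tag):
    """Hypothetical policy: only personal names are stored as authors."""
    kind = NAME_TAGS.get(tag)
    if kind is None:
        return None  # not a name field
    return "authors" if kind == "personal" else "contributors"


print(target_field("110"))
```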
Re: [ol-discuss] manual editions joining (again sorry)
I don't think there's a way to merge them manually, but the two differences in the editions that could be keeping them apart (and it probably takes both of these to do that) are:
title: Diseases of the intestines; v. title: Diseases of the intestines; subtitle: a textbook for practitioners and students of medicine
and
publisher: W. Wood and company v. publisher: William Wood,
The publisher counts for less than the title, so adding the subtitle to the shorter one may be enough for them to be considered the same work. So I'd recommend editing the editions (and I'll not touch them myself so we don't interfere with each other), and then wait a day or so (I don't know if that's enough) and see if they come together in the work.
kc
On 5/9/12 11:47 PM, r...@ark.in-berlin.de wrote:
> Karen, what triggered the complaint was this import: http://openlibrary.org/works/OL16621408W/Diseases_of_the_intestines?m=history but we had already http://openlibrary.org/works/OL7882426W/Diseases_of_the_intestines I want to merge them manually. Is this possible? Regards, ralf
> On Wed, May 09, 2012 at 11:00:14AM -0700, Karen Coyle wrote:
>> Ralf, can you send an example? Publisher name should affect combining editions but not linking editions to works, so I should look at this. kc
>> On 5/9/12 9:00 AM, r...@ark.in-berlin.de wrote:
>>> May I ask if there is some progress in the possibility to join editions manually into one work? Even now, new works are added with the same author/title/year, where just a different form of the publisher's name causes it to be occupying a different slot. Regards, ralf
>>> PS: My condolences for having chosen PHP as a language. Maybe it's what makes the trees grow beyond reach? http://me.veekun.com/blog/2012/04/09/php-a-fractal-of-bad-design/
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] manual editions joining (again sorry)
Ralf, the Work on OL tries to follow the FRBR definition of Work [1]. So different years, even different translations, are the same work if the text expresses the same content. Of course, making a bright line distinction on Work is difficult, but some aspects of it are not difficult. It is the same Work if the author and title are the same, or, in the case of translations, if the author and original title are the same. I think of a Work as what we might discuss if you read Thomas Mann in German and I read it in English, but we could talk about Magic Mountain and what we thought of it even so. However, a movie made from the book would be a different work (and as we know, movies that are faithful to the text are rare, if they exist at all, although they use some of the concepts from the book). It gets more complicated with movies and music, but with books it's easier.
kc
[1] http://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records
On 5/10/12 9:28 AM, r...@ark.in-berlin.de wrote:
> Many thanks for your work. I see now you're considering works identical iff it's an edition of the same year, i.e., the 1900 and 1904 edition are different works, and much more, the original (German language) edition and its translation. Is this correct? Short of a manual version of merge, at least it should be clear when to expect an automagic merge, and I know of no documentation where I could find this (ah, where is that bot source, by the way? buried in the OL source presumably) Regards, ralf
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] manual editions joining (again sorry)
Ralf, can you send an example? Publisher name should affect combining editions but not linking editions to works, so I should look at this.
kc
On 5/9/12 9:00 AM, r...@ark.in-berlin.de wrote:
> May I ask if there is some progress in the possibility to join editions manually into one work? Even now, new works are added with the same author/title/year, where just a different form of the publisher's name causes it to be occupying a different slot. Regards, ralf
> PS: My condolences for having chosen PHP as a language. Maybe it's what makes the trees grow beyond reach? http://me.veekun.com/blog/2012/04/09/php-a-fractal-of-bad-design/
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Want to merge authors? Try Shirley Institute/Wira Joint Conference (1977 Manchester, Eng.)
OK, now I get that too. No idea what I did differently before but... never mind. This has got to be a bug. The same item has been entered who knows how many times, and at least some of the IDs are consecutive. http://openlibrary.org/authors/OL4602522A http://openlibrary.org/authors/OL4602523A http://openlibrary.org/authors/OL4602524A Ditto the European law one. http://openlibrary.org/authors/OL4791619A http://openlibrary.org/authors/OL4791620A etc. I think there was another case like this before.
kc
On 5/9/12 2:57 PM, Ben Companjen wrote:
> I am, yes. I loaded the ~6.9 million author records from April's dump into MySQL, did a GROUP BY slug (where slug is the author name in lower case, without spaces and punctuation) and got shirleyinstitute/wirajointconference1977manchestereng: 10047. I then searched for Shirley institute 1977 as an author on the website and got 10,047 hits. And I still do: http://openlibrary.org/search/authors?q=shirley+institute+1977 Second in the list of slugs is colloquyoneuropeanlaw1981messinaitaly: 2368 http://openlibrary.org/search/authors?q=colloquy+1981+messina Ben
> On 9 May 2012 23:44, Karen Coyle kco...@kcoyle.net wrote:
>> This is rather odd. When I look up Shirley institute as an author and find the 1977 joint conference I get 2 work titles, each that has only 1 edition. Ben, are you working with the dump? kc
>> On 5/9/12 6:05 AM, Ben Companjen wrote:
>>> Hi, Although I found 341 duplicates of President Clinton a lot yesterday, there is still the author that goes by the name Shirley Institute/Wira Joint Conference (1977 Manchester, Eng.). There are a whopping 10,047 authors with that name! Merging those manually is only for those who desperately need an extremely boring task :) Looking at the subject and book titles in the search results, I think one MARC record was imported many times without duplicate detection, so merging the authors would still leave some 1 duplicate works/editions. Any idea how to best solve this? Ben
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
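The duplicate hunt described in this thread (lower-case the author name, strip spaces and punctuation, then group) can be approximated outside MySQL. The sketch below is a Python analogue rather than Ben's actual query. It keeps the slash because the slug quoted in the message still contains one; that detail, the function names and the sample list are all assumptions.

```python
import string
from collections import Counter

# Characters removed when building a slug. The slash is kept because the
# slug quoted in the thread ("shirleyinstitute/wira...") still contains it.
_DROP = (set(string.whitespace) | set(string.punctuation)) - {"/"}


def slug(name):
    """Lower-case the name and drop spaces and (most) punctuation."""
    return "".join(ch for ch in name.lower() if ch not in _DROP)


def duplicate_groups(names, minimum=2):
    """Return (slug, count) pairs for slugs shared by at least `minimum` names."""
    counts = Counter(slug(name) for name in names)
    return [(s, c) for s, c in counts.most_common() if c >= minimum]


# Tiny made-up sample; the real input would be the author dump.
names = [
    "Shirley Institute/Wira Joint Conference (1977 Manchester, Eng.)",
    "Shirley Institute/Wira Joint Conference (1977 Manchester, Eng.)",
    "shirley institute/wira joint conference 1977 manchester eng",
    "Someone Else",
]
print(duplicate_groups(names))
```

Grouping by slug only finds candidates. Two different people can share a slug, so the counts are a starting point for review, not a merge list.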
Re: [ol-discuss] Merging editions
I agree that these four look like they should have been merged. We'll need to find out from the merge master Edward Betts if he can understand what happened here, and if there is a merge-bot running that picks up changes. Edward?
kc
On 2/29/12 9:07 AM, Ben Companjen wrote:
> Are editions merged automatically? I just found http://openlibrary.org/works/OL13847301W/Conceptual_modeling which has four editions that are exactly the same (except for the last one which has an additional GoodReads ID). The last edit on all of the editions was a work merge, performed in August 2010. So I think it may not be a matter of waiting before a bot (or human) merges the editions. Ben
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Merging editions
Dimensions aren't part of the algorithm. They vary a lot -- not because the books change size and shape, but because they can be in either cm or inches, some are just height, some are hxwxd -- there are just too many variables! So let's see if this change results in a merge... kc On 2/28/12 3:46 PM, Laurence Penney wrote: Thanks for this. I’ve just done the edit. Since I had the book in front of me I also added dimensions. Now I notice the other edition already has a different set of dimensions. Hope this doesn’t prevent the merge! - L On 28 Feb 2012, at 23:36, Karen Coyle wrote: I'll let you try this, so we don't create confusion, but try changing the date on the one from 2002 to 2004, and wait a bit (I don't know how long... maybe a day?) and see if they merge. They have the same ISBNs and same titles, so it looks to me like the date difference might be keeping them apart. kc On 2/28/12 2:27 PM, Laurence Penney wrote: A tweet[1] inspired me to recommend[2] Open Library as the best place to link when talking about a book — instead of Amazon or its publisher’s site. However the book I was going to choose as a nice example for the original poster has duplicated edition info, and I cannot see a way to merge them. I’ve confirmed with the author there’s just one edition (2004). So can I merge or is it a superuser thing? 
http://openlibrary.org/books/OL22593796M/Dutch_type http://openlibrary.org/books/OL9109013M/Dutch_Type - L [1] https://twitter.com/#!/typotheque/status/174589813347979264 [2] https://twitter.com/#!/Lorp/status/174595628993748992 -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Project Gutenberg metadata
On 2/11/12 1:45 PM, Lars Aronsson wrote:
> On 02/11/2012 06:59 PM, Lee Passey wrote:
>> I have an ugly RDF file of metadata from Project Gutenberg (actually, "ugly RDF" is redundant) that I would like to bulk load into Open Library. I have very little free time to do anything about this. Is there anyone out there with experience with [shudder] RDF and bulk upload who could help me with this project?
> Which is the OpenLibrary's preferred format for metadata from a book scanning project?
I believe that most scanned books are paired with a library catalog record, which means that the format is MARC. However, the Open Library has also taken in data from Amazon. There are already some bug reports about ideas to connect OL to PG works.
kc
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Is the ISBN search not working?
I just got a good result on these: 0805088113 978-0805088113 Could you give some of the ISBNs that have failed for you?
kc
On 2/11/12 9:27 AM, Roger Loran Bailey wrote:
> Every time I try searching by ISBN I get no results and have to do the search again by title. Might that feature be broken?
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
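The two numbers tested in this reply are the ISBN-10 and ISBN-13 forms of the same identifier, and the relationship can be checked with the standard ISBN-13 check-digit rule. The sketch below shows only that arithmetic; the function names are made up, and it says nothing about how the Open Library search actually normalises ISBNs.

```python
def clean_isbn(text):
    """Keep only digits and a possible 'X' check character."""
    return "".join(ch for ch in text.upper() if ch.isdigit() or ch == "X")


def isbn10_to_isbn13(isbn10):
    """Convert an ISBN-10 to ISBN-13 by prefixing 978 and recomputing the check digit."""
    digits = clean_isbn(isbn10)
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    core = "978" + digits[:9]  # drop the old ISBN-10 check digit
    # ISBN-13 weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


print(isbn10_to_isbn13("0805088113"))  # 9780805088113
```

Comparing the cleaned forms means that "978-0805088113" and "9780805088113" are treated as the same key, which is one way hyphenated input could be made to match.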
Re: [ol-discuss] Author Merge (and other merges)
Quoting rp braunov rp.brau...@googlemail.com:
> Another problem I've seen is that it's not clear how to write titles: in Cyrillic letters if the original title is in that language, or in transcribed Latin letters? Example: There is a book taken from the Library of Congress ( http://openlibrary.org/works/OL16234335W/%D0%9D%D0%B0%D1%80%D0%BE%D0%B4%D1%8B_%D0%90%D1%84%D1%80%D0%B8%D0%BA%D0%B8) where the title was transcribed into Latin letters. I've changed this into Cyrillic.
rp, Before Unicode was available, libraries in the US transliterated all non-Latin scripts into a standard Latin-ized equivalent. That is why you see the Latin letters in the title. It would be good to keep both versions in the record, perhaps with the Latin-ized one as an added title since some people are accustomed to searching for them in that way. Also, since many searchers will have Latin-only keyboards, they can do a search for Voina i mir but they could not search on the Cyrillic characters.
> Next problem: the German edition is printed in 2 volumes. I've added them as two editions of that one book?! Not very good/logical, I think.
Works printed in multiple volumes should be a single edition. In the pagination field you can put "2 v." Here's an example: http://openlibrary.org/books/OL14017106M/Krieg_und_Frieden That way the edition page represents the whole thing. There are still questions about how to handle tables of contents that go across volumes, but it's a known problem.
kc
> I hope for some clarification and help. Kind regards R. P. Braunov
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
[ol-discuss] W3C LLD Call for Review
What follows is the call for review of the W3C library linked data incubator group draft final report. Although I know everyone is very busy, I encourage you to take a look at this report and give comments and suggestions. Commenting is very easy in the blog, listed below. [apologies for cross-posting] -- W3C Library Linked Data Incubator Group CALL FOR PUBLIC COMMENT The W3C Library Linked Data Incubator Group (http://www.w3.org/2005/Incubator/lld/) has been chartered from May 2010 through August 2011 to prepare a series of reports on the existing and potential use of Linked Data technology for publishing library data. The group is currently preparing: -- A report http://www.w3.org/2005/Incubator/lld/wiki/DraftReportWithTransclusion which consists of: Benefits; Vocabularies and Datasets; Relevant Technologies; Implementation challenges; Recommendations -- Use Cases, a survey report describing existing projects http://www.w3.org/2005/Incubator/lld/wiki/UseCaseReport -- Vocabularies and Datasets, a survey report http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset We (LLD XG) invite comments from interested members of the public. Feedback can be sent as comments to individual sections posted on our dedicated blog http://blogs.ukoln.ac.uk/w3clld/ or by email to a public mailing list (public-...@w3.org, archived at http://lists.w3.org/Archives/Public/public-lld/ ) using descriptive subject lines such as '[COMMENTS] Benefits section'. Comments will be especially welcome in the next four weeks (through 22 July). Reviewers should note that, as with Wikipedia, the text may be revised and corrected by its editors in response to comments at any time, but that earlier versions of a document may be viewed by clicking on the History tab. It is anticipated that the three reports will be published in final form by 31 August.
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Where do subjects come from?
I know only part of the answer, but in the spirit of part being better than none... see below. Quoting Tom Morris tfmor...@gmail.com: Where do Open Library subject terms come from? Is there any correspondence with the Library of Congress Subject Headings or any other, so called, controlled terminology set? Yes, and no. SOME of the subjects come from LCSH subjects that were in records ingested into OL. Those subject headings were broken up based on their subfielding, which is how OL got the initial list of subject types: subject = LCSH $a or $x; place = LCSH 651 $a or other 6xx $z; time = LCSH $y; person = LCSH 600 $a $c $d. (If you are familiar with the work done by OCLC on FAST, this is similar but doesn't use OCLC's algorithm. http://www.oclc.org/research/activities/fast/) In addition, some of the inverted terms were righted, so "Cookery, French" (although I made that up) becomes "French Cookery". It appears that multiple sources might have been added together because there are, for example, http://openlibrary.org/subjects/dove_(ship) and http://openlibrary.org/subjects/dove_(sloop_:_lapworth) which both refer to the same concept. Some data comes in from Amazon so will have BISAC subject headings that overlap with LCSH headings. In addition, anyone can add a subject to any of the subject fields when adding or editing records. There are also things mixed in which aren't subjects at all, or at least aren't used as subjects, like "Accessible book" and "Protected DAISY". These appear to be format specifiers. Yes, there are also format specifiers that have been added, kind of in the spirit of no-holds-barred tagging. :-) If you've been working with metadata for a while I'm sure you've been involved in discussions of "when is a type really a subject?" Users come into libraries looking for "chick lit" or "DVDs" with no idea whether they are asking for a literary genre, a subject, or a format.
There are arguments for mushing everything together because users don't know where to look, and other equally valid (IMO) arguments for maintaining some separation between subjects, genres and formats. I guess if there were an obvious right answer we'd all be doing the same thing and not having these discussions. kc Is there a way to trace subject headings back to their source? Are there any other types of relationships between subjects other than just "related" (i.e. co-citation), for example, broader or narrower? What's the difference between a subject and a place or person? Things often appear in two of the three categories, e.g. subject and place, which makes me wonder if they're distinct lists (and why). I guess what I'm really asking for is documentation describing how all this hangs together in the OL context. I looked at the FAQ and http://openlibrary.org/subjects without finding anything illuminating. Tom -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
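The subfield-to-facet mapping described in the reply above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code OL actually ran: the function names and the (code, value) input shape are hypothetical, $z is treated as a place in every 6xx field, and the "righting" of inverted terms is shown as a naive swap around the first comma.

```python
def facets_from_field(tag, subfields):
    """Split one MARC 6xx field into OL-style subject facets.

    tag: MARC tag as a string, e.g. "650".
    subfields: list of (code, value) pairs in field order.
    """
    facets = {"subject": [], "place": [], "time": [], "person": []}
    if tag == "600":
        # person = 600 $a $c $d, joined into one display string
        name = " ".join(v for c, v in subfields if c in ("a", "c", "d"))
        if name:
            facets["person"].append(name)
    for code, value in subfields:
        if code == "a" and tag == "651":
            facets["place"].append(value)      # place = 651 $a
        elif code == "a" and tag != "600":
            facets["subject"].append(value)    # subject = $a
        elif code == "x":
            facets["subject"].append(value)    # subject = $x
        elif code == "z":
            facets["place"].append(value)      # place = other 6xx $z
        elif code == "y":
            facets["time"].append(value)       # time = $y
    return facets


def uninvert(term):
    """Naively turn 'Cookery, French' into 'French Cookery'."""
    head, sep, tail = term.partition(", ")
    return f"{tail} {head}" if sep else term


field = [("a", "Cookery, French"), ("z", "France"), ("y", "19th century")]
print(facets_from_field("650", field))
print(uninvert("Cookery, French"))
```

A blind comma swap like this would also rewrite headings that should stay inverted, so a real pass would presumably need a list of exceptions or human review.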
Re: [ol-discuss] OCLC data going open source?
First, OCLC has not yet decided whether to allow data to be exported with ODC-BY. It's something they are thinking about. (They also mentioned this over a year ago at another meeting.) Second, as I understand it, this is an option for libraries whose catalogs contain WorldCat records, as defined in the OCLC record use policy. It would allow libraries to share their catalog data under the ODC-BY license. If one wants data directly from WC then that is a contract negotiation with OCLC. (OCLC does provide data to Google through such a contract, for example.) I have not heard that OCLC would release the entire (or even portions of) the WC database. The relevant info (and more was said during the presentation, but I can't find that online) is slide 26 [1]: Our preliminary thoughts on Open Data Licensing: * We are considering recommending ODC-BY * Distinguishes between the database and its contents (or portions of contents) * [Member DATABASE NAME] would be the name of the member's or group's catalog, and the member or group = the licensor * License notice wording in accordance with instructions in ODC licenses * Still under investigation; your input invited and welcome. kc [1] http://www.oclc.org/multimedia/2011/files/globalcouncil/Buzash_Calhoun_Dunsire_Linked_Data.pdf Quoting Alex Stinson sad...@gmail.com: Hey all, This was brought to my attention recently on a Wikipedia mailing list: http://everybodyslibraries.com/2011/05/24/open-datas-role-in-transforming-our-bibliographic-framework/ Has Open Library been part of opening up the WorldCat stuff? Are we thinking about incorporating that data into our database? Alex User:Sadads -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Serials/Journals Cataloging
I spent most of my in-library career just praying that I'd never have to catalog serials! There are a number of different options/directions. If you are mainly interested in articles, then there are some fairly tried-and-true data elements that make up an article citation. In a way, you treat each article like a separate publication that just happens to need to be located within a journal with an issue, number and starting page number. The library approach to serials is more complex and may not be of interest to you. In library cataloging, the articles are not included, only the serial publications themselves. What gets complex about this is keeping track of the history of the serial: how it has changed name or publisher over many years; changes in ISSN (which is supposed to change when the title changes); changes in the name of the organization that publishes the journal; etc. What is tricky for OL, I think, is that scanning and digitizing usually takes place on bound journal volumes, and the article information isn't there. Then you have the issue of trying to link up an article (which is generally what people are looking for, not a whole volume) with the particular digitized volume. It gets worse when the physical, bound volume doesn't correspond to the numbered volume (like for a really thick journal where it's too big for an entire numbered volume to fit into a bound volume that then gets digitized). I think there are ways to simplify the problem and you definitely do NOT want to try to do it the way libraries do, which is overly complex. The experience that people have with the OpenURL (which is a way of linking articles to the journals themselves) can probably come in handy. All that said, a place to start would be: what is going to be the source of the metadata? There are huge databases of journal article citations.
If you want to start bringing in that data, then you could begin by analyzing how those records might link to journals that the Archive has scanned. kc Quoting George Oates g...@archive.org: Hi all, There's a slim chance I asked this list about serials cataloging last time I poked at it, but I've started looking at it again and wondered if you could help. The time has come to seriously consider adding both multi-volume works and serials cataloging to Open Library, and knowing that it's a notorious cataloging issue, I wanted to reach out to learn from your experience and insight on this issue. I've been doing a survey of somewhat random journals (ones I like + science-y + literary + bookshop visual scans) to see if I can isolate any consistent fields. Along with the recent Minimum Viable Record post I did on the OL blog [1], I'm searching for a minimum set of fields we could enlist to describe serials somewhat generically. Seems like there are: Title, Date, Volume, Issue, but none of these are used consistently, as I'm sure you're aware. I've looked at the LoC Serials Cataloging Issues site [2], which is uber technical. I did however find a useful Catalogers Cheat Sheet (PDF) which was a good overview of MARC handling stuff [3]. So, I'm wondering if any of you happen to be a serials cataloger, or know of anyone who might be interested to talk with Open Library about how we might do this well... Or, if you know any useful web-based resources, I'd love links!
Cheers, george [1] http://blog.openlibrary.org/2011/04/11/minimum-viable-record/ [2] http://www.loc.gov/acq/conser/issues.html [3] PDF: http://www.loc.gov/acq/conser/pdf/CheatSheetforCSR.pdf -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
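To make the article-versus-bound-volume problem from the reply above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the class names, the field names, the scan identifier and the sample data are hypothetical, and a real matcher would need fuzzier title matching than a lower-cased comparison.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ArticleCitation:
    """The tried-and-true elements that locate an article in a journal."""
    article_title: str
    journal_title: str
    volume: int
    issue: int
    start_page: int


@dataclass(frozen=True)
class ScannedVolume:
    """One physical bound volume that was digitized (hypothetical record)."""
    identifier: str
    journal_title: str
    volume: int
    first_issue: int   # a thick numbered volume may be bound in parts,
    last_issue: int    # so each scan covers only a range of issues


def find_scan(citation: ArticleCitation,
              scans: List[ScannedVolume]) -> Optional[ScannedVolume]:
    """Return the scanned bound volume that should contain the article."""
    for scan in scans:
        if (scan.journal_title.lower() == citation.journal_title.lower()
                and scan.volume == citation.volume
                and scan.first_issue <= citation.issue <= scan.last_issue):
            return scan
    return None


scans = [
    ScannedVolume("examplejournal_v12_pt1", "Example Journal", 12, 1, 6),
    ScannedVolume("examplejournal_v12_pt2", "Example Journal", 12, 7, 12),
]
article = ArticleCitation("A made-up article", "Example Journal", 12, 9, 301)
match = find_scan(article, scans)
print(match.identifier if match else "no scan found")
```

Recording the issue range on each scan is one way to cope with a numbered volume that was split across two bound volumes.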
Re: [ol-discuss] List of Contents - layout
I think you've gotten too many bars in the markup. I removed the bar from the beginning of one of the lines (Parentage) and it moved over. The examples above do not have a bar after the asterisks, so I think removing that first bar may do the trick. kc Quoting Alan Merryweather iopa...@virgin.net: Would someone please have a look at http://openlibrary.org/works/OL415033W/Berlioz and note the bar** on RH side. Maybe the bar on LH side is not needed, though using it reflects the layout of the printed Index. This is the third index I've entered; the others were lengthy and worked OK but were less complex. TKS ** its name and keystroke please. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Problem with common names
Quoting Sarah Breau smbr...@hotmail.com: Patrick, my workaround is to edit the author name by adding something unique to it (e.g. change it to David Clarke this is the one), then wait for the database to update, then go into the work and change the author to David Clarke this is the one then wait for the database to update, then go back into the author record and change the author name back. It takes FOREVER but it works every time. ;-) Wow. Karen, I don't see how your workaround will work. OL only shows a certain number of authors in the author drop-down field, and if there are dozens with your author's name and your author only wrote one or two books, your author will not show up on that list. So it sounds like there needs to be a way to expand that list, a kind of more... link. Or, as Patrick noted, to be able to input the actual author ID. (At which point the system might be able to show you the display form so you can tell you got it right.) I had mistakenly thought the issue was about author merging. Sorry. Merging is imperfect because when you do an author merge, OL does not change the author field in all the works to the same author record, it just groups author records together. This has caused me problems. It is better to empty out all the author records first by switching all the works to the correct author, and then do the merge. I'm trying to picture this. I can imagine that different works for the same author point to different author entries... but it seems that when those author entries are merged, that merge should affect all works. Nope, I can't get a mental image of this one. I hope someone who knows the answer posts! kc Of course, this would not work in Patrick's case because there are too many author records to select the one he wants in the author field... I would say that after the inability to merge works, this is the biggest problem I have in editing on OL. 
Sarah -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Wikipedia Citation option
Some author names in last, first order could be retrieved from the original data when library data was the source of the OL record. In fact, it would be great to connect the OL person names to the Virtual International Authority File (http://viaf.org) because that would also provide display forms from other library communities should OL begin to receive records from non-US library sources. Of course, it would also be great if VIAF could link up to OL, Amazon, DBPedia, and LibraryThing for popular but unofficial forms of author names. There are a number of projects that are gathering names of academic researchers/authors, mainly from academic journals. Here's one: http://people.bibkn.org/ Since some of these people also write books, that would be an interesting connection. kc Quoting Alan Millar amillar...@gmail.com: Splitting a single name into first names and surname is doable, but not entirely trivial. And then there are the ones like "Dr. John van Smith, Jr., Ph.D." I've seen several different orderings of last-name-first for names like this. I don't know if grammarians or librarians even agree on it, let alone programmers :-) And the authors list in OL has some fun variations, from trying to straighten them back out. My favorite is when it has a role tacked on like "editor in chief" to top it off... - Alan -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
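As a rough illustration of why the split is "doable, but not entirely trivial", here is a naive sketch. It is an assumption-laden example, not anything OL uses: the title, suffix and particle lists are hypothetical and far from complete, and the ordering it produces is only one of the several conventions mentioned above.

```python
TITLES = {"dr.", "mr.", "mrs.", "ms.", "prof."}
SUFFIXES = {"jr.", "sr.", "ph.d.", "m.d.", "ii", "iii"}
PARTICLES = {"van", "von", "de", "der", "la", "di"}


def invert_name(display_name):
    """Turn 'John van Smith' into 'van Smith, John' (one possible ordering)."""
    tokens = [t.strip(",") for t in display_name.split()]
    tokens = [t for t in tokens if t]
    # Drop leading titles such as "Dr."
    while tokens and tokens[0].lower() in TITLES:
        tokens.pop(0)
    # Peel trailing suffixes such as "Jr." and "Ph.D.", keeping their order
    suffixes = []
    while tokens and tokens[-1].lower() in SUFFIXES:
        suffixes.insert(0, tokens.pop())
    if len(tokens) < 2:
        return " ".join(tokens + suffixes)
    # The surname is the last token plus any particles directly before it
    start = len(tokens) - 1
    while start > 1 and tokens[start - 1].lower() in PARTICLES:
        start -= 1
    surname = " ".join(tokens[start:])
    given = " ".join(tokens[:start])
    result = f"{surname}, {given}"
    if suffixes:
        result += ", " + " ".join(suffixes)
    return result


print(invert_name("Dr. John van Smith, Jr., Ph.D."))
print(invert_name("Mark Twain"))
```

Names with a role tacked on, multi-word surnames without a particle, or family-name-first conventions would all defeat this, which is one argument for pulling the inverted form from the original library data where it exists.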
Re: [ol-discuss] Problem with common names
Have you considered merging the disparate author entries for the David Clarke 1972-? If you bring the author entries together, I believe that the bibliographic record will be updated automatically (although with a bit of a delay -- at least a few minutes). If you click on author on the home page and type in david clarke you can see all of the names matching those 'keywords', among which will be the ones you wish to merge. (Merge link is in upper right side of results page.) kc Quoting Patrick Conley p...@conley.de: I wrote a mail to add the book http://openlibrary.org/works/OL8095745W/Diese_merkwürdige_Kleinigkeit_einer_Vision to the author David Clarke (1972 -) http://openlibrary.org/authors/OL6890022A/David_Clarke Apparently it's not possible, even for OL staff, because Clarke is a common name (too many hits). In the old OL version you could add the exact author, using his OL number. Is it possible to restore the old functionality? -patrick -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Book ID harmonisation
Quoting Richard Light rich...@light.demon.co.uk: In message AANLkTi=s58otwk+-pi5q-o8mmnjfxsm3tnglxosdu...@mail.gmail.com, Frankie Roberto fran...@frankieroberto.com writes I see a links tab on the Work edit page that lets you add links with a URL and a display string. Is that what you need? Not quite - a URL isn't always the same as an identifier (e.g. a LibraryThing id). I've used this to add a few Wikipedia URLs though. Yes, ideally we would also want a place to add the corresponding dbpedia identifiers. Wouldn't many of these identifiers be resolvable URIs? Although I, too, am reluctant to mix identifiers and locators, it seems to be inevitable since so many are both (Wikipedia page URLs/URIs and OL URLs/URIs for example). But you are right, there are many identifiers that aren't in an http resolvable format (ISBN notably, also OCLC number, although these are not at the work level), so it does make sense in the meanwhile to have an ID field available for Works. In fact, at some point we should start seeing ISTC (International Standard Text Code) IDs -- although they will fall somewhere between the Work and the Edition (by OL's definition), so that may prompt the creation of the Expression layer in OL. (Although I personally find the expression entity to be problematically ambiguous.) kc Richard -- Richard Light -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Can/should DVD's be listed on Open Library?
By DVDs, do you mean films on DVD? Your question brings up an idea... of linking OL books to IMDB (or another movie database) pages for movies made from books. kc Quoting Stuart Fanning stuart.fann...@ntlworld.com: I ask this as I recently noticed that most DVD's have ISBN's. I know Audiobooks are already listed so what is the policy re DVD's? Stuart -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
[ol-discuss] Principles for Open Bibliographic Data
** apologies for cross-postings ** The Open Bibliographic Data Working Group of the Open Knowledge Foundation has published a set of principles for open bibliographic data. [1] These principles express a philosophy of openness for bibliographic data in support of research and knowledge enhancement: "For society to reap the full benefits from bibliographic endeavours, it is imperative that bibliographic data be made open - that is, available for anyone to use and re-use freely for any purpose." It is hoped that these principles can become a rallying point for the ongoing discussion in the knowledge community about the ownership and use of this key data. You are invited to visit the OpenBiblio pages [2], read the principles [1], and add your name in support of bibliographic data that is unfettered by proprietary claims [3]. Personal endorsements are welcome, as are endorsements representing institutions or organizations. If you are representing a larger body, please note that in the comments area. Note also that the Working Group is interested in hosting translations of the principles. If you can provide a translation, please contact openbiblio [at] okfn [dot] org. [1] http://openbiblio.net/principles/ [2] http://openbiblio.net [3] http://openbiblio.net/principles/endorse/ -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Principles for Open Bibliographic Data
There is hope on the part of OKFN that Open Library will become a signer, but I don't think that discussion has taken place yet. kc Quoting Frank Lovelace frank.lovel...@gmail.com: On 3/2/2011 12:40 PM, Karen Coyle wrote: ** apologies for cross-postings ** The Open Bibliographic Data Working Group of the Open Knowledge Foundation has published a set of principles for open bibliographic data. [1] These principles express a philosophy of openness for bibliographic data in support of research and knowledge enhancement: For society to reap the full benefits from bibliographic endeavours, it is imperative that bibliographic data be made open - that is, available for anyone to use and re-use freely for any purpose. Does Open Library have any position on this? Last I heard they did not want to compete with WorldCat. -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Author name
Quoting Morten Juhl-Johansen Zölde-Fejér mj...@syntaktisk.dk: I am now merging some various author variations of Gogol, but: Should some kind of detail not reflect the particular spelling applied in the individual book? In library data, the odd field called statement of responsibility transcribes the author exactly as it appears on the title page of the book. So if you are making changes on an edition and turn on Librarian Mode, at the bottom you will see a box for a By statement e.g. by Nikolai Gogol [1]. That's where you can put the name as it appears on the book, if you wish. kc [1] I couldn't find an example for Gogol, but go to this page: http://openlibrary.org/books/OL1270769M/The_tortilla_curtain/edit and click on turn on librarian mode and you'll see what I mean. Note that it is used by libraries even when the name is the same, but you probably won't want to bother unless the name is different. As I was trained in Russian at university, I personally learned the depths of frustration that transliteration issues can bring one to. There are varying conventions in every country, and often several over the course of time. Please advise? Also, I came across two empty entities as well, http://openlibrary.org/authors/OL4020164A/Nikolaij_Gogol and http://openlibrary.org/authors/OL3374482A/Nicholai_Gogol - these should just be nominated for deletion. Sincerely, Morten __ Morten Juhl-Johansen Zölde-Fejér http://syntaktisk.dk * mj...@syntaktisk.dk -- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Merging (or not) corporate author records
Quoting Tom Morris tfmor...@gmail.com: Thanks for the quick reply, Karen. Are you saying that the OpenLibrary policy is to follow what's dictated in the Library of Congress authority file? There would seem to be at least some divergence in that they use a different name order, include birth/death years (sometimes), etc. No, I'm not saying that OL follows LCNA -- you asked for a source to make decisions and the only ones I know of in the area of books are library name authority files. Those files are based on some sensible rules (IMO) so they might be usable as guidance. At the same time, they do some ugly things, like put names in last,first order, which looks very artificial. The nice thing about computers is that they're not limited to having a single name at the top of the 3x5 card. If the record has both Samuel Clemens and Mark Twain, the computer can find it under either one. If they are two separate records, there needs to be a way to link the two (which I don't think OpenLibrary has, does it?). Any author record can have alternate names, and many do. However, OL doesn't currently replicate the see also structure that is in library authority data. I suspect it would be easy enough to add the structure, a bit more tricky to fill in the data. If it is added, it would be nice to see it displayed as: Also writes as... and have the relationship recorded in both/all related records. If the policies aren't set in stone yet though, I'd personally argue for something that's a little more approachable and common sense based. There are two schools of thought on that: 1) some rules are good 2) rules make participation more difficult. Some of us like having rules, some find them constraining. I think this will be a constant struggle. Tom p.s. I don't think VIAF is really usable for OpenLibrary because it doesn't enforce a non-commercial restriction like the OCLC license requires to protect their monopoly. 
I've heard it said that VIAF is hosted at OCLC but does not belong to OCLC. However, the VIAF site page points to the usual legalese about OCLC research projects that includes the "no commercial use" bit. That worries me. There is an entry for VIAF on the CKAN Open Bibliographic Data site that quotes Thom Hickey saying that they (the VIAF collective) have not yet decided on license terms: http://ckan.net/package/viaf Unfortunately, that post is now one year old. At the same time, VIAF is designed as linked data -- and if it isn't open for linking, that makes little sense. Of course, the OCLC restrictions don't make sense, either. kc On Sat, Feb 12, 2011 at 8:48 PM, Karen Coyle kco...@kcoyle.net wrote: The only canonical form that I know of is to follow what is in a library name authority file, such as the Library of Congress name file. You can find it at: http://authorities.loc.gov/ If you search on Adobe you see a long list that includes Adobe Creative Team, Adobe Systems, and others. If you look in the left-hand column there you will see a red button that says either: Authorized heading, or References. A reference is a heading that is not preferred, but would instead lead to the Authorized heading: Adobe Systems Inc. -- see: Adobe Systems The format of the data isn't ideal, so it can be difficult sometimes figuring out what goes with what. This database will probably also answer your other question about pseudonyms. Currently, the Anglo-American libraries (US, Canada, UK, Australia) follow a set of rules that record real names and pseudonyms as separate entries. The idea is that each name represents a persona and that most of the time people who are looking for things are aware of the persona (Mark Twain, Lewis Carroll) so this is what they are likely to look for. They may not know the person's real name.
If you want to check a wider range of name databases, there is a combined database called the Virtual International Authority File (VIAF) at http://viaf.org Many national libraries contribute to that, and you may find names that aren't available from the Library of Congress. You will also see that in some countries different choices are made about how to record a name. VIAF brings them all together -- and gives them a VIAF identifier which I hope we will be able to use in the future to make the catalog more global. If you don't find what you are looking for in these databases (I didn't find Adobe Development Team) then you have to make a decision on your own. The entries in the database may serve to give you a pattern to follow. Note that there are other communities creating name files, in particular there are some academic projects connecting names from academic article databases to actual persons. I don't have these at my fingertips, but they probably cover the OL books as well as the library name databases do. kc Quoting Tom Morris tfmor...@gmail.com: Below are some
Re: [ol-discuss] New power tool for author merges
Thanks, Tom. I took a quick look at a few pages of your list, and have 2 immediate observations: 1 - there are a LOT of corporate authors, and most of those will not have come in on library records (the ones on library records we moved to the contributor field). This makes me wonder if we don't want to do something consistent here -- moving these to contributor? Or are folks comfortable with corporate names in the author field? 2 - of the ones I looked at, a number have already been merged. (Yeah OL users!). You wouldn't by any chance want to update this list? (she asks sheepishly) kc Quoting Tom Morris tfmor...@gmail.com: I've been totally unsuccessful in getting any of the OpenLibrary staff interested in the list of duplicate authors that I generated last spring, so I've decided to open it up to the community. I've modified my duplicate listing program to automatically generate an OpenLibrary author merge URL with all the duplicate IDs. If you are logged in to OpenLibrary and you click on the URL, it will take you to the author merge dialog page where you can select which authors should be merged, which one should be the master, etc. Please note that this is a *power* tool and should be used with great care. There *are* errors in the listing of duplicates, so you should review carefully the set of authors that are being proposed for merger to make sure it's accurate. I've done the first 50 or so, so you'll want to skip ahead in the list to find some that still need work. I'll see if I can enhance the program to skip authors who have already been processed, but for now if you click a link and end up on a page with just one author (or zero authors), that means someone else already took care of this author. Don't worry about running out of work, there are over 7,000 sets of duplicates (with 20K total records), so there'll be plenty for everyone to work on. Here's the tool: http://ol-dupes.freebaseapps.com/ Play carefully! 
Tom
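A minimal sketch of how a merge link like the ones Tom describes might be assembled from a set of duplicate author IDs. The thread does not give the URL format, so the `/authors/merge` path and the repeated `key` parameter are assumptions, and the IDs in the usage line are made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: the merge dialog lives at /authors/merge and takes one
# repeated "key" query parameter per author ID. Neither detail is stated
# in the thread; check against the live site before relying on it.
MERGE_ENDPOINT = "http://openlibrary.org/authors/merge"


def build_merge_url(author_ids):
    """Return a merge-dialog URL for a set of suspected duplicate authors."""
    # De-duplicate while preserving order, so the first ID stays first.
    unique_ids = list(dict.fromkeys(author_ids))
    if len(unique_ids) < 2:
        raise ValueError("need at least two distinct author IDs to merge")
    query = urlencode([("key", author_id) for author_id in unique_ids])
    return f"{MERGE_ENDPOINT}?{query}"


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    print(build_merge_url(["OL1A", "OL2A", "OL1A"]))
```

Refusing to build a link for fewer than two distinct IDs mirrors the case Tom mentions, where a set has already been merged and only one author (or none) is left.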
Re: [ol-discuss] Merging (or not) corporate author records
The only canonical form that I know of is to follow what is in a library name authority file, such as the Library of Congress name file. You can find it at: http://authorities.loc.gov/ If you search on Adobe you see a long list that includes Adobe Creative Team, Adobe Systems, and others. If you look in the left-hand column there you will see a red button that says either: Authorized heading, or References. A reference is a heading that is not preferred, but would instead lead to the Authorized heading: Adobe Systems Inc. -- see: Adobe Systems The format of the data isn't ideal, so it can be difficult sometimes figuring out what goes with what. This database will probably also answer your other question about pseudonyms. Currently, the Anglo-American libraries (US, Canada, UK, Australia) follow a set of rules that record real names and pseudonyms as separate entries. The idea is that each name represents a persona and that most of the time people who are looking for things are aware of the persona (Mark Twain, Lewis Carroll) so this is what they are likely to look for. They may not know the person's real name. If you want to check a wider range of name databases, there is a combined database called the Virtual International Authority File (VIAF) at http://viaf.org Many national libraries contribute to that, and you may find names that aren't available from the Library of Congress. You will also see that in some countries different choices are made about how to record a name. VIAF brings them all together -- and gives them a VIAF identifier which I hope we will be able to use in the future to make the catalog more global. If you don't find what you are looking for in these databases (I didn't find Adobe Development Team) then you have to make a decision on your own. The entries in the database may serve to give you a pattern to follow. 
Note that there are other communities creating name files, in particular there are some academic projects connecting names from academic article databases to actual persons. I don't have these at my fingertips, but they probably cover the OL books as well as the library name databases do. kc Quoting Tom Morris tfmor...@gmail.com: Below are some examples of potential merges for corporate authors. Is there any place that describes whether or not these should be merged and what the canonical form should be? If there's no standard, do folks want to weigh in on what they think the right thing to do is? Tom

Adobe Systems Inc.
Adobe Development Team
Adobe Creative Team

U.S. Government
USGPO [ie. US Government Printing Office]

Rand McNally and Company.
Rand McNally Staff

IEEE Vehicular Technology Society
IEEE Vehicular Technology Conference
Ontario) IEEE Vehicular Technology Conference (48th : 1998 : Ottawa

Delorme Publishing Company
DeLorme Mapping Company [Corporate name change]

Texas Instruments
Texas Instruments Engineering

Dorling Kindersley Inc.
Dorling Kindersley Ltd
DK Publishing [Combination of different subsidiaries, plus a name change over time]
Re: [ol-discuss] Any way to link to a section of a Work?
I think we've got two types of units being discussed here. One is the actual sections of the book, as defined by the publisher or organizer: chapters, table of contents, index, bibliography... etc. Those should be the same for every instance of the book. The other is a way to bookmark any part of the text in an ad hoc manner. This is NOT universal, and anyone can create a new bookmark to suit their needs. It might help to explore them separately. kc Quoting Mike McCabe mcc...@archive.org: John - We'd love to have something like this! For now, the closest thing we have is our BookReader URLs, which easily allow linking to a specific page. What you're describing sounds a lot like the Open Bookmarks project - http://www.openbookmarks.org/ ... which we're participating in. The goal is to have a way of marking a spot or range of text in any book (edition or work) - no matter what the format or vendor. The New York times allows building links that include highlighted text; Open Bookmarks might end up doing something similar. http://open.blogs.nytimes.com/2011/01/11/emphasis-update-and-source/#h[AltBau,2] (this highlights the second sentence in the paragraph beginning '(A)nchor (l)inks (to)...' and ending '(B)ut (a)s (u)sual...' - thus 'AltBau') Open Bookmarks has an open mailing list - feel free to join! Mike On 2/1/11 1:34 PM, John Diane Sumsion wrote: I've wanted to link to pages of books, or at least down to the chapter, or subhead. Take, for example, the following blog post: http://deliberate-thinking.blogspot.com/2010/04/reality-quotient.html In that post, I refer to printed, copyrighted content by URL, but with lame google books URLs that have nothing to do with the structure of the book, and that border on the potentially problematic situation traditionally called deep linking. 
Look for the following text in the blog post:
- text: Demarco's Total Useful Mental Discriminations (TUMD) and link: http://books.google.com/books?id=563gvssRPvkClpg=PR1dq=slackpg=PA72#v=onepage
- text: how capable you are and link: http://books.google.com/books?id=31Qe_e61Y10Clpg=PP1ots=bBbef5O2a3dq=speed%20of%20trustpg=PA185#v=onepage
Is there any way that I could (for a given Work, or Edition) add a list of URLs that are just markers for the sections within a book? Not page linking, but sections as defined by the work itself. No content extraction except perhaps to put the subhead text in the URL itself (either in English, or in the Work/Edition's native language, or both). Now there would be a global permalink for a given chunk of a work. The closest thing I found was OpenLibrary itself with canonical URLs to Works and Editions, so I thought if anyone knew, you all would. Does anything like that exist? If not, have you thought about allowing your users to define and curate sections for Works/Editions that could be treated as permalinks? John...
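The 'AltBau' key in Mike's example can be read as the initials of the first three words of a paragraph's first sentence followed by the initials of the first three words of its last sentence. A minimal sketch of that reading follows; it is inferred from the single example in the thread, not taken from the New York Times code, and the function name and the naive sentence splitting are assumptions made for illustration.

```python
import re


def emphasis_style_key(paragraph):
    """Build a six-letter paragraph key in the style of the 'AltBau' example.

    Takes the initials of the first three words of the paragraph's first
    sentence and of its last sentence. This is inferred from one example,
    not from the actual implementation.
    """
    # Naive sentence split on ., ! or ? followed by whitespace; real text
    # (abbreviations, quotations) would need something more careful.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    if not sentences:
        return ""

    def initials(sentence):
        # Keep the original letter case, as the 'AltBau' example does.
        words = re.findall(r"[A-Za-z0-9']+", sentence)
        return "".join(word[0] for word in words[:3])

    return initials(sentences[0]) + initials(sentences[-1])
```

A key like this identifies a paragraph by its content rather than its position, which is closer to the ad hoc bookmarking Karen describes than to the publisher-defined sections John is asking for.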
Re: [ol-discuss] Any way to link to a section of a Work?
Quoting John Diane Sumsion jdsums...@gmail.com: So, as an example of what I'm shooting for, the following author has already done this with his book: http://book.personalmba.com/bootstrapping/ (full of promotional material) Even though this page is full of promotional material, the link itself has value, because now I can refer to that small section of the book and talk about my own ideas, relative to that section. It doesn't look to me like this page represents a section of a book. It says: This is a preview of a concept contained in The Personal MBA by Josh Kaufman... Note the word *concept*. Is this what you intended to show? kc
Re: [ol-discuss] Cataloguing multi-volume works from archive.org
Yes, the multi-volume book problem appears a handful of times in the bug list. We've talked about various solutions, including using the Table of Contents as a way to link the volumes to the bibliographic data. It's tricky because the Archive treats them as separate items since each one results in a separate digital file. At the moment there isn't a way (AFAIK) to link an entry in the archive to a volume listed in the ToC, so I don't know of a way to connect up v. 2 of the work to the digital file for v. 2. Has anyone added multiple Archive identifiers for multi-volume works on a single edition record? There wouldn't be a one-to-one link between the individual volumes and the correct identifier, but at least they'd be on the same edition record. Anyway, this is a known problem, and possibly one for which there isn't yet a good solution. kc Quoting Steve Thomas stephen.tho...@adelaide.edu.au: Hi, I'm looking at works in archive.org, and I see that the catalogue records for multi-volume works include a separate record for each volume, typically without a volume number. E.g. http://openlibrary.org/works/OL20752W/Works_containing_additional_letters_tracts_and_poems_not_hitherto_published where it looks like there are multiple editions, when in fact they are (almost) all separate volumes of the same edition. There's nowhere in Edit that lets me add a volume number, which means that a user would have to open each volume separately to find out which volume it is. Obviously, best practice would be to have all volumes listed as items under a single title, but it doesn't look like OL is set up for that. Next best would be to include the volume number in the title, to distinguish them. Thoughts?
Steve -- Stephen Thomas, Senior Systems Analyst, Barr Smith Library UNIVERSITY OF ADELAIDE SA 5005 AUSTRALIA
Re: [ol-discuss] Series
Quoting Patrick Conley p...@conley.de: Karen, Why not use the distinction between volume (Harry Potter) and series (Great Classics)? Patrick - You're saying that Harry Potter could be considered different volumes of the same story? That makes sense. However, there are also books that are published in more than one volume, but at the same time -- that is, where the book is simply broken into different physical parts even though it's one book. (That doesn't happen much today because the actual printing process can handle very large books, but about a century ago that was more common.) We also use volume for that situation. (Formally: multi-volume monographs) The Harry Potter books are separate books, separate stories, published at different times, each of which can be read alone, but they come one after another, which is why they are a series. In fact, there are many ways that books turn up in something we call a series... So I decided to try to define them all (I probably missed some, please add):
- Some are groups of books that have something in common (often the same characters) but are not given a single name (Harry Potter books is an example -- there isn't any series name on the books themselves -- but there are also lots of these in mystery and detective books)
- Some get a name after a while (Discworld books)
- Some have a name from the beginning (Remembrance of Things Past, Great Books)
- Some are born with a planned end (Remembrance again; Great Books; the Kinsey Millhone mysteries going from A to Z)
- Some have no planned end and just grow until they stop for some reason (Harry Potter; the Stieg Larsson books; Discworld): the author loses interest, the publisher loses money, or the author dies.
- Some are books by a single author (Harry Potter, Discworld, various mystery writers). These are sometimes called author series.
- Some are books that are by different authors but have a theme of some kind (many scientific series are in this group, Vintage Classics, Literary Conversations Series, Star Wars). Sometimes these are referred to as publishers series and sometimes monographic series, although the latter tends to be a more formal unit than the former.
- Some are given numbers by the publisher, so that you can be sure you get every one of them. Sometimes these are sold as a subscription, like a journal. (University of California publications in classical philology. v. 5, no. 3). These are usually called Monographic series.
Actually, this was kind of fun. I'll be interested to see what you all can add. kc -patrick On 27.12.2010 02:20, Karen Coyle wrote: Sarah, ... It seems to me that our vocabulary is poor in this area -- it doesn't make sense that both of these are called by the same name. If you have other ideas for what to call them -- please send a suggestion. kc
Re: [ol-discuss] Titles in a series?
Aha! I didn't realize it was hidden there, but will remember that for next time. No apology needed. kc Quoting Horst Gutmann ho...@zerokspot.com: Hello Karen, sounds great :-) Apologies are in order, though: After you mentioned that there already is a series field I finally found the Librarian mode button which revealed that field :-) Best wishes, Horst On Sun, Dec 26, 2010 at 4:29 PM, Karen Coyle kco...@kcoyle.net wrote: Horst, welcome to the list and to the ongoing debate about series :-). And thanks for offering to do some work on the Star Trek books. I agree with you, that the title of the individual book should go in the title field. In particular, that will make the automatic matching of book titles more accurate. The series can go in the series field. We have talked of creating lists (http://openlibrary.org/lists) for these popular series (Star Trek, Harry Potter, etc.) as a way to bring them together for those of us who like to read a whole series (and some of us want to read them in order). We have also talked of creating a special field for these reading series -- that is, books that have something in common for readers, as opposed to the series that are like Acta Whatsits v. 27 that are more of a way that publishers and libraries keep track of things. Meanwhile, putting the series name in the series field gets us toward those goals. Thanks again, kc Quoting Horst Gutmann ho...@zerokspot.com: Hi :-) First mail to this list so: Hello, everyone :-) I have a bunch of Star Trek books lying around and noticed that so far most of them catalogued in the OL were imported automatically and so I thought I could spend some time improving the meta data there. Given that they are mostly all part of a series or sub-series I'm not quite sure, though, how that should be represented in the title structure or *if* it even should be represented at all. 
For example: http://openlibrary.org/works/OL5886066W/Star_Trek_Destiny is listed as Star Trek: Destiny as the main title, which is actually the series title. IMO the primary name of this book is Gods of Night (which also exists as a duplicate entry with exactly this name) which is Book I of the series Star Trek: Destiny. Are there perhaps some examples I could look at? Thank you :-) -- Horst
Re: [ol-discuss] Series
Sarah, I think what it comes down to is that publishers are somewhat like anarchists -- they seem to live to NOT follow any rules. There are series that are made up of new works, and there are series that gather together works previously published (like all of those Great Classics series). So I think that whatever you can say about series is true, only lots of other things are true also. This makes it hard to know where to put series. In the case of something like Harry Potter, the series is clearly inherent in the work itself -- it's about the content of the books. But for those Great Classics series, it's something that was added later by a publisher as a way to organize a group of unrelated works. It seems to me that our vocabulary is poor in this area -- it doesn't make sense that both of these are called by the same name. If you have other ideas for what to call them -- please send a suggestion. kc Quoting Sarah Breau smbr...@hotmail.com: Just a thought, as I was editing a book that had several editions AND was in a series, I was thinking that maybe there should be a series field at the work level instead of (or maybe in addition to) at the edition level. It seems to me that in most cases, the series field would be the same across all editions, and so could be captured at the work level. Maybe that's just because I haven't yet come across a book where one edition was part of a series but other editions weren't... 
Re: [ol-discuss] Author missing in rdf
I see what you mean, but can't answer your question. To add to the description of the problem, in edit mode you can see the author's name in the author box but it does not appear in the rdf. It DOES show up in the json output, and the rdf output is correct for the other books which appear to have the same data. I'll need to pass this question along to someone who can poke around more in the innards of the data and see what they can discover. Edward? Anand? kc Quoting Joke Pol joke@dans.knaw.nl: The authorlist is empty in http://openlibrary.org/books/OL24466703M.rdf It makes the rdf invalid. Sibling editions of this book do have an author in the rdf. Did I do something wrong when I created this edition? If yes, how can I repair it? I fail to see the crucial difference when I try to edit these editions. The sibling editions are: http://openlibrary.org/books/OL16981702M.rdf http://openlibrary.org/books/OL22659142M.rdf http://openlibrary.org/books/OL24247168M.rdf
Re: [ol-discuss] Metadata in author name
George, thanks, that's great. I will try out some role input, just for fun! kc Quoting George Oates g...@archive.org: Just to clarify - It's possible to attach a role to a contributor at the edition level, and the list of available roles is wiki-editable. Cheers, george On 11/24/10 10:32 PM, Karen Coyle wrote: Yes, you are absolutely right, we should also move those names to the contributor area. At the moment, I don't believe that contributor has a place for role, but that's something else that would be useful. The other two (from old catalog and the series statements) are ones that have been noted before, and the idea was to handle them algorithmically. From old catalog comes in from Library of Congress records, and the series statements in titles from Amazon. kc Quoting Alan Millar amillar...@gmail.com: On Wed, Nov 24, 2010 at 10:05 AM, Karen Coyle kco...@kcoyle.net wrote: It might be necessary to drop them out of the Amazon data gathering, although it would be a shame because they also contribute some of the long tail books to the database. I wonder if it wouldn't at least be possible to drop all of the instances of (translator) (case insensitive) from the author strings and see how much that clears these up. (I also saw a few cases of [translator] and there may be other patterns as well.) Personally, I don't think we should automate dropping them; it is good metadata. Rather, I think we should automate moving it into the additional people list. The trick will be coming up with some judicious pattern matching smarts. (But here is another fun one that probably should be just dropped: http://openlibrary.org/search/authors?q=from+old+catalog :-) I see quite a few cases where useful metadata could be moved from one field to another. Things such as book titles with series or edition suffixes like (Great Classics Series) or http://openlibrary.org/search?q=large+print+edition etc. These follow fairly regular patterns, so it could be automated with supervision.
I'd like to automate some of that myself, but I haven't come across any references to bulk update tools for users. I've downloaded the dumps and grep'ed through them as information for author merges, but I haven't seen any way for me to do the actual updates besides a real browser. The API docs indicate they are read-only for remote users. Anyone have any techniques they are using currently for mass updates? - Alan
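The pattern matching Alan and Karen discuss, moving a trailing (translator) or [translator] out of the author string rather than dropping it, could start from something like the sketch below. The role list, the function name, and the returned tuple shape are all assumptions made for illustration; nothing here writes to Open Library, it only splits a string so a person can review the result.

```python
import re

# Trailing "(translator)" or "[translator]" style suffix, case-insensitive.
# The role list is a hypothetical starting point, not Open Library's own list.
ROLE_SUFFIX = re.compile(
    r"\s*[\(\[]\s*(translator|editor|illustrator)\s*[\)\]]\s*$",
    re.IGNORECASE,
)


def split_author_role(raw_name):
    """Split 'Jane Doe (Translator)' into ('Jane Doe', 'translator').

    Returns (name, None) when no known role suffix is present, so the
    record can stay in the author field untouched.
    """
    match = ROLE_SUFFIX.search(raw_name)
    if match is None:
        return raw_name.strip(), None
    return raw_name[: match.start()].strip(), match.group(1).lower()
```

Keeping the role as a separate value, instead of deleting the suffix, follows Alan's point that it is good metadata that belongs with the contributor rather than in the author name.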
Re: [ol-discuss] Series titles: include individual ID or not?
Quoting Tom Morris tfmor...@gmail.com:

I thought the whole point of having fields in a database record was to avoid having to do string parsing, with all its problems, to recover your original data.

Tom, you are right that the ideal is to have a separate element for each separate bit of data. The OL began with a simple bibliographic description for various reasons. Having worked in the library field where the data can be very complex but can only be input by persons with considerable training, it is interesting to attempt to marry that level of complexity with a data model that can be grasped immediately by enthusiastic volunteers with no specific training in the area. As such, the OL has gone back and merged some data that was once kept apart (e.g. non-sorting beginnings to titles, called non-filing characters in MARC), and has separated out data elements that were initially combined (e.g. tables of contents with numbering and pagination). This series issue is one worth considering for modification.

A clever aspect of the OL's design is that changes can be made to the data elements mid-life, as it were, without disrupting the database itself. At this point, adding a series volume number to the series field would allow those elements to be input going forward, and some programming could be used to catch up earlier series entries that do have volume numbers. That process would not be perfect, of course, but ... imperfection is just a fact of life when you have many sources of data like the OL has. The saving grace is the wiki-nature, which allows changes to take place while all versions remain part of the record.

kc

If multiple fields are going to be munged together into a single string, what are the escaping rules for delimiters contained in the original strings? What are the parsing rules?

Tom

kc

Quoting Roger Loran Bailey rogerbaile...@aol.com:

Well, there are standards and there are common usages. Most of us know, for example, what standard English is, but very few people actually talk that way. Standards of library cataloging are a bit more obscure though. If we have a professional librarian here I suppose we can get an answer to what the standard is. As for common usage, though, the most common catalog entries that I see place the number of the book in a series with the name of the series. In fact, examples of it being done differently do not come to mind right now. That is why I suspect that to be the standard, but I cannot be sure. The authorities who make up the standards can make up some pretty obscure ones sometimes. As for myself, I would place the number with the series title until someone who has the credentials says otherwise.

_ _ _
Freedom is always and exclusively freedom for the one who thinks differently. - Rosa Luxemburg
The Militant: http://www.themilitant.com
Pathfinder Press: http://www.pathfinderpress.com
Granma International: http://www.granma.cu/ingles/index.html

- Original Message -
From: Alan Millar amillar...@gmail.com
To: Open Library -- general discussion ol-discuss@archive.org
Sent: Tuesday, October 12, 2010 4:11 PM
Subject: Re: [ol-discuss] Series titles: include individual ID or not?

On Tue, Oct 12, 2010 at 12:35 PM, Roger Loran Bailey rogerbaile...@aol.com wrote:

I think I would add the series number. There doesn't seem to be much point in identifying a book as being in a series if there is no indication of which one in the series it is.

Yes, certainly, we want the number identifying which one in the series it is. To expand or clarify, then, I guess my specific question is whether that should go in the series name field, or in another field such as the subtitle. I know data gets conflated when translated between different databases, but I don't know what is considered the standard or proper way of describing the series collective and individual data (if there is such a thing). Thanks for the feedback.

- Alan

-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
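For readers following the thread, here is a rough sketch of the kind of "catch up" programming mentioned above: pulling a trailing volume number out of an existing combined series string. Everything in it is an assumption for illustration. The function name `split_series` and the volume markers it recognizes ("v.", "vol.", "no.", "bk.", "book", "#") are hypothetical and are not taken from Open Library's code or data rules. It also shows the delimiter problem Tom raises: a series name that happens to end in one of these markers followed by digits would be split wrongly, so any such pass would need review.

```python
import re
from typing import Optional, Tuple

# Hypothetical trailing volume markers, e.g. "; v. 12", ", no. 3", "#7".
# These are illustrative guesses, not Open Library conventions.
_VOLUME_RE = re.compile(
    r"""^(?P<name>.*?)                          # series name (non-greedy)
        \s*[;,]?\s*                             # optional delimiter
        (?:\b(?:v|vol|no|bk)\.|\bbook\b|\#)\s*  # volume marker
        (?P<volume>\d+)\s*$                     # the number itself
    """,
    re.IGNORECASE | re.VERBOSE,
)


def split_series(raw: str) -> Tuple[str, Optional[str]]:
    """Return (series_name, volume); volume is None when no marker is found."""
    text = raw.strip()
    match = _VOLUME_RE.match(text)
    # Leave the string untouched if nothing matched or the name would be empty.
    if match is None or not match.group("name"):
        return text, None
    return match.group("name").strip(), match.group("volume")


if __name__ == "__main__":
    for entry in ["Harvard classics ; v. 12", "Discworld #7", "Penguin classics"]:
        print(entry, "->", split_series(entry))
```

Entries without a recognizable marker are returned unchanged with no volume, which keeps the pass conservative: it only separates the number when the pattern is unambiguous under the assumptions above.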
Re: [ol-discuss] universal citation index
-- Samuel Klein identi.ca:sj w:user:sj
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet
Re: [ol-discuss] Analysing LCSH headings
-- Karen Coyle kco...@kcoyle.net http://kcoyle.net ph: 1-510-540-7596 m: 1-510-435-8234 skype: kcoylenet