Re: [Foundation-l] Universal Library
You two seem to be talking past each other. Might I suggest that the quality of information on OPL and/or Wikipedia/Wikisource sites is rather different depending on whether you are reading in French or English? I don't know if this is the case, but it could explain the discrepancies between your experiences.

Birgitte SB

--- On Thu, 9/3/09, David Goodman wrote:

> From: David Goodman
> Subject: Re: [Foundation-l] Universal Library
> To: "Wikimedia Foundation Mailing List"
> Date: Thursday, September 3, 2009, 2:19 PM
>
> [...]
Re: [Foundation-l] Universal Library
I have been re-reading their documentation, and they have it well in hand. We would do very well to confine ourselves to matching up the entries in the WMF projects alone. Some of the data in WMF is more accurate than some of the OL data, but I would not say this to be a general rule. Far from it: the proportion of incomplete or inaccurate entries in enWP is probably well over 50% for books. (For journal articles it is better, because of a project to link to the PubMed information.) The accuracy and adequacy -- let alone completeness -- of the bibliographic information in WS is close to zero, except where there is an IA scan of the cover and title page, from which full bibliographic information might be derived, but cannot necessarily be taken at face value.

The unification of editions is non-trivial: using the algorithm you suggest, you will also get all works related to Verne, and additionally a combination of general and partial translations, children's books, comic adaptations, and whatever. Modern library metadata provides for this to a certain limited extent. Unfortunately, most of the entries in current online catalogs do not show full modern data, and many catalogs never had more than minimal records. Dublin Core is probably not generally considered to be fully up to the problem either, at least in any current implementation.

Those working on the OL side are fully aware of this. They have made the decision to work towards inclusion of all usable and obtainable data sets, rather than only the ones that can be immediately fully harmonized. This was a very wise decision, as the way in which the information is to be combined and related is not fully developed, and, if they were to wait for that, nothing would be entered. There will therefore be the problem of upgrading the records and the record structure in place -- a problem that no large bibliographic system has ever fully handled properly -- not that this incarnation of OL is likely to either. Bibliographers work for their time, not for all time to come.

David Goodman, Ph.D, M.L.S.
http://en.wikipedia.org/wiki/User_talk:DGG

On Thu, Sep 3, 2009 at 6:38 AM, Yann Forget wrote:
> [...]
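The edition-unification problem raised in this message can be illustrated with a small sketch. It is not Open Library's algorithm or the one in the proposal; the records, field names, and the `normalize` and `group_editions` helpers are all hypothetical, made up only to show how a naive grouping key behaves.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def group_editions(records, key_fields):
    """Group records by the normalized values of the given fields."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(normalize(rec[field]) for field in key_fields)
        groups[key].append(rec)
    return dict(groups)


# Hypothetical catalog records, invented for illustration only.
RECORDS = [
    {"author": "Jules Verne", "title": "Vingt mille lieues sous les mers",
     "kind": "original"},
    {"author": "Jules Verne", "title": "Twenty Thousand Leagues Under the Sea",
     "kind": "complete translation"},
    {"author": "Jules Verne", "title": "Twenty Thousand Leagues under the Sea!",
     "kind": "abridged children's edition"},
    {"author": "Jules Verne", "title": "Twenty Thousand Leagues Under the Sea",
     "kind": "comic adaptation"},
]

# Grouping by author alone lumps everything related to Verne together.
by_author = group_editions(RECORDS, ["author"])

# Grouping by author and title separates the original from its translation,
# yet still merges a complete translation, an abridgement, and a comic.
by_title = group_editions(RECORDS, ["author", "title"])

for key, recs in by_title.items():
    print(key, [r["kind"] for r in recs])
```

Neither key is right: the first merges too much, the second both splits one work across languages and merges items a librarian would keep apart. That gap is where the bibliographic judgment discussed above comes in.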
Re: [Foundation-l] Universal Library
David Goodman wrote:
> I have read your proposal. I continue to be of the opinion that we are
> not competent to do this. Since the proposal says, that "this project
> requires as much database management knowledge as librarian
> knowledge," it confirms my opinion. You will never merge the data
> properly if you do not understand it.

That is the whole point: it needs to be a joint project, database gurus with librarians. What I see is that OpenLibrary lacks some basic features that Wikimedia projects have had for a long time (on an Internet scale): easy redirects, interwikis, merges, a deletion process, etc. Some of these are planned for the next version of their software, but I still feel that sometimes they try to reinvent the wheel we already have.

OL claims to have 23 million book and author entries. However, many entries are duplicates of the same edition, not to mention the same book, so the real number of unique entries is much lower. I also see that Wikisource has data which are not included in their database (and certainly Wikipedia does too, but I didn't really check).

> You suggest 3 practical steps
> 1. an extension for finding a book in OL is certainly doable--and it
> has been done, see
> [http://en.wikipedia.org/wiki/Wikipedia:Book_sources].
> 2. an OL field, link to WP -- as you say, this is already present.
> 3. An OL field, link to Wikisource. A very good project. It will be
> they who need to do it.

Yes, but I think we should go further than that. OpenLibrary has an API which would allow any relevant wiki article to be dynamically linked to their data, or an entry to be created every time new relevant data is added to a Wikipedia project. This is all about avoiding duplicate work between Wikimedia and OpenLibrary. It could also increase accuracy by double-checking facts (dates, name and title spelling, etc.) between our projects.

> Agreed we need translation information--I think this is a very
> important priority. It's not that hard to do a list or to add links
> that will be helpful, though not exact enough to be relied on in
> further work. That's probably a reasonable project, but it is very
> far from "a database of all books ever published"
>
> But some of this is being done--see the frWP page for Moby Dick:
> http://fr.wikipedia.org/wiki/Moby_Dick
> (though it omits a number of the translations listed in the French Union
> Catalog,
> http://corail.sudoc.abes.fr/xslt/DB=2.1/CMD?ACT=SRCHA&IKT=8063&SRT=RLV&TRM=Moby+Dick)
> I would however not warrant without seeing the items in hand, or
> reading an authoritative review, that they are all complete
> translations.
> The English page on the novel lists no translations; perhaps we could
> in practice assume that the interwiki links are sufficient. Perhaps
> that could be assumed in Wikisource also?

That's another possible benefit: an automatic list of works/editions/translations in a Wikipedia article. You could add {{OpenLibrary|author=Jules Verne|lang=English}} and you would have a list of English translations of Jules Verne's works directly imported from their database. The problem is that, right now, Wikimedia projects often have more accurate and more detailed information than OpenLibrary.

Regards,
Yann
--
http://www.non-violence.org/ | Collaborative site on non-violence
http://www.forget-me.net/ | Alternatives on the Net
http://fr.wikisource.org/ | Free library
http://wikilivres.info | Free documents
___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
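The double-checking of facts between projects suggested in this message could look roughly like the sketch below. It makes no call to the Open Library API and assumes nothing about how either site stores its data: the `cross_check` helper, the field names, and both sample records are hypothetical and hand-written for illustration.

```python
def cross_check(wiki_record, ol_record, fields=("title", "author", "year")):
    """Return {field: (wiki_value, ol_value)} for fields where the sources disagree.

    Fields missing from either record are skipped, because a gap is not
    a contradiction.
    """
    mismatches = {}
    for field in fields:
        wiki_value = wiki_record.get(field)
        ol_value = ol_record.get(field)
        if wiki_value is None or ol_value is None:
            continue
        # Compare case-insensitively and ignore surrounding whitespace.
        if str(wiki_value).strip().casefold() != str(ol_value).strip().casefold():
            mismatches[field] = (wiki_value, ol_value)
    return mismatches


# Hypothetical records standing in for one Wikimedia entry and one OL entry.
wiki_entry = {"title": "Moby-Dick", "author": "Herman Melville", "year": 1851}
ol_entry = {"title": "Moby Dick", "author": "Herman Melville", "year": 1851}

differences = cross_check(wiki_entry, ol_entry)
print(differences)
```

A report like this only flags disagreements. Deciding which side is right still needs a person looking at the source, which is the concern raised earlier in the thread.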
Re: [Foundation-l] Universal Library
I have read your proposal. I continue to be of the opinion that we are not competent to do this. Since the proposal says that "this project requires as much database management knowledge as librarian knowledge," it confirms my opinion. You will never merge the data properly if you do not understand it.

You suggest 3 practical steps:
1. An extension for finding a book in OL is certainly doable--and it has been done, see [http://en.wikipedia.org/wiki/Wikipedia:Book_sources].
2. An OL field, link to WP -- as you say, this is already present.
3. An OL field, link to Wikisource. A very good project. It will be they who need to do it.

Agreed we need translation information--I think this is a very important priority. It's not that hard to do a list or to add links that will be helpful, though not exact enough to be relied on in further work. That's probably a reasonable project, but it is very far from "a database of all books ever published".

But some of this is being done--see the frWP page for Moby Dick:
http://fr.wikipedia.org/wiki/Moby_Dick
(though it omits a number of the translations listed in the French Union Catalog,
http://corail.sudoc.abes.fr/xslt/DB=2.1/CMD?ACT=SRCHA&IKT=8063&SRT=RLV&TRM=Moby+Dick).
I would however not warrant, without seeing the items in hand or reading an authoritative review, that they are all complete translations. The English page on the novel lists no translations; perhaps we could in practice assume that the interwiki links are sufficient. Perhaps that could be assumed in Wikisource also?

David Goodman, Ph.D, M.L.S.
http://en.wikipedia.org/wiki/User_talk:DGG

On Wed, Sep 2, 2009 at 1:17 PM, Yann Forget wrote:
> [...]
Re: [Foundation-l] Universal Library
Hello,

I have already answered some of these arguments earlier.

David Goodman wrote:
> Not only can the OpenLibrary do it perfect well without us.
> considering our rather inconsistent standards, they can probably do it
> better without us. We will just get in the way.

The issue is not whether OpenLibrary is "doing it perfect well without us", even if that were true. Currently what OpenLibrary does is not very useful for Wikimedia, and partly duplicates what we do. Wikimedia also has important assets which OL doesn't have, and therefore a collaboration seems obviously beneficial for both.

> There is sufficient missing material in every Wikipedia, sufficient
> lack of coverage of areas outside the primary language zone and in
> earlier periods, sufficient unsourced material; sufficient need for
> updating articles, sufficient potentially free media to add,
> sufficient needed imagery to get; that we have more than enough work
> for all the volunteers we are likely to get.
>
> To duplicate an existing project is particularly unproductive when the
> other project is doing it better than we are ever going to be able to.
> Yes, there are people here who could do it or learn to do it--but I
> think everyone here with that degree of bibliographic knowledge would
> be much better occupied in sourcing articles.

It is clear that you didn't even read my proposal. Please do so before raising objections.
http://strategy.wikimedia.org/wiki/Proposal:Building_a_database_of_all_books_ever_published

I specifically wrote that my proposal is not necessarily about starting a new project. I agree that working with Open Library is necessary for such a project, but I also say that if Wikimedia got involved, it would be much more successful.

What you say here is the complete opposite of how Wikimedia projects work, i.e. openness, and that's just what is missing in Open Library.

Regards,
Yann
Re: [Foundation-l] Universal Library
Lars Aronsson wrote:
> Yann Forget wrote:
>
>> I started a proposal on the Strategy Wiki:
>> http://strategy.wikimedia.org/wiki/Proposal:Building_a_database_of_all_books_ever_published
>>
>> IMO this should be a join project between Openlibrary and Wikimedia.
>
> Again, I don't understand why. What exactly is missing in
> OpenLibrary? Why does it need to be a new, joint project?
>
> The page says "There is currently no database of all books ever
> published freely available." But OpenLibrary is a project already
> working towards exactly that goal. It's not done yet, and its
> methods are not yet fully developed. But neither would your new
> "joint" project be, for a very long time.
>
> Wikipedia is also far from complete, far from containing "the sum
> of all human knowledge". But that doesn't create a need to start
> entirely new encyclopedia projects. It only means more
> contributors are needed in the existing Wikipedia.

You are just repeating the same arguments, which I have already answered. Did you read my answer?

Regards,
Yann
Re: [Foundation-l] Universal Library
Not only can the OpenLibrary do it perfectly well without us; considering our rather inconsistent standards, they can probably do it better without us. We will just get in the way.

There is sufficient missing material in every Wikipedia, sufficient lack of coverage of areas outside the primary language zone and in earlier periods, sufficient unsourced material, sufficient need for updating articles, sufficient potentially free media to add, and sufficient needed imagery to get, that we have more than enough work for all the volunteers we are likely to get.

To duplicate an existing project is particularly unproductive when the other project is doing it better than we are ever going to be able to. Yes, there are people here who could do it or learn to do it--but I think everyone here with that degree of bibliographic knowledge would be much better occupied in sourcing articles.

David Goodman, Ph.D, M.L.S.
http://en.wikipedia.org/wiki/User_talk:DGG

On Wed, Sep 2, 2009 at 2:21 AM, Lars Aronsson wrote:
> [...]
Re: [Foundation-l] Universal Library
Yann Forget wrote:
> I started a proposal on the Strategy Wiki:
> http://strategy.wikimedia.org/wiki/Proposal:Building_a_database_of_all_books_ever_published
>
> IMO this should be a join project between Openlibrary and Wikimedia.

Again, I don't understand why. What exactly is missing in OpenLibrary? Why does it need to be a new, joint project?

The page says "There is currently no database of all books ever published freely available." But OpenLibrary is a project already working towards exactly that goal. It's not done yet, and its methods are not yet fully developed. But neither would your new "joint" project be, for a very long time.

Wikipedia is also far from complete, far from containing "the sum of all human knowledge". But that doesn't create a need to start entirely new encyclopedia projects. It only means more contributors are needed in the existing Wikipedia.

--
Lars Aronsson (l...@aronsson.se)
Aronsson Datateknik - http://aronsson.se