Re: [CODE4LIB] Something completely different
Ross, I'm not questioning the technical assertion -- obviously you can combine properties from different vocabularies. My problem is with making sense of FRBR in relation to the properties, either in RDA or in bibo. Do you say that a particular grouping of properties is of type FRBR:Manifestation, or is the property defined in the vocabulary as in the Manifestation domain? RDA does the latter (although not in a semantic web way). Each data element in RDA belongs to a particular FRBR entity, so you never actually use the FRBR entities in your metadata. (Although the examples that Alistair Miles did [1] use the levels as part of the record organization.) I actually prefer the usage that I gave in my examples, in which relationships carry the FRBR meaning and bibliographic properties can be used at any level. The schema in the registry is completely flat partly because of the choice made by RDA to include the FRBR levels in the data elements themselves. The other 'partly' is because the creators of RDA are still pretty much thinking in terms of traditional bibliographic data, ISBD and MARC.

kc

[1] Linked from each scenario at http://dublincore.org/dcmirdataskgroup/Scenarios

Ross Singer wrote: Right, ok, so an RDF graph can say the same resource is multiple things at the same time, so that's how you deal with this:

    <http://lccn.loc.gov/95100870> rdf:type bibo:Book .
    <http://lccn.loc.gov/95100870> dc:title "Doctor Zhivago"@en .
    <http://lccn.loc.gov/95100870> dc:creator <http://www.worldcat.org/identities/lccn-n79-18438> .
    <http://lccn.loc.gov/95100870> rda:uniformTitle "Doktor Zhivago. English" .
    <http://lccn.loc.gov/95100870> rdf:type rda:EditionStatement .
    <http://lccn.loc.gov/95100870> rdf:type frbr:Manifestation .
    <http://lccn.loc.gov/95100870> frbr:embodimentOf <http://dbpedia.org/resource/Doctor_Zhivago> .

I'm guessing on the RDA assertions, because the schema in the metadataregistry doesn't make much sense to me.
Anyway, this shows how you can say multiple things from different vocabularies for one resource. -Ross. On Mon, Apr 6, 2009 at 8:10 PM, Karen Coyle li...@kcoyle.net wrote: Jonathan Rochkind wrote: I'm curious why you think that doesn't work? Isn't place of publication a characteristic of a particular manifestation? While, title, according to traditional library practices where you take it from the title page, is also a characteristic of a particular manifestation, is it not? (uniform title is _usually_ a characteristic of a work, unless we get into music cataloging and some other 'edge' cases. Our traditional practices -- which aren't actually changed that much by RDA, are rather confusing.) Well, I was responding to Ross' statement that bibo and FRBR could be used in combination, depending on whether one was at that moment describing 'bibliographic things' or 'work things'. bibo doesn't have a uniform title, so the question is: can you use a bibo title and say that it is a work title? I thought that Ross was indicating something of that nature -- that you could have a FRBR 'work thing' with bibo properties. I'm trying to understand how that works since Work is a class. Don't you have to indicate the domain and range of a property in its definition? RDA tries to solve this by creating different properties for every concept+FRBR entity: title of the work (Work), title proper (Manifestation). [I don't understand why expressions don't have titles a translation is an expression, after all.] I am confused about what one would do about the fact that RDA defines attributes a bit different than FRBR itself does. It's not too surprising -- FRBR is really just a draft, hardly tested in the world. When RDA tried to make it a bit more concrete, it's not surprising that they found they had to make changes to make it workable. Not sure what to do about that in the grand scheme of things, if RDA and FRBR both end up registering different vocabularies. 
I guess we'll just have two different vocabularies though, which isn't too shocking I guess. I'm not sure there's anything to do, but I do know that the developers of RDA feel very strongly that in RDA they have 'implemented' FRBR, so we have to find a way to integrate FRBR and RDA in the registered RDA vocabulary. I agree that there's no problem with having RDA and FRBR as two different vocabularies, it's the effort of bringing them together that boggles me. I feel like it leaves a lot of loose ends. I'd be happy to see FRBR revised, or to have it re-defined without the attributes, thus allowing metadata developers to use the bibliographic relationship properties with any set of descriptive elements. I'm having trouble with the FRBR Group 1 entities as classes. I see them instead as relationships, and vocab.org does seem to treat them as relationships, not as 'things.' I see a distinct difference between a person entity and a work entity, because there is no thing that is a work. I see work as a relationship
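A minimal Python sketch of the technical point in Ross's example above: one resource can carry several rdf:type assertions and properties drawn from different vocabularies at the same time. Triples are modeled here as plain tuples rather than with an RDF library, the prefixed names are just shorthand strings, and the helper names `describe` and `types_of` are hypothetical.

```python
from collections import defaultdict

BOOK = "http://lccn.loc.gov/95100870"

# A subset of the triples from the example, as (subject, predicate, object).
triples = [
    (BOOK, "rdf:type", "bibo:Book"),
    (BOOK, "dc:title", "Doctor Zhivago"),
    (BOOK, "rdf:type", "frbr:Manifestation"),
    (BOOK, "frbr:embodimentOf", "http://dbpedia.org/resource/Doctor_Zhivago"),
]


def describe(subject, graph):
    """Group every predicate/object pair asserted about one subject."""
    description = defaultdict(list)
    for s, p, o in graph:
        if s == subject:
            description[p].append(o)
    return dict(description)


def types_of(subject, graph):
    """Return all classes the subject is asserted to belong to, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == "rdf:type")


print(types_of(BOOK, triples))
```

Nothing in the data structure forces a resource to pick a single vocabulary or a single class. The open question in the thread is what such a combination means, not whether it can be written down.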
Re: [CODE4LIB] Something completely different
So, thanks to the help of my coworkers, here's the RDA Elements schema reformatted in an easier-to-read presentation: http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdf&input=&output=exhibit&callback=

I have to say I feel like this schema is trying to do way too much and, as a result, loses the resource specificity that RDF would be providing. For one thing, it seems to reinvent a _lot_ of wheels. Why does it define its own title property instead of using DC's? By using properties like titleOfTheWork, dateOfWork and all of the properties that are specifically about TheSeries there is tremendous duplication of text. If Work were its own class, you would only need to say that this manifestation was an embodimentOf it and reuse all of the title-based properties for manifestation. The series-specific property names seem redundant as well, since isn't SeriesStatement defining a series? Why do you need titleProperOfSeries if you already have titleProper?

What does property 'uri' mean?

I also can't figure out how people/institutions are modeled in this schema, since none of the elements have ranges. Are they their own resources? If so, what? The way it looks at a glance, they're strings?

There are also different properties for dimensions, dimensionsOfMap, dimensionsOfStillImage, etc. Why is there any need for anything more than 'dimensions'? This is redefining what the resource 'is' in multiple places, but the fact that this is a still image is stated somewhere else, right? If so, isn't it self-evident that the dimensions are of a still image?

It seems to me that very little work was done to find preexisting vocabularies to reuse, and this schema still presents a very 'document-centric' or 'record-centric' view of data.

-Ross.

On Tue, Apr 7, 2009 at 9:39 AM, Karen Coyle li...@kcoyle.net wrote: Ross, I'm not questioning the technical assertion -- obviously you can combine properties from different vocabularies.
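To make Ross's duplication argument concrete, here is a small Python sketch contrasting the two modeling styles. It is an illustration under stated assumptions, not the RDA schema itself: the `ex:` identifiers, the dictionary layout, and the helper `work_title` are all hypothetical, and the manifestation is linked straight to the work (skipping the expression level), following the usage in this thread.

```python
# Flat style: the FRBR level is baked into each property name,
# so every concept needs one property per entity.
flat_record = {
    "titleOfTheWork": "Doktor Zhivago",
    "titleProper": "Doctor Zhivago",
}

# Class-based style: one generic title property is reused, and the FRBR
# level is carried by rdf:type plus the embodimentOf relationship.
resources = {
    "ex:work1": {
        "rdf:type": "frbr:Work",
        "dc:title": "Doktor Zhivago",
    },
    "ex:manifestation1": {
        "rdf:type": "frbr:Manifestation",
        "dc:title": "Doctor Zhivago",
        "frbr:embodimentOf": "ex:work1",
    },
}


def work_title(manifestation_id, graph):
    """Follow embodimentOf from a manifestation and read the work's title."""
    work_id = graph[manifestation_id]["frbr:embodimentOf"]
    return graph[work_id]["dc:title"]


print(work_title("ex:manifestation1", resources))
```

In the second style, adding a series or an expression means adding another typed resource and a relationship, not another family of title properties.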
Re: [CODE4LIB] Something completely different
On Sun, Apr 5, 2009 at 10:40 AM, Peter Schlumpf pschlu...@earthlink.net wrote: I want to get back to simple things. Imagine if there were no Marc records. Minimal layers of abstraction. No politics. No vendors. No SQL straightjacket. What would an ILS look like without those things? Back to this original question, when I imagine these things, I imagine building an ILS that relies on an unusual data persistence backend, discounts industry-standard data formats, and explicitly ignores the political concerns of adopting, deploying, and maintaining it. And I get a little bit nervous. For what it's worth (and I think this touches on the ontological discussion in this thread, too) -- my experience has been that it's easier to build a piece of software that solves a problem compellingly, solving technical hurdles as you need to than it is to come up with solutions to anticipated technical problems before starting on making a product. More concretely: if you build a software product, I don't care at all whether it's based on a SQL straitjacket or a luscious RDF comforter. I care if it solves a problem well, and that I can install it and run it easily. Cheers, -Nate
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Ross Singer wrote:

> So, thanks to the help of my coworkers, here's the RDA Elements schema reformatted in an easier to read presentation: http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdf&input=&output=exhibit&callback=
> I have to say I feel like this schema is trying to both do way too much and subsequently loses the resource specificity that RDF would be providing.

Absolutely. I think there's a real issue that NO technology folks were involved in the creation of RDA. So this is data from a cataloger's perspective, and from the perspective of guidance rules for creating bibliographic data. I'm pretty sure that we can't create a viable data record using the RDA data elements, and I hate the idea that the data format, once again, is an afterthought rather than integral to the data creation standard.

> For one thing, it seems to reinvent a _lot_ of wheels. Why does it define its own title property instead of using DC's?

Because they wanted their own definition. Everything in the RDA element list has an RDA-specific meaning, which then makes it impossible to use any existing data properties. But there's more: RDA was defining RDA cataloging rules, not a schema or record format. Not only are there multiple data elements where one could do, there are things that are missing. For example, the FRBR place entity can ONLY be used as a subject, so it really means place as subject. There's no general place element that could be used, for example, in place of publication. The latter has no relationship to FRBR place. This is a FRBR problem as much as an RDA problem, but again FRBR functions at a conceptual level and doesn't really provide a schema that one can work with.

> By using properties like titleOfTheWork, dateOfWork and all of the properties that are specifically about TheSeries there is tremendous duplication of text. If Work was its own class, you would only need say that this manifestation was an embodimentOf of it and reuse all of the title-based properties for manifestation.

Exactly. This is what I've been saying (or trying to say) in relation to the bibo discussion. You should be able to use whatever properties you want with the FRBR classes, and not restrict data elements to a single class. This is a big problem in RDA, but I can say that when it was brought up to them (JSC) they strongly defended this choice and would not budge. RDA, to JSC, has a specific relationship to FRBR, and if you use a data element with a different FRBR class, then you are no longer doing RDA.

> What does property 'uri' mean?

Did you look at the rdf/xml? I'm wondering if it isn't the display that's confusing.

> I also can't figure out how people/institutions are modeled in this schema, since none of the elements have ranges. Are they their own resources? If so, what? The way it looks at a glance, they're strings?

EVERYTHING is strings at the moment, with a very very few exceptions (like some dates, I think). Some data elements CAN use a controlled vocabulary, but I believe that all of those are a mixture of uncontrolled and controlled strings. People and institutions are mainly undefined because that is in the FRAD realm. And FRAD hasn't been finalized. Also note that the JSC didn't feel it could do anything that would be too incompatible with the 'legacy' -- that is, with all of our AACR/MARC data.

> It seems to me that very little work was done find preexisting vocabularies to reuse and this schema still presents a very 'document-centric' or 'record-centric' view of data.

Absolutely. The catalogers are still creating a textual document, not data. At best you can mark up the text, as we do with the MARC record. I worry that we won't be able to mesh the cataloger's view with a data view -- that the two are somehow inherently opposed. I'd like to start modeling a new data format but I can't imagine how we can bridge the gap between the catalogers and the system view. I suppose a very clever interface could hide the data view from the catalogers, but starting from either AACR2 or RDA and trying to get there feels extremely difficult. I guess my fear is that it will require compromises, and those will be hard to negotiate.

kc

p.s. The RDA element analysis is at http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf. That was the input to the registry.

--
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596 skype: kcoylenet
fx.: 510-848-3913 mo.: 510-435-8234
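The exchange above about ranges and "EVERYTHING is strings" can be illustrated with a short Python sketch. It assumes a hypothetical pair of properties (`ex:creator`, `ex:title`) with declared ranges, and it treats a range as a validation rule, which is a simplification: RDFS uses ranges for inference, not for rejecting data. The class names `Resource` and `Literal` and the helper `conforms` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Something identified by a URI that can have its own description."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain string value with no identity of its own."""
    text: str


# Hypothetical range declarations: a creator should be a resource,
# a title may stay a string.
RANGES = {
    "ex:creator": Resource,
    "ex:title": Literal,
}


def conforms(prop, value):
    """True if the value matches the declared range, or none is declared."""
    expected = RANGES.get(prop)
    return expected is None or isinstance(value, expected)


as_string = Literal("Pasternak, Boris")
as_resource = Resource("http://www.worldcat.org/identities/lccn-n79-18438")

print(conforms("ex:creator", as_string))    # creator given as a bare string
print(conforms("ex:creator", as_resource))  # creator given as a resource
```

With no ranges declared at all, every value passes, which is one way to read Ross's complaint that he cannot tell whether people and institutions are resources or strings.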
Re: [CODE4LIB] RDA in RDF, was: Something completely different
See also the thread, 'RDA: A Standard Nobody Will Notice'. http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html A standard nobody will notice ... for good reason. Rob On Tue, 2009-04-07 at 18:24 +0100, Eric Lease Morgan wrote: On Apr 7, 2009, at 1:15 PM, Karen Coyle wrote: Absolutely. The catalogers are still creating a textual document, not data. At best you can mark up the text, as we do with the MARC record... Listen... What you hear from over here is the sound of a very heavy sigh coming from a computer type who really wants to help improve the way library data is used in a networked environment, but they can't convince their own to modify the way they encode information.
Re: [CODE4LIB] RDA in RDF, was: Something completely different
On Tue, Apr 7, 2009 at 1:24 PM, Eric Lease Morgan emor...@nd.edu wrote: Listen... What you hear from over here is the sound of a very heavy sigh coming from a computer type who really wants to help improve the way library data is used in a networked environment, but they can't convince their own to modify the way they encode information. See also Fiander, David J. Applying XML to the Bibliographic Description. Cataloging and Classification Quarterly 33, no. 2 (2001): 17-28. Fiander, David J., and D. Grant Campbell. An XML Definition for an ISBD-Based Encoding Scheme. Journal of Internet Cataloging 6, no. 4 (2003): 29-58. Which is what happens when a computer type starts de novo with the cataloguing standards and builds simple data structures.
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Karen, thanks for this summary of the process. It's pretty disheartening, sadly.

I got 'uri' wrong, btw; it's 'Uniform Resource Locator':

    <!--Property: Uniform resource locator-->
    <rdf:Property rdf:about="http://RDVocab.info/Elements/uniformResourceLocator">
      <rdfs:label xml:lang="en">Uniform resource locator</rdfs:label>
      <skos:definition xml:lang="en">The address of a remote access resource.</skos:definition>
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements/" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002/" />
    </rdf:Property>

But again, not exactly the best use of the tools at their disposal.

All this being said, it's really not too late to fix any of this, since nobody is implementing this and, realistically, nobody ever will.

-Ross.

On Tue, Apr 7, 2009 at 1:15 PM, Karen Coyle li...@kcoyle.net wrote: Ross Singer wrote: So, thanks to the help of my coworkers, here's the RDA Elements schema reformatted in an easier to read presentation: http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdf&input=&output=exhibit&callback= I have to say I feel like this schema is trying to both do way too much and subsequently loses the resource specificity that RDF would be providing. Absolutely. I think there's a real issue that NO technology folks were involved in the creation of RDA. So this is data from a cataloger's perspective, and from the perspective of guidance rules for creating bibliographic data. I'm pretty sure that we can't create a viable data record using the RDA data elements, and I hate the idea that the data format, once again, is an afterthought rather than integral to the data creation standard. For one thing, it seems to reinvent a _lot_ of wheels. Why does it define its own title property instead of using DC's? Because they wanted their own definition. Everything in the RDA element list has an RDA-specific meaning, which then makes it impossible to use any existing data properties.
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Roy,

That's true. Unfortunately, I missed Kevin's talk at Access '02 in Windsor, and since I wrote the first of those two papers I've mostly been out of the loop, since it's not my area any more.

- David

On Tue, Apr 7, 2009 at 1:48 PM, Roy Tennant tenna...@oclc.org wrote: Well, and then you have the XOBIS work from Stanford that ksclarke was involved with. Roy

On 4/7/09 10:41 AM, David Fiander da...@fiander.info wrote: On Tue, Apr 7, 2009 at 1:24 PM, Eric Lease Morgan emor...@nd.edu wrote: Listen... What you hear from over here is the sound of a very heavy sigh coming from a computer type who really wants to help improve the way library data is used in a networked environment, but they can't convince their own to modify the way they encode information. See also Fiander, David J. Applying XML to the Bibliographic Description. Cataloging and Classification Quarterly 33, no. 2 (2001): 17-28. Fiander, David J., and D. Grant Campbell. An XML Definition for an ISBD-Based Encoding Scheme. Journal of Internet Cataloging 6, no. 4 (2003): 29-58. Which is what happens when a computer type starts de novo with the cataloguing standards and builds simple data structures.
Re: [CODE4LIB] RDA in RDF, was: Something completely different
It's not off-topic, at least I don't think so. And I don't think anybody is asking to give up on catalogers. Just as I don't think anybody would want the technologists to describe the materials, I think the problem is that the catalogers tried to turn their idea of a data model directly into tangible technology.

Actually, I think the resource sharing argument is a red herring. A shift to resource-centricity (vs. record-centricity) just means that when you grab a new 'manifestation' for your local catalog, you may also have to grab the creator, the publisher, the series, the expression, the work, the subjects, etc. All of these can be bundled in the same XML document, though -- really it's just a different way of looking at the data, but it's not a radical departure in the delivery/discovery.

-Ross.

On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote: And what you hear over here is a plea to not give up on catalogers. Some are beyond ready to move from text to data. Hiding the data view -- do you mean making it look like marc? -- sounds pretty awful. Catalogers who are on board are trapped by the way sharing currently works, i.e. record sharing. If the leaders of the cataloging community are failing, what can catalogers do? This is an honest question, not a throwing-up-of-hands. Though maybe completely off-topic for this list. ah Karen Coyle wrote: Absolutely. The catalogers are still creating a textual document, not data. At best you can mark up the text, as we do with the MARC record. I worry that we won't be able to mesh the cataloger's view with a data view -- that the two are some how inherently opposed. I'd like to start modeling a new data format but I can't imagine how we can bridge the gap between the catalogers and the system view. I suppose a very clever interface could hide the data view from the catalogers, but starting from either AACR2 or RDA and trying to get there feels extremely difficult.
I guess my fear is that it will require compromises, and those will be hard to negotiate. kc p.s. The RDA element analysis is at http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf. That was the input to the registry. -- Anna Headley Swarthmore College Library 610.690.5781 ahead...@swarthmore.edu
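Ross's description of resource-centric sharing (grab the manifestation, then everything it points to) can be sketched as a graph walk. This is only an illustration: the `ex:` identifiers, the `links` key, and the function name `bundle` are hypothetical, and a real exchange format would carry the descriptions themselves, not just identifiers.

```python
from collections import deque

# Each resource lists the other resources its description refers to.
catalog = {
    "ex:manifestation1": {"links": ["ex:expression1", "ex:publisher1"]},
    "ex:expression1": {"links": ["ex:work1"]},
    "ex:work1": {"links": ["ex:creator1"]},
    "ex:creator1": {"links": []},
    "ex:publisher1": {"links": []},
}


def bundle(start, graph):
    """Collect the start resource and everything reachable from it,
    breadth-first, so the whole set can be shipped as one document."""
    seen = set()
    ordered = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in seen or node not in graph:
            continue
        seen.add(node)
        ordered.append(node)
        queue.extend(graph[node]["links"])
    return ordered


print(bundle("ex:manifestation1", catalog))
```

The receiving catalog gets the same information a shared record would have carried, but as separate, linkable resources.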
Re: [CODE4LIB] RDA in RDF, was: Something completely different
On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote: And what you hear over here is a plea to not give up on catalogers. Some are beyond ready to move from text to data. Hiding the data view -- do you mean making it look like marc? -- sounds pretty awful. Catalogers who are on board are trapped by the way sharing currently works, i.e. record sharing. If the leaders of the cataloging community are failing, what can catalogers do? This is an honest question, not a throwing-up-of-hands. Though maybe completely off-topic for this list.

Hear, hear. I don't think we'll see a real solution unless we consider both the tech-folks' and the catalogers' concerns. I'm also sympathetic to knowledge domains wanting to have control over the meaning of their data elements (to have a useful and well defined set). How we move forward when we have so much legacy data (and supporting systems), as Anna said, is a difficult problem.

Thanks for the plug, Roy. The check's in the mail. ;-)

Kevin

--
Kevin S. Clarke
Coordinator of Web Services
Belk Library Information Commons
Appalachian State University
218 College Street
Boone, NC 28608
clark...@appstate.edu
(828) 262-8472

There are two kinds of people in the world: those who believe there are two kinds of people and those who know better.
Re: [CODE4LIB] Something completely different
Also back to the original question, what is an ILS in the first place? The discussion has focused on bibliographic records, but that's just one part of what's in the ILS in use at the library where I work. I see one of the big problems with current ILSs being not so much the ILS per se, but library managers'/librarians' expectations that they should have a single core system that handles all the following functionality:

- maintaining a database of patron records with attached fine and fee information, which books they have out, what is waiting on the hold shelf for them, etc.
- maintaining a library accounting hierarchy with the ability to run reports like "it's halfway through the year and you've spent 90% of your budget for children's fiction"
- maintaining an acquisitions system so records for purchases are reflected into the accounting system and also as new bib records for on-order materials
- serials check-in so that missing issues can be claimed
- and of course a cataloging module and an OPAC.

Without the ability to support all the back-end processing and accounting, simply replacing the front-end OPAC and the bibliographic database does nothing to eliminate the need for an ILS, unless it also opens the way to feed data in and out of cheap off-the-shelf accounting and purchasing systems that aren't library-specific. A lot of libraries still won't want to put together even that much out of parts, and will prefer an ILS, but if it were me, I think I'd look at reengineering some of the parts to become more interchangeable with stuff like standard accounting software.

I must admit I was kind of horrified when I first got here and found that all this functionality was resident in a single system. No wonder these things are so honking expensive.
Genny Engel Sonoma County Library gen...@sonoma.lib.ca.us 707 545-0831 x581 www.sonomalibrary.org njv...@wisc.edu 04/07/09 08:59AM On Sun, Apr 5, 2009 at 10:40 AM, Peter Schlumpf pschlu...@earthlink.net wrote: I want to get back to simple things. Imagine if there were no Marc records. Minimal layers of abstraction. No politics. No vendors. No SQL straightjacket. What would an ILS look like without those things? Back to this original question, [...]
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Well, there's the project by Alistair Miles that Karen alluded to earlier: http://code.google.com/p/code4rda

The goals of this project are, in my mind, crucial in moving forward, since it's taking our existing corpus of records and turning them into RDA/RDF. Not only is it a good proof of concept to show how these new data models would look and work (esp. how they would work w/r/t current applications/workflows), but, more importantly, it shows it can be done *with our current data*, alleviating the need for some unrealistic retrospective recataloging effort.

I guess the way I look at it is, there's still time to fix this, at least technologically. There is a difference between the standard, the data model and the application. Karen posted a couple of weeks back that UKMARC didn't include punctuation, instead leaving it to technology to add it. This doesn't mean they didn't follow AACR2; they just didn't encode it explicitly into the data fields. Of course, they gave this up when they adopted MARC21.

Anyway, there's a separation of concerns that is currently being blurred, but doesn't have to be in practice.

-Ross.

On Tue, Apr 7, 2009 at 2:25 PM, Anna Headley ahead...@swarthmore.edu wrote:

But the first one to take this on has no one to grab from. The sharing argument may be a red herring in that the problem, from some perspectives, isn't so much about sharing one's own work -- it's more about using others' work. Or is there already a community of people doing something like what Ross describes? If so, where can I find out more about who, and how this works?

It seems to me that the best movements forward in this opening of data are centered on translating MARC into more web-usable forms. Which is great**... for everyone except catalogers with no love for MARC. Jakob makes a good point in the post that Rob pointed out (http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html)... when cataloging can look like LibraryThing, the rules *and, I would add, tools* we use seem incredibly bloated.

** I do mean great. We have to start somewhere. It's just that the cataloging pieces move so excruciatingly slowly.

ah

Ross Singer wrote:

It's not off-topic, at least I don't think so. And I don't think anybody is asking to give up on catalogers. Just like I don't think anybody would want the technologists to describe the materials, I think the problem is that the catalogers tried to apply their idea of a data model to tangible technology.

Actually, I think the resource sharing argument is a red herring. A shift to resource-centricity (vs. record-centricity) just means that when you grab a new 'manifestation' for your local catalog, you may also have to grab the creator, the publisher, the series, the expression, the work, the subjects, etc. All of these can be bundled in the same XML document, though -- really it's just a different way of looking at the data, but it's not a radical departure in the delivery/discovery.

-Ross.

On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote:

And what you hear over here is a plea to not give up on catalogers. Some are beyond ready to move from text to data. Hiding the data view -- do you mean making it look like MARC? -- sounds pretty awful. Catalogers who are on board are trapped by the way sharing currently works, i.e. record sharing. If the leaders of the cataloging community are failing, what can catalogers do? This is an honest question, not a throwing-up-of-hands. Though maybe completely off-topic for this list.

ah

Karen Coyle wrote:

Absolutely. The catalogers are still creating a textual document, not data. At best you can mark up the text, as we do with the MARC record. I worry that we won't be able to mesh the cataloger's view with a data view -- that the two are somehow inherently opposed. I'd like to start modeling a new data format but I can't imagine how we can bridge the gap between the catalogers and the system view. I suppose a very clever interface could hide the data view from the catalogers, but starting from either AACR2 or RDA and trying to get there feels extremely difficult. I guess my fear is that it will require compromises, and those will be hard to negotiate.

kc

p.s. The RDA element analysis is at http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf. That was the input to the registry.

-- Anna Headley Swarthmore College Library 610.690.5781 ahead...@swarthmore.edu
Re: [CODE4LIB] Something completely different
Which is why the interface specifications are at least as important as, if not more important than, the specs for each of the modules that you enumerated. If the interfaces are well-defined, then the components can be designed and developed with a minimum of further interactions among developers. In fact, there might eventually be more than one implementation of a particular module, allowing a library to assemble an ILS out of interchangeable components. (I'm assuming open source -- it seems unlikely that proprietary vendors will ever come around.)

Sharon M. Foster, 91.7% Librarian
Speaker-to-Computers
http://www.vsa-software.com/mlsportfolio/

On Tue, Apr 7, 2009 at 2:52 PM, Genny Engel gen...@sonoma.lib.ca.us wrote:

Also back to the original question, what is an ILS in the first place? [...]

Without the ability to support all the back-end processing and accounting, simply replacing the front-end OPAC and the bibliographic database does nothing to eliminate the need for an ILS, unless it also opens the way to feed data in and out of cheap off-the-shelf accounting and purchasing systems that aren't library-specific. A lot of libraries still won't want to put together even that much out of parts, and will prefer an ILS, but if it were me, I think I'd look at reengineering some of the parts to become more interchangeable with stuff like standard accounting software.

I must admit I was kind of horrified when I first got here and found that all this functionality was resident in a single system. No wonder these things are so honking expensive.

Genny Engel
Sonoma County Library
gen...@sonoma.lib.ca.us
707 545-0831 x581
www.sonomalibrary.org
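The point about interface specifications and interchangeable modules can be illustrated with a small Python sketch. Everything in it is hypothetical: the `Acquisitions` interface, its two method names, and both implementations are invented for illustration and do not describe any existing ILS or accounting package. The only thing it is meant to show is that calling code written against the interface works unchanged with either module.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Acquisitions(ABC):
    """Hypothetical interface that every acquisitions module must satisfy."""

    @abstractmethod
    def place_order(self, title: str, price_cents: int) -> str:
        """Record an order and return an order identifier."""

    @abstractmethod
    def total_committed(self) -> int:
        """Return the total of all orders placed so far, in cents."""


class SimpleAcquisitions(Acquisitions):
    """One implementation: keeps orders in a plain list."""

    def __init__(self) -> None:
        self._orders: list[tuple[str, int]] = []

    def place_order(self, title: str, price_cents: int) -> str:
        self._orders.append((title, price_cents))
        return f"simple-{len(self._orders)}"

    def total_committed(self) -> int:
        return sum(price for _title, price in self._orders)


class LedgerAcquisitions(Acquisitions):
    """Another implementation: a stand-in for an adapter to a
    general-purpose (not library-specific) accounting package."""

    def __init__(self) -> None:
        self._ledger: dict[str, int] = {}

    def place_order(self, title: str, price_cents: int) -> str:
        order_id = f"ledger-{len(self._ledger) + 1}"
        self._ledger[order_id] = price_cents
        return order_id

    def total_committed(self) -> int:
        return sum(self._ledger.values())


def order_titles(module: Acquisitions, wanted: list[tuple[str, int]]) -> int:
    """Caller code depends only on the interface, not on a particular module."""
    for title, price_cents in wanted:
        module.place_order(title, price_cents)
    return module.total_committed()
```

Swapping `SimpleAcquisitions()` for `LedgerAcquisitions()` in a call to `order_titles` requires no change to the caller, which is the property a well-defined interface is supposed to buy.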
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Ross Singer wrote:

Well, there's the project by Alistair Miles that Karen alluded to earlier: http://code.google.com/p/code4rda

The goals of this project are, in my mind, crucial in moving forward, since it's taking our existing corpus of records and turning them into RDA/RDF. Not only is it a good proof of concept to show how these new data models would look and work (esp. how they would work w/r/t current applications/workflows), but, more importantly, it shows it can be done *with our current data*, alleviating the need for some unrealistic retrospective recataloging effort.

I guess the way I look at it is, there's still time to fix this, at least technologically. There is a difference between the standard, the data model and the application.

An interesting experiment would be to attempt to use the cataloger's use cases that Alistair worked from, but instead of using the RDA vocabulary to use bibo+vocab.org/frbr. That would give us something comparable to look at. If bibo+frbr can do all or even a lot of what RDA does, then we can demonstrate a different model and explain why one is better than the other (or at least that more than one model will work).

kc

--
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596 skype: kcoylenet
fx.: 510-848-3913 mo.: 510-435-8234
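To make the proposed comparison a little more concrete, here is a minimal Python sketch of a bibo+frbr description held as plain triples. The statements are the non-RDA ones from Ross's Doctor Zhivago example earlier in this thread. The tuple representation and the helpers `vocabularies_used` and `types_of` are illustrative assumptions only; they are not part of bibo, FRBR, or any RDF library, and a real experiment would use a proper RDF toolkit.

```python
from collections import defaultdict

# The resource being described (URI taken from Ross's example).
BOOK = "http://lccn.loc.gov/95100870"

# Triples as (subject, predicate, object), with prefixed names kept as strings.
TRIPLES = [
    (BOOK, "rdf:type", "bibo:Book"),
    (BOOK, "dc:title", "Doctor Zhivago@en"),
    (BOOK, "dc:creator", "http://www.worldcat.org/identities/lccn-n79-18438"),
    (BOOK, "rdf:type", "frbr:Manifestation"),
    (BOOK, "frbr:embodimentOf", "http://dbpedia.org/resource/Doctor_Zhivago"),
]


def vocabularies_used(triples):
    """Count statements per vocabulary prefix.

    For rdf:type statements the prefix of the class is counted,
    otherwise the prefix of the predicate.
    """
    counts = defaultdict(int)
    for _subject, predicate, obj in triples:
        term = obj if predicate == "rdf:type" else predicate
        counts[term.split(":", 1)[0]] += 1
    return dict(counts)


def types_of(subject, triples):
    """Return every class the subject is asserted to belong to."""
    return sorted(o for s, p, o in triples if s == subject and p == "rdf:type")
```

Here `types_of(BOOK, TRIPLES)` returns both `bibo:Book` and `frbr:Manifestation`, which is the "one resource, several vocabularies" situation the comparison would need to examine.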
[CODE4LIB] Cyberinfrastructure Summer Internships for repository interoperability: application deadline reminder
*** Please disseminate widely to students at your institution ***

CYBERINFRASTRUCTURE SUMMER INTERNSHIPS 2009 - REMINDER: Student Application Deadline is April 13, 2009

http://hackathon.nescent.org/Cyberinfrastructure_Summer_Traineeships_2009

Summer training internships are available for up to four students and postdocs interested in informatics as applied to scientific data in such fields as biodiversity, ecology, and evolutionary biology. The program provides a unique opportunity for undergraduate, masters, and PhD students as well as postdocs to obtain hands-on experience writing and extending open-source software as part of a distributed collaborative software development team building a Virtual Data Center (VDC) that includes major data and metadata repositories in those fields. The application deadline for students (April 13, 2009) is approaching rapidly.

Trainees accepted into the program will receive a stipend ($4,500) and, with the exception of attending one meeting near the beginning and one near the end of the 3-month program period, may work from their home or home institution. Travel costs incurred in connection with the meetings will be reimbursed. Each student will have at least one dedicated mentor to show them the ropes and help them complete their project.

Initial project ideas are listed on the website. These range from validation of metadata and identifier resolution, to supporting LSID and semantic-web compliant PURLs for digital data objects, to implementing modern web-service APIs, to cataloging the diversity of metadata schemas. The project ideas are flexible and can be adjusted in scope to match the skills of the student. We also welcome novel project ideas that dovetail with student interests.

The program is supported by a National Science Foundation (NSF) grant to a consortium of major repositories for biodiversity, earth and environmental, ecological, and evolutionary science. The consortium includes the LTER Network Office, the U.S. Geological Survey, NASA and Oak Ridge National Laboratory, the Global Biodiversity Information Facility (GBIF), the National Evolutionary Synthesis Center (NESCent), and the National Center for Ecological Analysis and Synthesis (NCEAS). It aims to develop the cyberinfrastructure and technologies necessary to build a Virtual Data Center (VDC) based on a network of existing and new physical repositories (nodes) that interoperate using open standards and protocols. The network will enable discovery of, as well as open, stable, and secure access to, data in any of its member nodes.

TO APPLY: Students apply online. Instructions for applying are at the website (see "When you apply"), along with program rules and eligibility requirements. The 15-day application period for students ends on Monday, April 13th, 2009.

INQUIRIES: vdc-twg {at} ecoinformatics {dot} org. We strongly encourage all interested students to get in touch with us with their ideas as early as possible.

Cyberinfrastructure Traineeships Website: http://hackathon.nescent.org/Cyberinfrastructure_Summer_Traineeships_2009

To sign up for quarterly NESCent newsletters: http://www.nescent.org/about/contact.php

- Todd Vision and Hilmar Lapp
National Evolutionary Synthesis Center
http://nescent.org
Re: [CODE4LIB] registering info: uris?
No, that's not at all what it implies. The ofi/nam identifiers were minted as identifiers for namespaces of identifiers, not as a wrapper scheme for the identifiers themselves. Yes, it's a bit TOO meta, but they can be safely ignored unless a new profile is desired.

On Apr 5, 2009, at 10:31 AM, Karen Coyle wrote:

Jonathan Rochkind wrote:

URI for an ISBN or SuDocs? I don't think the GPO is going anywhere, but the GPO isn't committing to supporting an http URI scheme, and whoever is, who knows if they're going anywhere. That issue is certainly mitigated by Ross using purl.org for these, instead of his own personal http URI.

But another issue that makes us want a controlling authority is increasing the chances that everyone will use the _same_ URI. If GPO were behind the purl.org/NET/sudoc URIs, those chances would be high. With just Ross on his own, the chances go down; later someone else (OCLC, GPO, some other guy like Ross) might accidentally create a 'competitor', which would be unfortunate. Note this isn't as much of a problem for born-web resources -- nobody's going to accidentally create an alternate URI for a dbpedia term, because anybody that knows about dbpedia knows that it lives at dbpedia.

So those are my thoughts. Now everyone else can argue bitterly over them for a while. :)

The ones that really puzzle me, however, are the OpenURL info namespace URIs for ftp, http, https and info. This implies that EVERY identifier used by OpenURL needs an info URI, even if it is a URI in its own right. They are under info:ofi/nam, which is called "Namespace reserved for registry identifiers of namespaces". There's something so circular about this that I just get a brain dump when I try to understand it. Does it make sense to anyone?

kc

--
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596 skype: kcoylenet
fx.: 510-848-3913 mo.: 510-435-8234

Eric Hellman
http://hellman.net/eric/
Re: [CODE4LIB] Something completely different
An interesting thread! It will take a while for me to digest the ideas.

What I had in mind for something different is this: Think of a single database of only associations between objects, and nothing more than that. Objects defined in this database can reference any and all other objects in the database. These objects could represent anything: Title records or item records in an OPAC. A collection of files on a computer. Web sites. Links. Database queries. All of the above. Each object in this database contains just enough information to say that it exists and has a pointer to the thing in the outside world that it represents.

Although the basic system would allow the objects in it to link to each other in arbitrary ways, we could impose rules on it to create a system. An OPAC. A map. Other things that I can't think of right now.

I think a key thought here is that it is a database of pure relationships that can be set up and manipulated. But the descriptive data is stored elsewhere. It allows for an interesting extension too -- weighting those associations. Suppose we use it to create a search structure, and each time we go from one object referencing another we increment a counter for that link by one.

There are many ways to implement something like this, and I have one in mind, but this is sort of the theory behind it. It is going back to simple things.

Peter Schlumpf

-----Original Message-----
From: Karen Coyle li...@kcoyle.net
Sent: Apr 6, 2009 1:49 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Something completely different

Cloutman, David wrote:

I'm open to seeing new approaches to the ILS in general. A related question I had the other day, speaking of MARC, is what would an alternative bibliographic data format look like if it was designed with the intent of opening access to the data in our ILS systems to developers in a more informal manner? I was thinking of an XML format that a developer could work with without formal training,

Well, speaking of 'without formal training' -- I posted this to the Open Library technology list, but using the OL, which is triple-based and open access, I was able to create a simple demo Pipe of how you could determine the earliest date of publication of a book (with an interest in looking at potential copyright status). Caveat is that the API I'm using is still pretty stubby, so it only retrieves on exact title (this will be fixed sometime in the future). The pipe is here: http://pipes.yahoo.com/pipes/pipe.info?_id=216efa8c3b04764ca77ad181b1cc66e4

kc

the basics of which could be learned in an hour, and could reasonably represent the essential fields of the 90% of records that are most likely to be viewed by a public library patron. In my mind, such a format would allow creators of community-based web sites to pull data from their local library, and repurpose it without having to learn a lot of arcane formats (e.g. MARC) or esoteric protocols (e.g. Z39.50). The sacrifice, of course, would be losing some of the richness MARC allows, but I think in many common situations the really complex records are not what patrons are interested in.

You may want to consider prototyping this in your application. I see such an effort to be vital in making our systems relevant in future computing environments, and I am skeptical that a simple, workable solution would come out of the initial efforts of a standardization committee.

Just my 2 cents.

- David

---
David Cloutman dclout...@co.marin.ca.us
Electronic Services Librarian
Marin County Free Library

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Peter Schlumpf
Sent: Sunday, April 05, 2009 8:40 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Something completely different

Greetings!

I have been lurking on (or ignoring) this forum for years. And libraries too. Some of you may know me. I am the Avanti guy. I am, perhaps, the first person to try to produce an open source ILS back in 1999, though there is a David Duncan out there who tried before I did. I was there when all this stuff was coming together.

Since then I have seen a lot of good things happen. There's Koha. There's Evergreen. They are good things. I have also seen first hand how libraries get screwed over and over by commercial vendors with their crappy software. I believe free software is the answer to that.

I have neglected Avanti for years, but now I am ready to return to it. I want to get back to simple things. Imagine if there were no MARC records. Minimal layers of abstraction. No politics. No vendors. No SQL straitjacket. What would an ILS look like without those things? Sometimes the biggest prison is between the ears.

I am in a position to do this now, and that's what I have decided to do. I am getting busy.

Peter Schlumpf
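Peter's "database of pure relationships" with weighted associations, described at the top of this message, can be sketched in a few lines of Python. This is only one possible reading of the idea and not the implementation he says he has in mind: the class `AssociationStore`, its method names, and the `opac://` pointer strings are all hypothetical. Each object stores nothing but a pointer to the outside-world thing it represents, links are kept separately, and following a link increments that link's counter by one.

```python
class AssociationStore:
    """Objects hold only an external pointer; links carry a traversal count."""

    def __init__(self):
        self._pointers = {}  # object id -> pointer to the thing it represents
        self._links = {}     # source id -> {target id: times followed}

    def add_object(self, obj_id, pointer):
        """Register an object that merely points at something stored elsewhere."""
        self._pointers[obj_id] = pointer
        self._links.setdefault(obj_id, {})

    def link(self, source, target):
        """Create an association from source to target with weight 0."""
        self._require(source, target)
        self._links[source].setdefault(target, 0)

    def follow(self, source, target):
        """Traverse a link: bump its counter and return the target's pointer."""
        self._require(source, target)
        if target not in self._links[source]:
            raise KeyError(f"no link from {source!r} to {target!r}")
        self._links[source][target] += 1
        return self._pointers[target]

    def neighbours(self, source):
        """Linked objects, most frequently followed first (ties by id)."""
        links = self._links[source]
        return sorted(links.items(), key=lambda pair: (-pair[1], pair[0]))

    def _require(self, *obj_ids):
        for obj_id in obj_ids:
            if obj_id not in self._pointers:
                raise KeyError(f"unknown object {obj_id!r}")


def demo():
    """Build a tiny title/item graph and follow some links."""
    store = AssociationStore()
    store.add_object("title:1", "opac://titles/1")
    store.add_object("item:1", "opac://items/1")
    store.add_object("item:2", "opac://items/2")
    store.link("title:1", "item:1")
    store.link("title:1", "item:2")
    store.follow("title:1", "item:2")
    store.follow("title:1", "item:2")
    store.follow("title:1", "item:1")
    return store.neighbours("title:1")
```

In `demo()`, the link to `item:2` has been followed twice and the link to `item:1` once, so `item:2` sorts first. Rules such as "titles may only link to items" could be layered on top of `link` to turn the bare store into something like an OPAC.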