Re: [CODE4LIB] Something completely different

2009-04-07 Thread Karen Coyle

Ross,

I'm not questioning the technical assertion -- obviously you can combine 
properties from different vocabularies. My problem is with making sense 
of FRBR in relation to the properties, either in RDA or in bibo. Do you 
say that a particular grouping of properties is of type 
FRBR:Manifestation, or is the property defined in the vocabulary as in 
the Manifestation domain? RDA does the latter (although not in a 
semantic web way). Each data element in RDA belongs to a particular 
FRBR entity, so you never actually use the FRBR entities in your 
metadata. (Although the examples that Alistair Miles did [1] use the 
levels as part of the record organization.) I actually prefer the usage 
that I gave in my examples, in which relationships carry the FRBR 
meaning and bibliographic properties can be used at any level.


The schema in the registry is completely flat partly because of the 
choice made by RDA to include the FRBR levels in the data elements 
themselves. The other 'partly' is because the creators of RDA are still 
pretty much thinking in terms of traditional bibliographic data, ISBD 
and MARC.


kc
[1] Linked from each scenario at 
http://dublincore.org/dcmirdataskgroup/Scenarios


Ross Singer wrote:

Right, ok, so an RDF graph can say the same resource is multiple
things at the same time, so that's how you deal with this:

<http://lccn.loc.gov/95100870> rdf:type bibo:Book .
<http://lccn.loc.gov/95100870> dc:title "Doctor Zhivago"@en .
<http://lccn.loc.gov/95100870> dc:creator
    <http://www.worldcat.org/identities/lccn-n79-18438> .
<http://lccn.loc.gov/95100870> rda:uniformTitle "Doktor Zhivago. English" .
<http://lccn.loc.gov/95100870> rdf:type rda:EditionStatement .
<http://lccn.loc.gov/95100870> rdf:type frbr:Manifestation .
<http://lccn.loc.gov/95100870> frbr:embodimentOf
    <http://dbpedia.org/resource/Doctor_Zhivago> .

I'm guessing on the RDA assertions, because the schema in the
metadataregistry doesn't make much sense to me.

Anyway, this shows how you can say multiple things from different
vocabularies for one resource.
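
A minimal sketch of the same idea in Python, assuming plain tuples stand in for RDF triples (no RDF library is used; the helper names types_of and vocabularies_used are made up for illustration). It only shows that one subject can carry types and properties from several vocabularies at once:

```python
# Triples as (subject, predicate, object) tuples; prefixed names are kept
# as plain strings. This is an illustration, not a real RDF store.
BOOK = "http://lccn.loc.gov/95100870"

triples = [
    (BOOK, "rdf:type", "bibo:Book"),
    (BOOK, "dc:title", '"Doctor Zhivago"@en'),
    (BOOK, "dc:creator", "http://www.worldcat.org/identities/lccn-n79-18438"),
    (BOOK, "rdf:type", "frbr:Manifestation"),
    (BOOK, "frbr:embodimentOf", "http://dbpedia.org/resource/Doctor_Zhivago"),
]


def types_of(subject, graph):
    """Return every rdf:type asserted for the subject, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == "rdf:type")


def vocabularies_used(subject, graph):
    """Return the sorted prefixes of the predicates used on the subject."""
    return sorted({p.split(":", 1)[0] for s, p, o in graph if s == subject})


print(types_of(BOOK, triples))
print(vocabularies_used(BOOK, triples))
```

Nothing in the data structure forces a resource to be only one kind of thing; that constraint, if wanted, would have to come from the vocabulary definitions.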

-Ross.
On Mon, Apr 6, 2009 at 8:10 PM, Karen Coyle li...@kcoyle.net wrote:
  

Jonathan Rochkind wrote:


I'm curious why you think that doesn't work. Isn't place of publication
a characteristic of a particular manifestation? And title, according to
traditional library practices where you take it from the title page, is also
a characteristic of a particular manifestation, is it not? (Uniform title
is _usually_ a characteristic of a work, unless we get into music cataloging
and some other 'edge' cases. Our traditional practices -- which aren't
actually changed that much by RDA -- are rather confusing.)
  

Well, I was responding to Ross' statement that bibo and FRBR could be used
in combination, depending on whether one was at that moment describing
'bibliographic things' or 'work things'. bibo doesn't have a uniform title,
so the question is: can you use a bibo title and say that it is a work
title? I thought that Ross was indicating something of that nature -- that
you could have a FRBR 'work thing' with bibo properties. I'm trying to
understand how that works since Work is a class. Don't you have to indicate
the domain and range of a property in its definition?

RDA tries to solve this by creating different properties for every
concept+FRBR entity: title of the work (Work), title proper (Manifestation).
[I don't understand why expressions don't have titles; a translation is
an expression, after all.]


I am confused about what one would do about the fact that RDA defines
attributes a bit differently than FRBR itself does. It's not too surprising:
FRBR is really just a draft, hardly tested in the world, and when RDA tried
to make it a bit more concrete, they found they had to make changes to make
it workable. Not sure what to do about that in the grand scheme of things,
if RDA and FRBR both end up registering different vocabularies. I guess
we'll just have two different vocabularies, which isn't too shocking.

  

I'm not sure there's anything to do, but I do know that the developers of
RDA feel very strongly that in RDA they have 'implemented' FRBR, so we have
to find a way to integrate FRBR and RDA in the registered RDA vocabulary. I
agree that there's no problem with having RDA and FRBR as two different
vocabularies, it's the effort of bringing them together that boggles me. I
feel like it leaves a lot of loose ends. I'd be happy to see FRBR revised,
or to have it re-defined without the attributes, thus allowing metadata
developers to use the bibliographic relationship properties with any set of
descriptive elements.

I'm having trouble with the FRBR Group 1 entities as classes. I see them
instead as relationships, and vocab.org does seem to treat them as
relationships, not as 'things.' I see a distinct difference between a person
entity and a work entity, because there is no thing that is a work. I see
work as a relationship 

Re: [CODE4LIB] Something completely different

2009-04-07 Thread Ross Singer
So, thanks to the help of my coworkers, here's the RDA Elements schema
reformatted in an easier to read presentation:
http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdf&input=&output=exhibit&callback=

I have to say I feel like this schema is trying to do way too
much and consequently loses the resource specificity that RDF would
provide.

For one thing, it seems to reinvent a _lot_ of wheels.  Why does it
define its own title property instead of using DC's?  By using
properties like titleOfTheWork, dateOfWork and all of the properties
that are specifically about TheSeries there is tremendous duplication
of text.  If Work was its own class, you would only need to say that this
manifestation was an embodimentOf it and reuse all of the
title-based properties for manifestation.  The series-specific
property names seem redundant as well: isn't SeriesStatement already
defining a series?  Why do you need titleProperOfSeries if you already
have titleProper?
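
A rough sketch of what that could look like, in Python rather than RDF. The Resource class and both helper functions are hypothetical, the title values are only illustrative, and the manifestation is linked straight to the work (as in the earlier example triples) rather than through an expression:

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A typed resource with literal properties and links to other resources."""
    uri: str
    rdf_type: str                                    # e.g. "frbr:Work"
    properties: dict = field(default_factory=dict)   # property name -> literal
    links: dict = field(default_factory=dict)        # relationship -> Resource


# One generic title property, reused at both levels (values are illustrative).
work = Resource(
    "http://dbpedia.org/resource/Doctor_Zhivago",
    "frbr:Work",
    {"dc:title": "Doktor Zhivago"},
)
manifestation = Resource(
    "http://lccn.loc.gov/95100870",
    "frbr:Manifestation",
    {"dc:title": "Doctor Zhivago"},
    {"frbr:embodimentOf": work},
)


def describe_title(resource):
    """The class of the resource, not the property name, says whose title it is."""
    return f"{resource.rdf_type} title: {resource.properties['dc:title']}"


def work_title(resource):
    """Follow the embodimentOf relationship and read the same generic property."""
    return resource.links["frbr:embodimentOf"].properties["dc:title"]


print(describe_title(manifestation))
print(work_title(manifestation))
```

With the class carried by the resource, a separate titleOfTheWork property has nothing left to say that the link plus a generic title does not already say.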

What does property 'uri' mean?

I also can't figure out how people/institutions are modeled in this
schema, since none of the elements have ranges.  Are they their own
resources?  If so, what?  The way it looks at a glance, they're
strings?

There are also different properties for dimensions, dimensionsOfMap,
dimensionsOfStillImage, etc.  Why is there any need for anything more
than 'dimensions'?  This is redefining what the resource 'is' in
multiple places, but the fact that this is a still image is stated
somewhere else, right?  If so, isn't it self-evident that the
dimensions are of a still image?
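
To illustrate, a small Python sketch that rewrites the type-specific property names as one generic 'dimensions' property plus an explicit type. The mapping table, the type names "Map" and "StillImage", and the sample value are assumptions for the example, not anything defined by the RDA schema:

```python
# Hypothetical mapping from type-specific property names to a generic
# property plus the resource type that the specific name was implying.
SPECIFIC_TO_GENERIC = {
    "dimensionsOfMap": ("dimensions", "Map"),
    "dimensionsOfStillImage": ("dimensions", "StillImage"),
}


def normalise(record):
    """Rewrite type-specific properties as a generic property plus a type."""
    out = {}
    for prop, value in record.items():
        generic, kind = SPECIFIC_TO_GENERIC.get(prop, (prop, None))
        out[generic] = value
        if kind is not None:
            out["type"] = kind
    return out


print(normalise({"dimensionsOfMap": "25 x 35 cm"}))
```

The information content is the same either way; the difference is that the type is stated once, on the resource, instead of being repeated in every property name.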

It seems to me that very little work was done to find preexisting
vocabularies to reuse and this schema still presents a very
'document-centric' or 'record-centric' view of data.

-Ross.


Re: [CODE4LIB] Something completely different

2009-04-07 Thread Nate Vack
On Sun, Apr 5, 2009 at 10:40 AM, Peter Schlumpf pschlu...@earthlink.net wrote:

 I want to get back to simple things.  Imagine if there were no Marc records.  
 Minimal layers of abstraction.  No politics.  No vendors.  No SQL 
 straightjacket.  What would an ILS look like without those things?

Back to this original question, when I imagine these things, I imagine
building an ILS that relies on an unusual data persistence backend,
discounts industry-standard data formats, and explicitly ignores the
political concerns of adopting, deploying, and maintaining it.

And I get a little bit nervous.

For what it's worth (and I think this touches on the ontological
discussion in this thread, too) -- my experience has been that it's
easier to build a piece of software that solves a problem
compellingly, clearing technical hurdles as you need to, than it is to
come up with solutions to anticipated technical problems before
starting on making a product.

More concretely: if you build a software product, I don't care at all
whether it's based on a SQL straitjacket or a luscious RDF comforter.
I care if it solves a problem well, and that I can install it and run
it easily.

Cheers,
-Nate


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Karen Coyle

Ross Singer wrote:

So, thanks to the help of my coworkers, here's the RDA Elements schema
reformatted in an easier to read presentation:
http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdf&input=&output=exhibit&callback=

I have to say I feel like this schema is trying to do way too
much and consequently loses the resource specificity that RDF would
provide.
  


Absolutely. I think there's a real issue that NO technology folks were 
involved in the creation of RDA. So this is data from a cataloger's 
perspective, and from the perspective of guidance rules for creating 
bibliographic data. I'm pretty sure that we can't create a viable data 
record using the RDA data elements, and I hate the idea that the data 
format, once again, is an afterthought rather than integral to the data 
creation standard.



For one thing, it seems to reinvent a _lot_ of wheels.  Why does it
define its own title property instead of using DC's? 


Because they wanted their own definition. Everything in the RDA element 
list has an RDA-specific meaning, which then makes it impossible to use 
any existing data properties. But there's more: RDA was defining RDA 
cataloging rules, not a schema or record format. Not only are there 
multiple data elements where one would do, there are things that are 
missing. For example, the FRBR place entity can ONLY be used as a 
subject, so it really means 'place as subject'. There's no general 
'place' element that could be used, for example, in 'place of 
publication'. The latter has no relationship to FRBR place. This is a 
FRBR problem as much as an RDA problem, but again FRBR functions at a 
conceptual level and doesn't really provide a schema that one can work with.



By using
properties like titleOfTheWork, dateOfWork and all of the properties
that are specifically about TheSeries there is tremendous duplication
of text.  If Work was its own class, you would only need to say that this
manifestation was an embodimentOf it and reuse all of the
title-based properties for manifestation. 


Exactly. This is what I've been saying (or trying to say) in relation to 
the bibo discussion. You should be able to use whatever properties you 
want with the FRBR classes, and not restrict data elements to a single 
class. This is a big problem in RDA, but I can say that when it was 
brought up to them (JSC) they strongly defended this choice and would 
not budge. RDA, to JSC, has a specific relationship to FRBR, and if you 
use a data element with a different FRBR class, then you are no longer 
doing RDA.


 
What does property 'uri' mean?
  


Did you look at the rdf/xml? I'm wondering if it isn't the display 
that's confusing.



I also can't figure out how people/institutions are modeled in this
schema, since none of the elements have ranges.  Are they their own
resources?  If so, what?  The way it looks at a glance, they're
strings?
  


EVERYTHING is strings at the moment, with a very very few exceptions 
(like some dates, I think). Some data elements CAN use a controlled 
vocabulary, but I believe that all of those are a mixture of 
uncontrolled and controlled strings. People and institutions are mainly 
undefined because that is in the FRAD realm. And FRAD hasn't been 
finalized. Also note that the JSC didn't feel it could do anything that 
would be too incompatible with the 'legacy' -- that is, with all of our 
AACR/MARC data.



It seems to me that very little work was done to find preexisting
vocabularies to reuse and this schema still presents a very
'document-centric' or 'record-centric' view of data.
  


Absolutely. The catalogers are still creating a textual document, not 
data. At best you can mark up the text, as we do with the MARC record. I 
worry that we won't be able to mesh the cataloger's view with a data 
view -- that the two are somehow inherently opposed. I'd like to start 
modeling a new data format but I can't imagine how we can bridge the gap 
between the catalogers and the system view. I suppose a very clever 
interface could hide the data view from the catalogers, but starting 
from either AACR2 or RDA and trying to get there feels extremely 
difficult. I guess my fear is that it will require compromises, and 
those will be hard to negotiate.


kc

p.s. The RDA element analysis is at 
http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf. 
That was the input to the registry.


--
---
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234



Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Rob Sanderson
See also the thread, 'RDA: A Standard Nobody Will Notice'.

http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html

A standard nobody will notice ... for good reason. 

Rob

On Tue, 2009-04-07 at 18:24 +0100, Eric Lease Morgan wrote:
 On Apr 7, 2009, at 1:15 PM, Karen Coyle wrote:
 
  Absolutely. The catalogers are still creating a textual document, not
  data. At best you can mark up the text, as we do with the MARC  
  record...
 
 
 Listen...  What you hear from over here is the sound of a very heavy  
 sigh coming from a computer type who really wants to help improve the  
 way library data is used in a networked environment, but they can't  
 convince their own to modify the way they encode information.
 


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread David Fiander
On Tue, Apr 7, 2009 at 1:24 PM, Eric Lease Morgan emor...@nd.edu wrote:
 Listen...  What you hear from over here is the sound of a very heavy sigh
 coming from a computer type who really wants to help improve the way library
 data is used in a networked environment, but they can't convince their own
 to modify the way they encode information.

See also

Fiander, David J. "Applying XML to the Bibliographic Description."
Cataloging and Classification Quarterly 33, no. 2 (2001): 17-28.

Fiander, David J., and D. Grant Campbell. "An XML Definition for an
ISBD-Based Encoding Scheme." Journal of Internet Cataloging 6, no. 4
(2003): 29-58.

Which is what happens when a computer type starts de novo with the
cataloguing standards and builds simple data structures.


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Ross Singer
Karen, thanks for this summary of the process.  It's pretty
disheartening, sadly.

I got 'uri' wrong, btw; it's 'Uniform Resource Locator':

<!--Property: Uniform resource locator-->
<rdf:Property rdf:about="http://RDVocab.info/Elements/uniformResourceLocator">
  <rdfs:label xml:lang="en">Uniform resource locator</rdfs:label>
  <skos:definition xml:lang="en">The address of a remote access resource.</skos:definition>
  <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements/" />
  <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002/" />
</rdf:Property>

But again, not exactly the best use of the tools at their disposal.
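
For anyone who wants to check a definition like this mechanically, here is a minimal sketch using Python's standard xml.etree.ElementTree. The snippet is a cut-down copy of the property above wrapped in an rdf:RDF element with the standard rdf, rdfs and skos namespace URIs; reg:status is left out because its namespace is not shown here, and the summarize function is a made-up helper. It reports whether a property declares rdfs:domain or rdfs:range:

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Cut-down copy of the property definition, wrapped so it parses on its own.
SNIPPET = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:skos="{SKOS}">
  <rdf:Property rdf:about="http://RDVocab.info/Elements/uniformResourceLocator">
    <rdfs:label xml:lang="en">Uniform resource locator</rdfs:label>
    <skos:definition xml:lang="en">The address of a remote access resource.</skos:definition>
    <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements/"/>
  </rdf:Property>
</rdf:RDF>
"""


def summarize(xml_text):
    """List each rdf:Property with its label and whether it has a domain/range."""
    root = ET.fromstring(xml_text)
    summary = []
    for prop in root.findall(f"{{{RDF}}}Property"):
        summary.append({
            "about": prop.get(f"{{{RDF}}}about"),
            "label": prop.findtext(f"{{{RDFS}}}label"),
            "has_domain": prop.find(f"{{{RDFS}}}domain") is not None,
            "has_range": prop.find(f"{{{RDFS}}}range") is not None,
        })
    return summary


for entry in summarize(SNIPPET):
    print(entry)
```

Run over this one property, both flags come back False, which matches the earlier observation that the elements carry no ranges.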

All this being said, it's really not too late to fix any of this,
since nobody is implementing this and, realistically, nobody ever
will.

-Ross.


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread David Fiander
Roy,

That's true. Unfortunately, I missed Kevin's talk at Access '02 in
Windsor, and since I wrote the first of those two papers I've mostly
been out of the loop, since it's not my area any more.

- David

On Tue, Apr 7, 2009 at 1:48 PM, Roy Tennant tenna...@oclc.org wrote:
 Well, and then you have the XOBIS work from Stanford that ksclarke was
 involved with.
 Roy





Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Ross Singer
It's not off-topic, at least I don't think so.

And I don't think anybody is asking to give up on catalogers.  Just
as I don't think anybody would want the technologists to describe
the materials, I think the problem is that the catalogers tried to
turn their idea of a data model into tangible technology.

Actually, I think the resource sharing argument is a red herring.  A
shift to resource-centricity (vs. record-centricity) just means that
when you grab a new 'manifestation' for your local catalog, you may
also have to grab the creator, the publisher, the series, the
expression, the work, the subjects, etc.  All of these can be bundled
in the same xml document, though -- really it's just a different way
of looking at the data, but it's not a radical departure in the
delivery/discovery.
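
A minimal sketch of that bundling step in Python, assuming a toy in-memory graph; the "ex:" identifiers, the relationship names and the bundle function are all hypothetical. Starting from a manifestation, it collects every resource reachable through its links so they can travel together:

```python
# Toy graph: each resource lists its outgoing relationships to other
# resources. All identifiers here are made up for the example.
graph = {
    "ex:manifestation": {"links": {"embodimentOf": "ex:expression",
                                   "publisher": "ex:publisher"}},
    "ex:expression": {"links": {"realizationOf": "ex:work"}},
    "ex:work": {"links": {"creator": "ex:person"}},
    "ex:person": {"links": {}},
    "ex:publisher": {"links": {}},
    "ex:unrelated": {"links": {}},
}


def bundle(start, graph):
    """Return the start resource plus everything reachable from it, sorted."""
    seen = set()
    stack = [start]
    while stack:
        uri = stack.pop()
        if uri in seen:
            continue
        seen.add(uri)
        # Follow every outgoing relationship of this resource.
        stack.extend(graph.get(uri, {}).get("links", {}).values())
    return sorted(seen)


print(bundle("ex:manifestation", graph))
```

The result is the set of resources that would go into one exchanged document; how they are serialized (xml or otherwise) is a separate question.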

-Ross.

On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote:
 And what you hear over here is a plea to not give up on catalogers.  Some
 are beyond ready to move from text to data.  Hiding the data view -- do you
 mean making it look like marc? -- sounds pretty awful.  Catalogers who are
 on board are trapped by the way sharing currently works, i.e. record
 sharing.  If the leaders of the cataloging community are failing, what can
 catalogers do?  This is an honest question, not a throwing-up-of-hands.
  Though maybe completely off-topic for this list.

 ah




 --
 Anna Headley
 Swarthmore College Library
 610.690.5781
 ahead...@swarthmore.edu



Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Kevin S. Clarke
On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote:

 And what you hear over here is a plea to not give up on catalogers.  Some
 are beyond ready to move from text to data.  Hiding the data view -- do you
 mean making it look like marc? -- sounds pretty awful.  Catalogers who are
 on board are trapped by the way sharing currently works, i.e. record
 sharing.  If the leaders of the cataloging community are failing, what can
 catalogers do?  This is an honest question, not a throwing-up-of-hands.
  Though maybe completely off-topic for this list.


Hear, hear.  I don't think we'll see a real solution unless we consider both
the tech-folks' and the catalogers' concerns.  I'm also sympathetic to
knowledge domains wanting to have control over the meaning of their data
elements (to have a useful and well defined set).  How we move forward when
we have so much legacy data (and supporting systems), as Anna said, is a
difficult problem.

Thanks for the plug, Roy.  The check's in the mail.  ;-)

Kevin

-- 
Kevin S. Clarke
Coordinator of Web Services
Belk Library & Information Commons
Appalachian State University
218 College Street
Boone, NC 28608

clark...@appstate.edu
(828) 262-8472

There are two kinds of people in the world: those who believe there are two
kinds of people and those who know better.


Re: [CODE4LIB] Something completely different

2009-04-07 Thread Genny Engel
Also back to the original question, what is an ILS in the first place?
 
The discussion has focused on bibliographic records, but that's just one part 
of what's in the ILS in use at the library where I work.  I see one of the big 
problems with current ILSs being not so much the ILS per se, but library 
managers'/librarians' expectations that they should have a single core system 
that handles all the following functionality:
 
- maintaining a database of patron records with attached fine and fee 
information, which books they have out, what is waiting on the hold shelf for 
them, etc.
 
- maintaining a library accounting hierarchy with the ability to run reports 
like 'it's halfway through the year and you've spent 90% of your budget for 
children's fiction'
 
- maintaining an acquisitions system so records for purchases are reflected 
into the accounting system and also as new bib records for on-order materials
 
- serials check-in so that missing issues can be claimed
 
- and of course a cataloging module and an OPAC.
 
Without the ability to support all the back-end processing and accounting, 
simply replacing the front-end OPAC and the bibliographic database does nothing 
to eliminate the need for an ILS, unless it also opens the way to feed data in 
and out of cheap off-the-shelf accounting and purchasing systems that aren't 
library-specific.  A lot of libraries still won't want to put together even 
that much out of parts, and will prefer an ILS, but if it were me, I think I'd 
look at reengineering some of the parts to become more interchangeable with 
stuff like standard accounting software.
 
I must admit I was kind of horrified when I first got here and found that all 
this functionality was resident in a single system.  No wonder these things are 
so honking expensive.
 
 
 
 
Genny Engel
Sonoma County Library
gen...@sonoma.lib.ca.us 
707 545-0831 x581
www.sonomalibrary.org 
 


 njv...@wisc.edu 04/07/09 08:59AM 
On Sun, Apr 5, 2009 at 10:40 AM, Peter Schlumpf pschlu...@earthlink.net wrote:

 I want to get back to simple things.  Imagine if there were no Marc records.  
 Minimal layers of abstraction.  No politics.  No vendors.  No SQL 
 straightjacket.  What would an ILS look like without those things?

Back to this original question, [...]


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Ross Singer
Well, there's the project by Alistair Miles that Karen alluded to earlier:

http://code.google.com/p/code4rda

The goals of this project are, in my mind, crucial in moving forward,
since it's taking our existing corpus of records and turning them into
RDA/RDF.  Not only is it a good proof of concept to show how these new
data models would look and work (esp. how they would work w/r/t
current applications/workflows), but, more importantly, it shows it
can be done *with our current data*, alleviating the need for some
unrealistic retrospective recataloging effort.

I guess the way I look at it is, there's still time to fix this, at
least technologically.  There is a difference between the standard,
the data model and the application.

Karen posted a couple of weeks back that UKMARC didn't include
punctuation, instead leaving it to technology to add it.  This doesn't
mean they didn't follow AACR2, they just didn't encode it into the
data fields, explicitly.  Of course, they gave this up when they
adopted MARC21.
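
As a sketch of that separation, in Python: the data fields hold only the content, and the ISBD-style punctuation is added when a display string is built. The function name and parameters are hypothetical and this is not a description of how any UKMARC system actually worked; it only uses the ISBD convention of " : " before other title information and " / " before the statement of responsibility:

```python
def isbd_title_statement(title, other_title_info=None, responsibility=None):
    """Join separately stored fields with ISBD prescribed punctuation."""
    text = title
    if other_title_info:
        text += " : " + other_title_info   # precedes other title information
    if responsibility:
        text += " / " + responsibility     # precedes statement of responsibility
    return text


# The stored values carry no punctuation; it appears only at display time.
print(isbd_title_statement("Doctor Zhivago", responsibility="Boris Pasternak"))
```

The cataloging rules decide what goes in each field, the data model keeps the fields apart, and the application decides how to render them.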

Anyway, there's a separation of concerns that is currently being
blurred, but doesn't have to be in practice.

-Ross.

On Tue, Apr 7, 2009 at 2:25 PM, Anna Headley ahead...@swarthmore.edu wrote:
 But the first one to take this on has no one to grab from.  The sharing
 argument may be a red herring in that the problem, from some perspectives,
 isn't so much about sharing one's own work -- it's more about using others'
 work.  Or is there already a community of people doing something like what
 Ross describes?  If so, where can I find out more about who, and how this
 works?

 It seems to me that the best movements forward in this opening of data are
 centered on translating marc into more web-usable forms.  Which is
 great**... for everyone except catalogers with no love for marc.  Jakob
 makes a good point in the post that Rob pointed out
 (http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html)... when
 cataloging can look like librarything, the rules *and, I would add, tools*
 we use seem incredibly bloated.

 ** I do mean great.  We have to start somewhere.  It's just that the
 cataloging pieces move so excruciatingly slowly.

 ah







Re: [CODE4LIB] Something completely different

2009-04-07 Thread Sharon Foster
Which is why the interface specifications are at least as important as,
if not more important than, the specs for each of the modules that you
enumerated. If the interfaces are well-defined, then the components
can be designed and developed with a minimum of further interactions
among developers. In fact, there might eventually be more than one
implementation of a particular module, allowing a library to assemble
an ILS out of interchangeable components. (I'm assuming open
source--it seems unlikely that proprietary vendors will ever come
around.)
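
As a rough sketch of what "well-defined interfaces, interchangeable components" could look like in code: the Circulation interface and both implementations below are hypothetical and not taken from any existing ILS. The only point is that the caller depends on the interface alone.

```python
from typing import Protocol


class Circulation(Protocol):
    """Hypothetical interface that any circulation module would satisfy."""

    def check_out(self, item_id: str, patron_id: str) -> bool: ...


class InMemoryCirculation:
    """One possible implementation: keeps current loans in a dict."""

    def __init__(self) -> None:
        self.loans = {}  # item id -> patron id

    def check_out(self, item_id: str, patron_id: str) -> bool:
        if item_id in self.loans:
            return False  # already on loan
        self.loans[item_id] = patron_id
        return True


class AlwaysRefuseCirculation:
    """A second, deliberately trivial implementation that lends nothing."""

    def check_out(self, item_id: str, patron_id: str) -> bool:
        return False


def lend(module: Circulation, item_id: str, patron_id: str) -> str:
    """The caller sees only the interface, never a concrete module."""
    return "ok" if module.check_out(item_id, patron_id) else "refused"
```

Either class can be handed to lend() without changing it, which is the property that would let a library swap one module for another.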

Sharon M. Foster, 91.7% Librarian
Speaker-to-Computers
http://www.vsa-software.com/mlsportfolio/






On Tue, Apr 7, 2009 at 2:52 PM, Genny Engel gen...@sonoma.lib.ca.us wrote:
 Also back to the original question, what is an ILS in the first place?
[...]
 Without the ability to support all the back-end processing and accounting, 
 simply replacing the front-end OPAC and the bibliographic database does 
 nothing to eliminate the need for an ILS, unless it also opens the way to 
 feed data in and out of cheap off-the-shelf accounting and purchasing systems 
 that aren't library-specific.  A lot of libraries still won't want to put 
 together even that much out of parts, and will prefer an ILS, but if it were 
 me, I think I'd look at reengineering some of the parts to become more 
 interchangeable with stuff like standard accounting software.

 I must admit I was kind of horrified when I first got here and found that all 
 this functionality was resident in a single system.  No wonder these things 
 are so honking expensive.




 Genny Engel
 Sonoma County Library
 gen...@sonoma.lib.ca.us
 707 545-0831 x581
 www.sonomalibrary.org



Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Anna Headley
But the first one to take this on has no one to grab from.  The sharing 
argument may be a red herring in that the problem, from some 
perspectives, isn't so much about sharing one's own work -- it's more 
about using others' work.  Or is there already a community of people 
doing something like what Ross describes?  If so, where can I find out 
more about who, and how this works?


It seems to me that the best movements forward in this opening of data 
are centered on translating marc into more web-usable forms.  Which is 
great**... for everyone except catalogers with no love for marc.  Jakob 
makes a good point in the post that Rob pointed out 
(http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html)... 
when cataloging can look like librarything, the rules *and, I would add, 
tools* we use seem incredibly bloated.


** I do mean great.  We have to start somewhere.  It's just that the 
cataloging pieces move so excruciatingly slowly.


ah




Ross Singer wrote:

It's not off-topic, at least I don't think so.

And I don't think anybody is asking to give up on catalogers.  Just
like I don't think anybody would want the technologists to describe
the materials, I think the problem is that the catalogers tried to
apply their idea of a data model into tangible technology.

Actually, I think the resource sharing argument is a red herring.  A
shift to resource-centricity (vs. record-centricity) just means that
when you grab a new 'manifestation' for your local catalog, you may
also have to grab the creator, the publisher, the series, the
expression, the work, the subjects, etc.  All of these can be bundled
in the same xml document, though -- really it's just a different way
of looking at the data, but it's not a radical departure in the
delivery/discovery.
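
A small sketch of the "grab the manifestation plus everything it points to" idea, using plain tuples as triples. The first four triples follow the Doctor Zhivago example from earlier in the thread; the foaf:name triple, the frbr:Work typing, and the "ex:unrelated" entry are assumptions added for illustration.

```python
BOOK = "http://lccn.loc.gov/95100870"
AUTHOR = "http://www.worldcat.org/identities/lccn-n79-18438"
WORK = "http://dbpedia.org/resource/Doctor_Zhivago"

TRIPLES = [
    (BOOK, "rdf:type", "frbr:Manifestation"),
    (BOOK, "dc:title", "Doctor Zhivago"),
    (BOOK, "dc:creator", AUTHOR),
    (BOOK, "frbr:embodimentOf", WORK),
    (AUTHOR, "foaf:name", "Boris Pasternak"),     # assumed triple
    (WORK, "rdf:type", "frbr:Work"),              # assumed triple
    ("ex:unrelated", "dc:title", "Some other book"),
]


def bundle(start, triples):
    """Collect every triple reachable from `start` by following object links."""
    subjects = {s for s, _, _ in triples}
    seen, queue, out = set(), [start], []
    while queue:
        node = queue.pop()
        if node in seen:
            continue
        seen.add(node)
        for s, p, o in triples:
            if s == node:
                out.append((s, p, o))
                if o in subjects:  # the object is itself a described resource
                    queue.append(o)
    return out


for triple in bundle(BOOK, TRIPLES):
    print(triple)
```

The result is one bundle that could be serialized into a single document, which is the sense in which delivery need not change much.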

-Ross.

On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote:
  

And what you hear over here is a plea to not give up on catalogers.  Some
are beyond ready to move from text to data.  Hiding the data view -- do you
mean making it look like marc? -- sounds pretty awful.  Catalogers who are
on board are trapped by the way sharing currently works, i.e. record
sharing.  If the leaders of the cataloging community are failing, what can
catalogers do?  This is an honest question, not a throwing-up-of-hands.
 Though maybe completely off-topic for this list.

ah


Karen Coyle wrote:


Absolutely. The catalogers are still creating a textual document, not
data. At best you can mark up the text, as we do with the MARC record. I
worry that we won't be able to mesh the cataloger's view with a data view --
that the two are somehow inherently opposed. I'd like to start modeling a
new data format but I can't imagine how we can bridge the gap between the
catalogers and the system view. I suppose a very clever interface could hide
the data view from the catalogers, but starting from either AACR2 or RDA and
trying to get there feels extremely difficult. I guess my fear is that it
will require compromises, and those will be hard to negotiate.

kc

p.s. The RDA element analysis is at
http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf.
That was the input to the registry.

  

--
Anna Headley
Swarthmore College Library
610.690.5781
ahead...@swarthmore.edu




--
Anna Headley
Swarthmore College Library
610.690.5781
ahead...@swarthmore.edu 


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Karen Coyle

Ross Singer wrote:

Well, there's the project by Alistair Miles that Karen alluded to earlier:

http://code.google.com/p/code4rda

The goals of this project are, in my mind, crucial in moving forward,
since it's taking our existing corpus of records and turning them into
RDA/RDF.  Not only is it a good proof of concept to show how these new
data models would look and work (esp. how they would work w/r/t
current applications/workflows), but, more importantly, it shows it
can be done *with our current data* alleviating the need for some
unrealistic retrospective recataloging effort.

I guess the way I look at it is, there's still time to fix this, at
least technologically.  There is a difference between the standard,
the data model and the application.
  


An interesting experiment would be to attempt to use the cataloger's use 
cases that Alistair worked from, but instead of using the RDA vocabulary 
to use bibo+vocab.org/frbr. That would give us something comparative to 
look at. If bibo+frbr can do all or even a lot of what RDA does, then we 
can demonstrate a different model and explain why one is better than the 
other (or at least that more than one model will work).


kc

--
---
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234



[CODE4LIB] Cyberinfrastructure Summer Internships for repository interoperability: application deadline reminder

2009-04-07 Thread Hilmar Lapp

*** Please disseminate widely to students at your institution ***

CYBERINFRASTRUCTURE SUMMER INTERNSHIPS 2009 -
  REMINDER: Student Application Deadline is April 13, 2009

http://hackathon.nescent.org/Cyberinfrastructure_Summer_Traineeships_2009


Summer training internships are available for up to four students and  
postdocs interested in informatics as applied to scientific data in  
such fields as biodiversity, ecology, and evolutionary biology. The  
program provides a unique opportunity for undergraduate, masters, and  
PhD students as well as postdocs to obtain hands-on experience  
writing and extending open-source software as part of a distributed  
collaborative software development team building a Virtual Data  
Center (VDC) that includes major data and metadata repositories in  
those fields.


The application deadline for students (April 13, 2009) is approaching  
rapidly.


Trainees accepted into the program will receive a stipend ($4,500),  
and with the exception of attending one meeting near the beginning  
and one near the end of the 3-month program period may work from  
their home, or home institution. Travel costs incurred in connection  
with the meetings will be reimbursed. Each student will have at least  
one dedicated mentor to show them the ropes and help them complete  
their project.


Initial project ideas are listed on the website. These range from  
validation of metadata and identifier resolution, to supporting LSID  
and semantic-web compliant PURLs for digital data objects, to  
implementing modern web-service APIs, to cataloging the diversity of  
metadata schemas. The project ideas are flexible and can be adjusted  
in scope to match the skills of the student. We also welcome novel  
project ideas that dovetail with student interests.


The program is supported by a National Science Foundation (NSF) grant  
to a consortium of major repositories for biodiversity, earth and  
environmental, ecological, and evolutionary science. The consortium  
includes the LTER Network Office, the U.S. Geological Survey, NASA  
and Oak Ridge National Laboratory, the Global Biodiversity  
Information Facility (GBIF), the National Evolutionary Synthesis  
Center (NESCent), and the National Center for Ecological Analysis and  
Synthesis (NCEAS). It aims to develop the cyberinfrastructure and  
technologies necessary to build a Virtual Data Center (VDC) based on  
a network of existing and new physical repositories (nodes) that  
interoperate using open standards and protocols. The network will  
enable discovery of as well as open, stable, and secure access to  
data in any of its member nodes.


TO APPLY: Students apply online. Instructions for applying are at the  
website (see "When you apply"), along with program rules and  
eligibility requirements.  The 15-day application period for students  
ends on Monday, April 13th, 2009.



INQUIRIES: vdc-twg {at} ecoinformatics {dot} org. We strongly  
encourage all interested students to get in touch with us with their  
ideas as early as possible.


Cyberinfrastructure Traineeships Website:
http://hackathon.nescent.org/Cyberinfrastructure_Summer_Traineeships_2009


To sign up for quarterly NESCent newsletters:
http://www.nescent.org/about/contact.php


-

Todd Vision and Hilmar Lapp
National Evolutionary Synthesis Center
http://nescent.org


Re: [CODE4LIB] registering info: uris?

2009-04-07 Thread Eric Hellman
no, that's not at all what it implies. the ofi/name identifiers were  
minted as identifiers for namespaces of identifiers, not as a wrapper  
scheme for the identifiers themselves. Yes, it's a bit TOO meta, but  
they can be safely ignored unless a new profile is desired.



On Apr 5, 2009, at 10:31 AM, Karen Coyle wrote:


Jonathan Rochkind wrote:


URI for an ISBN or SuDocs?  I don't think the GPO is going  
anywhere, but the GPO isn't committing to supporting an http URI  
scheme, and whoever is, who knows if they're going anywhere. That  
issue is certainly mitigated by Ross using purl.org for these,  
instead of his own personal http URI. But another issue that makes  
us want a controlling authority is increasing the chances that  
everyone will use the _same_ URI.  If GPO were behind the  
purl.org/NET/sudoc URIs, those chances would be high. With just Ross on  
his own, the chances go down; later someone else (OCLC, GPO, some other guy  
like Ross) might accidentally create a 'competitor', which would be  
unfortunate. Note this isn't as much of a problem for born web  
resources -- nobody's going to accidentally create an alternate URI  
for a dbpedia term, because anybody that knows about dbpedia knows  
that it lives at dbpedia.


So those are my thoughts. Now everyone else can argue bitterly over  
them for a while. :)


The ones that really puzzle me, however, are the OpenURL info  
namespace URIs for ftp, http, https and info. This implies that  
EVERY identifier used by OpenURL needs an info URI, even if it is a  
URI in its own right. They are under info:ofi/nam which is called  
"Namespace reserved for registry identifiers of namespaces". There's  
something so circular about this that I just get a brain dump when I  
try to understand it. Does it make sense to anyone?


kc


--
---
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234



Eric Hellman
http://hellman.net/eric/


Re: [CODE4LIB] Something completely different

2009-04-07 Thread Peter Schlumpf
An interesting thread!  It will take a while for me to digest the ideas.

What I had in mind for something different is this:  Think of a single database 
of only associations between objects, and nothing more than that.  Objects 
defined in this database can reference any and all other objects in the 
database.  These objects could represent anything:  Title records or item 
records in an opac.  A collection of files on a computer.  Web sites.  Links.  
Database queries.  All of the above.  Each object in this database contains 
just enough information to say that it exists and has a pointer to the thing in 
the outside world that it represents.

Although the basic system would allow the objects in it to link to each other in 
arbitrary ways, we could impose rules on it to create a system.  An OPAC.  A 
map. Other things that I can't think of right now.  I think a key thought here 
is that it is a database of pure relationships that can be set up and 
manipulated.  But the descriptive data is stored elsewhere.

It allows for an interesting extension too -- weighting those associations.  
Suppose we use it to create a search structure, and each time we go from one 
object referencing another we increment a counter for that link by one.
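
One way to picture this, as a minimal sketch only and not the implementation alluded to in this message: objects hold nothing but an id and a pointer to the outside world, links are stored separately, and following a link bumps its counter. The class and method names are hypothetical.

```python
from collections import defaultdict


class AssociationStore:
    """Objects are just an id plus a pointer to something outside the store."""

    def __init__(self):
        self.pointers = {}               # object id -> external pointer (URL, path, ...)
        self.links = defaultdict(set)    # from id -> set of linked ids
        self.weights = defaultdict(int)  # (from id, to id) -> traversal count

    def add(self, obj_id, pointer):
        self.pointers[obj_id] = pointer

    def link(self, a, b):
        self.links[a].add(b)

    def follow(self, a, b):
        """Traverse a -> b, increment that link's counter, return b's pointer."""
        if b not in self.links[a]:
            raise KeyError(f"no link {a} -> {b}")
        self.weights[(a, b)] += 1
        return self.pointers[b]

    def ranked(self, a):
        """Objects linked from `a`, most-followed first (ties broken by id)."""
        return sorted(self.links[a], key=lambda b: (-self.weights[(a, b)], b))
```

Rules such as "a title may only link to items" would sit on top of link(); the descriptive data itself lives wherever the pointers lead.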

There are many ways to implement something like this, and I have one in mind, 
but this is sort of the theory behind it.  It is going back to simple things.

Peter Schlumpf

-Original Message-
From: Karen Coyle li...@kcoyle.net
Sent: Apr 6, 2009 1:49 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Something completely different

Cloutman, David wrote:
 I'm open to seeing new approaches to the ILS in general. A related
 question I had the other day, speaking of MARC, is what would an
 alternative bibliographic data format look like if it was designed with
 the intent for opening access to the data in our ILS systems to developers
 in a more informal manner? I was thinking of an XML format that a
 developer could work with without formal training, 

Well, speaking of 'without formal training' -- I posted this to the Open 
Library technology list, but using the OL, which is triple-based and 
open access, I was able to create a simple demo Pipe of how you could 
determine the earliest date of publication of a book (with an interest 
in looking at potential copyright status). Caveat is that the API I'm using is 
still pretty stubby, so it only retrieves on exact title (this will be 
fixed sometime in the future).

The pipe is here:

http://pipes.yahoo.com/pipes/pipe.info?_id=216efa8c3b04764ca77ad181b1cc66e4
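
For anyone who would rather see the logic than the Pipe, here is a rough sketch of the same idea in plain code. It is not the Pipe itself and makes no Open Library API calls; the edition records and the "publish_date" field name are made-up stand-ins, and the exact-title match mirrors the limitation mentioned above.

```python
import re

# Made-up edition records standing in for whatever a real lookup returns.
editions = [
    {"title": "Doctor Zhivago", "publish_date": "1958"},
    {"title": "Doctor Zhivago", "publish_date": "March 1991"},
    {"title": "Doctor Zhivago", "publish_date": "c1957"},
    {"title": "Doctor Zhivago", "publish_date": ""},
]


def earliest_year(records, title):
    """Earliest four-digit year among records whose title matches exactly."""
    years = []
    for rec in records:
        if rec["title"] != title:
            continue
        match = re.search(r"\d{4}", rec.get("publish_date", ""))
        if match:
            years.append(int(match.group()))
    return min(years) if years else None


print(earliest_year(editions, "Doctor Zhivago"))
```

Records with no usable date are skipped rather than guessed at, so the answer is only as good as the dates that are present.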

kc

 the basics of which
 could be learned in an hour, and could reasonably represent the
 essential fields of the 90% of records that are most likely to be viewed
 by a public library patron. In my mind, such a format would allow
 creators of community-based web sites to pull data from their local
 library, and repurpose it without having to learn a lot of arcane
 formats (e.g. MARC) or esoteric protocols (e.g. Z39.50). The sacrifice,
 of course, would be losing some of the richness MARC allows, but I
 think in many common situations the really complex records are not what
 patrons are interested in. You may want to consider prototyping this in
 your application. I see such an effort to be vital in making our systems
 relevant in future computing environments, and I am skeptical that a
 simple, workable solution would come out of the initial efforts of a
 standardization committee.
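
 A sketch of how little code such a format might need on the consuming side. The element names below are invented for illustration and do not come from any existing standard; the parsing uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical "learn it in an hour" record; element names are assumptions.
SAMPLE = """
<record>
  <title>Doctor Zhivago</title>
  <author>Boris Pasternak</author>
  <year>1958</year>
</record>
"""


def simple_record(xml_text):
    """Flatten a one-level record into a dict of element name -> text."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


rec = simple_record(SAMPLE)
print(rec["title"], "by", rec["author"])
```

 Anything beyond a flat list of fields (repeated elements, nesting) would need more thought, which is where the lost richness of MARC starts to show.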

 Just my 2 cents.

 - David

 ---
 David Cloutman dclout...@co.marin.ca.us
 Electronic Services Librarian
 Marin County Free Library 

 -Original Message-
 From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
 Peter Schlumpf
 Sent: Sunday, April 05, 2009 8:40 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Something completely different


 Greetings!

 I have been lurking on (or ignoring) this forum for years.  And
 libraries too.  Some of you may know me.  I am the Avanti guy.  I am,
 perhaps, the first person to try to produce an open source ILS back in
 1999, though there is a David Duncan out there who tried before I did. I
 was there when all this stuff was coming together.

 Since then I have seen a lot of good things happen.  There's Koha.
 There's Evergreen.  They are good things.  I have also seen first hand
 how libraries get screwed over and over by commercial vendors with their
 crappy software.  I believe free software is the answer to that.  I have
 neglected Avanti for years, but now I am ready to return to it.

 I want to get back to simple things.  Imagine if there were no Marc
 records.  Minimal layers of abstraction.  No politics.  No vendors.  No
 SQL straightjacket.  What would an ILS look like without those things?
 Sometimes the biggest prison is between the ears.

 I am in a position to do this now, and that's what I have decided to do.
 I am getting busy.

 Peter Schlumpf

 Email Disclaimer: