Technically it's 4 communities, but, yes, only two currently have "credible" registries in place.
-Ross.

On Thu, Apr 30, 2009 at 9:28 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
> Crosswalk is exactly the wrong answer for this. Two very small overlapping
> communities of mostly library developers can surely agree on using the same
> identifiers, and then we make things easier for US. We don't need to solve
> the entire universe of problems. Solve the simple problem in front of you in
> the simplest way that could possibly work and still leave room for future
> expansion and improvement. From that, we learn how to solve the big problems,
> when we're ready. Overreach and try to solve the huge problem including every
> possible use case, many of which don't apply to you but SOMEDAY MIGHT... and
> you end up with the kind of over-abstracted over-engineered
> too-complicated-to-actually-catch-on solutions that... we in the library
> community normally end up with.
> ________________________________________
> From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Peter Noerr
> [pno...@museglobal.com]
> Sent: Thursday, April 30, 2009 6:37 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule
> Them All
>
> Some further observations. So far this threadling has mentioned only trying
> to unify two different sets of identifiers. However there are a much larger
> number of them out there (and even larger numbers of schemas and other
> "standard-things-that-everyone-should-use-so-we-all-know-what-we-are-talking-about")
> and the problem exists for any of these things (identifiers, etc.) where
> there are more than one of them. So really unifying two sets of identifiers,
> while very useful, is not actually going to solve much.
>
> Is there any broader methodology we could approach which potentially allows
> multiple unifications or (my favourite) cross-walks. (Complete unification
> requires everybody agrees and sticks to it, and human history is sort of not
> on that track...)
> And who (people and organizations) would undertake this?
>
> Ross' point about a lightweight approach is necessary for any sort of
> adoption, but this is a problem (which plagues all we do in federated search)
> which cannot just be solved by another registry. Somebody/organisation has to
> look at the identifiers or whatever and decide that two of them are identical
> or, worse, only partially overlap and hence scope has to be defined. In a
> syntax that all understand of course. Already in this thread we have the
> sub/super case question from Karen (in a post on the openurl (or Z39.88
> <sigh> - identifiers!) listserv). And the various identifiers for MARC
> (below) could easily be for MARC-XML, MARC21-ISO2709, MARCUK-ISO2709. Now
> explain in words of one (computer understandable) syllable what the
> differences are.
>
> I'm not trying to make problems. There are problems and this is only a small
> subset of them, and they confound us every day. I would love to adopt
> standard definitions for these things, but which Standard? Because anyone can
> produce any identifier they like, we have decided that the unification of
> them has to be kept internal where we at least have control of the
> unifications, even if they change pretty frequently.
>
> Peter
>
>
> Dr Peter Noerr
> CTO, MuseGlobal, Inc.
>
> +1 415 896 6873 (office)
> +1 415 793 6547 (mobile)
> www.museglobal.com
>
>
>> -----Original Message-----
>> From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
>> Ross Singer
>> Sent: Thursday, April 30, 2009 12:00
>> To: CODE4LIB@LISTSERV.ND.EDU
>> Subject: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them
>> All
>>
>> Hello everybody. I apologize for the crossposting, but this is an
>> area that could (potentially) affect every one of these groups. I
>> realize that not everybody will be able to respond to all lists,
>> but...
>>
>> First of all, some back story (Code4Lib subscribers can probably skip
>> ahead):
>>
>> Jangle [1] requires URIs to explicitly declare the format of the data
>> it is transporting (binary marc, marcxml, vcard, DLF
>> simpleAvailability, MODS, EAD, etc.). In the past, it has used its
>> own URI structure for this (http://jangle.org/vocab/formats#...) but
>> this has always been with the intention of moving out of
>> jangle.org into a more "generic" space so it could be used by other
>> initiatives.
>>
>> This same concept came up in UnAPI [2] (I think this thread:
>> http://old.onebiglibrary.net/yale/cipolo/gcs-pcs-list/2006-March/thread.html#682
>> discusses it a bit - there is a reference there that it maybe had come
>> up before) although it was rejected ultimately in favor of an (optional)
>> approach more in line with how OAI-PMH disambiguates metadata formats.
>> That being said, this page used to try to set some sort of convention
>> around the UnAPI formats:
>> http://unapi.stikipad.com/unapi/show/existing+formats
>> But it's now just a squatter page.
>>
>> Jakob Voss pointed out that SRU has a schema registry and that it
>> would make sense to coordinate with this rather than mint new URIs for
>> things that have already been defined there:
>> http://www.loc.gov/standards/sru/resources/schemas.html
>>
>> This, of course, made a lot of sense.
>> It also made me realize that OpenURL *also* has a registry of metadata
>> formats:
>> http://alcme.oclc.org/openurl/servlet/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc&set=Core:Metadata+Formats
>>
>> The problem here is that OpenURL and SRW are using different info URIs
>> to describe the same things:
>>
>> info:srw/schema/1/marcxml-v1.1
>>
>> info:ofi/fmt:xml:xsd:MARC21
>>
>> or
>>
>> info:srw/schema/1/onix-v2.0
>>
>> info:ofi/fmt:xml:xsd:onix
>>
>> The latter technically isn't the same thing since the OpenURL one
>> claims it's an identifier for ONIX 2.1, but if I wasn't sending this
>> email now, eventually SRU would have registered
>> info:srw/schema/1/onix-v2.1
>>
>> There are several other examples, as well (MODS, ISO20775, etc.) and
>> it's not a stretch to envision more in the future.
>>
>> So there are a couple of questions here.
>>
>> First, and most importantly, how do we reconcile these different
>> identifiers for the same thing? Can we come up with some agreement on
>> which ones we should really use?
>>
>> Secondly, and this gets to the reason why any of this was brought up
>> in the first place, how can we coordinate these identifiers more
>> effectively and efficiently to reuse among various specs and
>> protocols, but not:
>> 1) be tied to a particular community
>> 2) require some laborious and lengthy submission and review process to
>> just say "hey, here's my FOAF available via UnAPI"
>> 3) be so lax that it throws all hope of authority out the window
>> ?
>>
>> I would expect the various communities to still maintain their own
>> registries of "approved" data formats (well, OpenURL and SRU, anyway
>> -- it's not as appropriate to UnAPI or Jangle).
>>
>> Does something like this interest any of you? Is there value in such
>> an initiative?
>>
>> Thanks,
>> -Ross.
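
For anyone who wants to see the reconciliation problem from the quoted message in concrete terms, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the info: URIs are the ones quoted in the thread, but the shared labels ("marcxml", "onix-2.0", "onix-2.1") and the function names are hypothetical and do not come from SRU, OpenURL, Jangle, or UnAPI. It treats the two MARCXML identifiers as equivalent and keeps the two ONIX identifiers apart, since the OpenURL one is said to refer to ONIX 2.1.

```python
# Hypothetical sketch: the labels and function names are invented for
# illustration. Only the info: URIs are taken from the thread.
EQUIVALENTS = {
    "marcxml": {
        "info:srw/schema/1/marcxml-v1.1",  # SRU schema registry
        "info:ofi/fmt:xml:xsd:MARC21",     # OpenURL registry
    },
    # Kept separate: the OpenURL identifier is said to mean ONIX 2.1,
    # while the SRU identifier names ONIX 2.0.
    "onix-2.0": {"info:srw/schema/1/onix-v2.0"},
    "onix-2.1": {"info:ofi/fmt:xml:xsd:onix"},
}

# Reverse index: identifier -> shared label.
CANONICAL = {
    uri: label
    for label, uris in EQUIVALENTS.items()
    for uri in uris
}


def canonical_format(uri):
    """Return the shared label for a known identifier, or None if unknown."""
    return CANONICAL.get(uri)


def same_format(a, b):
    """True only when both identifiers are known and share a label."""
    label_a = canonical_format(a)
    label_b = canonical_format(b)
    return label_a is not None and label_a == label_b
```

Whether a table like this lives in one shared registry or inside each application is exactly the point under debate above; the sketch only shows that somebody has to assert each equivalence by hand.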