Re: [RDA-L] RDA/Dublin Core

2011-04-25 Thread Diane I. Hillmann

 Mark & Judie:

Let me try to clarify some of this--I agree that it can be very confusing.

Dublin Core and the RDA Vocabularies are separate element vocabularies, 
and either one can be used by a digital asset management system 
(normally the system you choose already comes with something someone 
else has chosen).  Dublin Core has been 'finished' for some time and is 
much simpler, and many institutions and groups have used it and 
published freely available guidance for doing so (as has DCMI). The RDA 
Vocabularies are more complex than DC, can be used with or without the 
FRBR model, and are currently in the process of review. Although there 
are very detailed instructions for creating RDA records, they are only 
available via license (online) or purchase (hard copy).  Both 
vocabularies can be used to create XML descriptions, as well as with 
other encoding methods (RDF, RDFa, etc.), but that sort of help is a bit 
more accessible for DC at this stage.
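
As a rough illustration of the point that the same DC description can be written out as XML or kept in some other form, here is a minimal Python sketch. It assumes a hypothetical photograph record (every field value is invented) and uses the Dublin Core Metadata Element Set namespace http://purl.org/dc/elements/1.1/. The wrapper element name "record" and the helper name to_dc_xml are assumptions made for this example only, not part of any DC specification.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical description of one photograph; all values are invented.
record = {
    "title": "Soybean field trial, plot 12",
    "creator": "Extension photo staff",
    "date": "2010-07-14",
    "type": "StillImage",
    "format": "image/jpeg",
    "identifier": "photo-000123",
}


def to_dc_xml(fields):
    """Serialize element/value pairs as XML (one possible encoding)."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # wrapper name is an assumption
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# The same description without any XML at all: plain element/value lines.
plain_text = "\n".join(f"dc:{name}: {value}" for name, value in record.items())

xml_text = to_dc_xml(record)
```

The dictionary is the description; the XML string and the plain lines are two renderings of it. A DAM system that stores the element/value pairs in its own database is doing the same thing.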


There is currently no publicly available mapping between DC and RDA, but 
one will likely appear within a year or so.  There are 
definitely no current plans to change DC to accommodate RDA--that's 
extremely unlikely to happen (ever) and not necessary in any case.


These are not easy issues to describe or explain, and there's a lot of 
misinformation being shared 'out there' with good intentions but not 
much understanding.  If you really need help with this, you might want 
to suggest to your institution that buying some consultation time from 
someone who does know what they're doing is by far the cheapest 
alternative for getting a good start with a project (recent graduation 
from library school is not a guarantee of competence in this area, since 
most schools don't teach this stuff).  If you want to talk to me 
specifically about recommendations for consultants in your area, that 
conversation is better held offlist.


Regards,
Diane Hillmann
co-chair, DCMI/RDA Task Group


On 4/25/11 6:51 PM, Mark Rose wrote:

The XML of Dublin Core would relate to how the information was recorded only. 
Since you plan on using a digital asset management system, I am assuming there 
is some database in place that would record the DC metadata elements. There is 
a DC task group looking at incorporating RDA elements into DC 
http://dublincore.org/dcmirdataskgroup/. From what I understand, DC will have 
to be altered to account for the RDA elements. DC uses RDF vocabularies.

Mark Rose, B.A.Hons., M.I.St.
Librarian and Information Systems Manager
ICURR = Cirur
mr...@icurr.org
(647) 345-7004



-Original Message-
From: Resource Description and Access / Resource Description and Access on 
behalf of Cooper, Judith K.
Sent: Mon 4/25/2011 5:25 PM
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: [RDA-L] RDA/Dublin Core

Hi,
I'm trying to set up a DAM system for all of our photos, documents, 
publications and things that we have. Basically catalog it all. We aren't going 
to use MARC and are looking at Dublin Core, but really great explanations of 
how this works, especially in conjunction with RDA, are not easy to find, nor are 
examples. Or else I am not looking in the right places. I don't need to set it 
up using xml and can find no examples involving RDA and DC that do not also 
involve xml. Can anyone offer me some advice or am I just totally lost since my 
last cataloging class was 20 years ago? Thanks for any help that anyone can 
give me. I have looked at all of the basics at all of the RDA and DC sites and 
the ones that people were suggesting on here recently for FRBR, but they weren't 
giving me exactly what I needed.

Judie Cooper
Librarian
Extension and Agricultural Information
University of Missouri
1-98 Agriculture Building
Columbia, MO 65211

Phone: 573-884-9743
Fax: 573-882-8007
E-mail: coope...@missouri.edu
Web: http://extension.missouri.edu








Re: [RDA-L] Forest for the trees syndrome II : RDA

2011-04-25 Thread Schutt, Misha
J. McRee Elrod writes:
> Gene Fieg said:
>
> >Not to include certain fields, whether variable or fixed, does a disservice
> >to the patron who might be looking for specific types of information in
> >those books.
>
> Some standards should be considered as very low floors, not ceilings,
> for what we should be doing.

And yet, once a PCC library has added that damnable 042 ("ne cambietur" as 
opposed to "imprimatur"?), we who have not attained Enhance status can no 
longer add the fields that they in their unreliable wisdom have deemed 
unnecessary. (Of course, that's an OCLC Expert Community issue, not a BIBCO 
one.)

Misha Schutt
Catalog Librarian
Burbank (Calif.) Public Library
(818) 238 5570
msch...@ci.burbank.ca.us
www.burbanklibrary.com


[RDA-L] Workshop "Changes from AACR2 to RDA"

2011-04-25 Thread Newell,Rick
OCLC is pleased to offer a webinar series "Changes from AACR2 to RDA."
This webinar uses side-by-side examples in MARC format to show the most
significant changes between AACR2 and RDA cataloging practices.  The
webinar is presented in two parts: Part 1 covers description, and Part 2
covers access points.  

 

The presenter is Adam Schiff, Principal Cataloger at University of
Washington Libraries.  He has given this presentation at a number of
conferences; see his web page at http://faculty.washington.edu/aschiff/.
Mr. Schiff chaired one of the two JSC RDA Example Groups, focused on the
RDA chapters for identifying works and expressions, persons, families,
corporate bodies, and places; and on recording relationships.  He also
chaired CC:DA in 2000-2001.

 

The schedule for these workshops is:

Changes from AACR2 to RDA. Part 1: Description.  May 11, 2011, 1:00
PM-2:30 PM EDT

Changes from AACR2 to RDA. Part 2: Access Points. May 12, 2011, 1:00
PM-2:30 PM EDT

Changes from AACR2 to RDA. Part 1: Description.  July 13, 2011, 1:00
PM-2:30 PM EDT

Changes from AACR2 to RDA. Part 2: Access Points. July 14, 2011, 1:00
PM-2:30 PM EDT

 

Prerequisites include knowledge of AACR2 and MARC bibliographic and
authority formats; and familiarity with FRBR concepts, particularly the
FRBR Group 1 entities of work, expression, manifestation, and item.  For
background information about RDA, see
http://www.oclc.org/us/en/rda/about.htm.  For background information
about FRBR, see http://www.loc.gov/cds/downloads/FRBR.PDF

 

To register for these workshops, please visit the OCLC Training Portal
at http://training.oclc.org/training; then under Cataloging and
Metadata, select RDA.  

 

***

Rick Newell

Senior Training Coordinator

OCLC

T 800-848-5800 +1 +1

E  train...@oclc.org

**

 



Re: [RDA-L] Forest for the trees syndrome II : RDA

2011-04-25 Thread J. McRee Elrod
Gene Fieg said:

>Not to include certain fields, whether variable or fixed, does a disservice
>to the patron who might be looking for specific types of information in
>those books.
 
Some standards should be considered as very low floors, not ceilings,
for what we should be doing.
 
Omission of fields, in addition to the too general SMD "online
resource" (the "S" does stand for specific), is my problem with the
PCC provider neutral (PN) electronic record standard.  Among fields
omitted are 010$z, 506, and 530.

While it is good to describe the item on screen (as opposed to the
print original as the LCRI would have us do), the PN standard ignores
aggregator enhancements, differences such as presence or absence of
illustrations, illustrations in col. or b&w, etc.

It remains to be seen what adjustments will be made to the PCC PN
standard if/when RDA is implemented.  At least I hope they will adopt
the RDA exact unit name option, rather than repeating a 338 general
term in 300.  (An online resource might be a website or streaming
video, as well as an e-book.)  RDA does mention "digital file" as a
possible unit name.


   __   __   J. McRee (Mac) Elrod (m...@slc.bc.ca)
  {__  |   / Special Libraries Cataloguing   HTTP://www.slc.bc.ca/
  ___} |__ \__


Re: [RDA-L] Forest for the trees syndrome II : RDA

2011-04-25 Thread Jonathan Rochkind
So your argument is that every single possible field must be created to 
be the briefest informative record possible?  Really?


Regardless, that's an argument to take up with bibco/PCC I guess.

Apparently they decided that not every single possible field was 
necessary for a BIBCO Standard Record, and that the fields you are 
missing were not necessary.  You could take it up with them to argue 
that either every single possible field should be mandatory 
for a BIBCO Standard Record, or, even if not every single field, then 
some of the fields they have not mandated for BIBCO Standard Record 
ought to have been mandated. (That would presumably require more of a 
supporting argument for why those fields in particular ought to be 
mandated than "not to include any possible field is a disservice to our 
patrons." Although I guess that would be the former argument, that a 
BIBCO Standard Record ought to mandate that every possibly applicable 
MARC field be filled out if applicable. They clearly chose another path.)


On 4/25/2011 6:37 PM, Gene Fieg wrote:

Thanks for the documents on bibco records.
Not to include certain fields, whether variable or fixed, does a 
disservice to the patron who might be looking for specific types of 
information in those books.  Our goal should be to create not only the 
briefest record possible, but the briefest informative record 
possible.  Not including bibl. references does not fulfill that criterion.


On Mon, Apr 25, 2011 at 1:37 PM, Adam L. Schiff 
<asch...@u.washington.edu> wrote:


This record is coded as a BIBCO record.

The BIBCO Standard Record does not require the bibliographical
references and indexes note(s) nor most fixed fields to be filled
in.  The particular fields that Mr. Fieg criticizes as lacking are
not required for PCC records.

Please see the BIBCO Standard Record documentation at
http://www.loc.gov/catdir/pcc/bibco/BSR-MAPS.html  The textual
monographs metadata application profile is at
http://www.loc.gov/catdir/pcc/bibco/BSR_TM_3Sept-2010.pdf

Mr. Fieg's criticism of this record has nothing whatsoever to do
with deficiencies of RDA or even with cataloger error, since this
record fulfills the BIBCO Standard Record floor requirements.  One
may argue the merits and defaults of the specific requirements of
the standard, but those arguments were already held within the
PCC, and that is really for a completely different list than this
one.  Suffice it to say that the fields missing or uncoded that
Mr. Fieg complains about were not deemed essential elements needed
to support user tasks to find, identify, select, and obtain.  See
the Final Report of the Task Group on BIBCO Standard Record
Requirements at
http://www.loc.gov/catdir/pcc/bibco/BSR-Final-Report.pdf

Adam Schiff

^^
Adam L. Schiff
Principal Cataloger
University of Washington Libraries
Box 352900
Seattle, WA 98195-2900
(206) 543-8409
(206) 685-8782 fax
asch...@u.washington.edu 
http://faculty.washington.edu/~aschiff

~~


On Mon, 25 Apr 2011, Gene Fieg wrote:

OCLC record: *690085810: Fixed field for index should be
marked as "1".
It also should have 504 stating that it contains
bibliographical references and index.

I guess we are too busy adding fields 336-338 and forgetting
what may be truly useful to the patron.

--
Gene Fieg
Cataloger/Serials Librarian
Claremont School of Theology
gf...@cst.edu 




--
Gene Fieg
Cataloger/Serials Librarian
Claremont School of Theology
gf...@cst.edu 


Re: [RDA-L] RDA/Dublin Core

2011-04-25 Thread Mark Rose
The XML of Dublin Core would relate to how the information was recorded only. 
Since you plan on using a digital asset management system, I am assuming there 
is some database in place that would record the DC metadata elements. There is 
a DC task group looking at incorporating RDA elements into DC 
http://dublincore.org/dcmirdataskgroup/. From what I understand, DC will have 
to be altered to account for the RDA elements. DC uses RDF vocabularies.

Mark Rose, B.A.Hons., M.I.St.
Librarian and Information Systems Manager
ICURR = Cirur
mr...@icurr.org
(647) 345-7004



-Original Message-
From: Resource Description and Access / Resource Description and Access on 
behalf of Cooper, Judith K.
Sent: Mon 4/25/2011 5:25 PM
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: [RDA-L] RDA/Dublin Core
 
Hi,
I'm trying to set up a DAM system for all of our photos, documents, 
publications and things that we have. Basically catalog it all. We aren't going 
to use MARC and are looking at Dublin Core, but really great explanations of 
how this works, especially in conjunction with RDA, are not easy to find, nor are 
examples. Or else I am not looking in the right places. I don't need to set it 
up using xml and can find no examples involving RDA and DC that do not also 
involve xml. Can anyone offer me some advice or am I just totally lost since my 
last cataloging class was 20 years ago? Thanks for any help that anyone can 
give me. I have looked at all of the basics at all of the RDA and DC sites and 
the ones that people were suggesting on here recently for FRBR, but they weren't 
giving me exactly what I needed.
 
Judie Cooper
Librarian
Extension and Agricultural Information
University of Missouri
1-98 Agriculture Building
Columbia, MO 65211

Phone: 573-884-9743
Fax: 573-882-8007
E-mail: coope...@missouri.edu
Web: http://extension.missouri.edu  


 
 
 


Re: [RDA-L] Forest for the trees syndrome II : RDA

2011-04-25 Thread Gene Fieg
Thanks for the documents on bibco records.

Not to include certain fields, whether variable or fixed, does a disservice
to the patron who might be looking for specific types of information in
those books.  Our goal should be to create not only the briefest record possible,
but the briefest informative record possible.  Not including bibl.
references does not fulfill that criterion.

On Mon, Apr 25, 2011 at 1:37 PM, Adam L. Schiff wrote:

> This record is coded as a BIBCO record.
>
> The BIBCO Standard Record does not require the bibliographical references
> and indexes note(s) nor most fixed fields to be filled in.  The particular
> fields that Mr. Fieg criticizes as lacking are not required for PCC records.
>
> Please see the BIBCO Standard Record documentation at
> http://www.loc.gov/catdir/pcc/bibco/BSR-MAPS.html  The textual monographs
> metadata application profile is at
> http://www.loc.gov/catdir/pcc/bibco/BSR_TM_3Sept-2010.pdf
>
> Mr. Fieg's criticism of this record has nothing whatsoever to do with
> deficiencies of RDA or even with cataloger error, since this record fulfills
> the BIBCO Standard Record floor requirements.  One may argue the merits and
> defaults of the specific requirements of the standard, but those arguments
> were already held within the PCC, and that is really for a completely
> different list than this one.  Suffice it to say that the fields missing or
> uncoded that Mr. Fieg complains about were not deemed essential elements
> needed to support user tasks to find, identify, select, and obtain.  See the
> Final Report of the Task Group on BIBCO Standard Record Requirements at
> http://www.loc.gov/catdir/pcc/bibco/BSR-Final-Report.pdf
>
> Adam Schiff
>
> ^^
> Adam L. Schiff
> Principal Cataloger
> University of Washington Libraries
> Box 352900
> Seattle, WA 98195-2900
> (206) 543-8409
> (206) 685-8782 fax
> asch...@u.washington.edu
> http://faculty.washington.edu/~aschiff
> ~~
>
>
> On Mon, 25 Apr 2011, Gene Fieg wrote:
>
> OCLC record: *690085810: Fixed field for index should be marked as "1".
>> It also should have 504 stating that it contains bibliographical
>> references and index.
>>
>> I guess we are too busy adding fields 336-338 and forgetting what may be
>> truly useful to the patron.
>>
>> --
>> Gene Fieg
>> Cataloger/Serials Librarian
>> Claremont School of Theology
>> gf...@cst.edu
>>
>>


-- 
Gene Fieg
Cataloger/Serials Librarian
Claremont School of Theology
gf...@cst.edu


[RDA-L] RDA/Dublin Core

2011-04-25 Thread Cooper, Judith K.
Hi,
I'm trying to set up a DAM system for all of our photos, documents, 
publications and things that we have. Basically catalog it all. We aren't going 
to use MARC and are looking at Dublin Core, but really great explanations of 
how this works, especially in conjunction with RDA, are not easy to find, nor are 
examples. Or else I am not looking in the right places. I don't need to set it 
up using xml and can find no examples involving RDA and DC that do not also 
involve xml. Can anyone offer me some advice or am I just totally lost since my 
last cataloging class was 20 years ago? Thanks for any help that anyone can 
give me. I have looked at all of the basics at all of the RDA and DC sites and 
the ones that people were suggesting on here recently for FRBR, but they weren't 
giving me exactly what I needed.

Judie Cooper
Librarian
Extension and Agricultural Information
University of Missouri
1-98 Agriculture Building
Columbia, MO 65211

Phone: 573-884-9743
Fax: 573-882-8007
E-mail: coope...@missouri.edu
Web: http://extension.missouri.edu







Re: [RDA-L] Forest for the trees syndrome II : RDA

2011-04-25 Thread Adam L. Schiff

This record is coded as a BIBCO record.

The BIBCO Standard Record does not require the bibliographical references 
and indexes note(s) nor most fixed fields to be filled in.  The particular 
fields that Mr. Fieg criticizes as lacking are not required for PCC 
records.


Please see the BIBCO Standard Record documentation at 
http://www.loc.gov/catdir/pcc/bibco/BSR-MAPS.html  The textual monographs 
metadata application profile is at 
http://www.loc.gov/catdir/pcc/bibco/BSR_TM_3Sept-2010.pdf


Mr. Fieg's criticism of this record has nothing whatsoever to do with 
deficiencies of RDA or even with cataloger error, since this record 
fulfills the BIBCO Standard Record floor requirements.  One may argue the 
merits and defaults of the specific requirements of the standard, but 
those arguments were already held within the PCC, and that is really for a 
completely different list than this one.  Suffice it to say that the 
fields missing or uncoded that Mr. Fieg complains about were not deemed 
essential elements needed to support user tasks to find, identify, select, 
and obtain.  See the Final Report of the Task Group on BIBCO Standard 
Record Requirements at 
http://www.loc.gov/catdir/pcc/bibco/BSR-Final-Report.pdf


Adam Schiff

^^
Adam L. Schiff
Principal Cataloger
University of Washington Libraries
Box 352900
Seattle, WA 98195-2900
(206) 543-8409
(206) 685-8782 fax
asch...@u.washington.edu
http://faculty.washington.edu/~aschiff
~~

On Mon, 25 Apr 2011, Gene Fieg wrote:


OCLC record: *690085810: Fixed field for index should be marked as "1".
It also should have 504 stating that it contains bibliographical references and 
index.
 
I guess we are too busy adding fields 336-338 and forgetting what may be truly 
useful to the patron.

--
Gene Fieg
Cataloger/Serials Librarian
Claremont School of Theology
gf...@cst.edu



Re: [RDA-L] Linked files

2011-04-25 Thread Stephen Hearn
But submit to whom? I think PCC oversaw the last revision of the NACO
Heading Comparison rules (formerly NACO normalization). LC manages the
DCM, which is closer to being an internal document than the LCRIs have
been, and less open to community input.  (DCM's instructions on using
pairs of 670s in the 670/General section would also need changing.)
Does JSC need to rule on what RDA means to say about undifferentiated
names before LC can make a policy statement about them?

Regardless, this goes nowhere without LC and changes to the DCM. I've
tried to make the case with leaders there and have met with a counter
that presumes that undifferentiated authorities are used when the
persons, not their headings, can't be distinguished, which really
misunderstands how UndiffPNAs are structured and used. The DCM 670
instructions already make it clear that the persons on an UndiffPNA
are being distinguished from one another through the device of paired
670 fields. It's very frustrating.

Stephen

On Mon, Apr 25, 2011 at 2:57 PM, Mary Mastraccio  wrote:
>> My guess is there are other rules that I haven't spotted yet,
>> but these three--DCM Z1 008/32, NACO Heading Comparison, and
>> RDA/LCPS--would need to change to correct the current practice.
>
> The desire to have the UndifPNA practice/records changed has been expressed 
> repeatedly over the years. It seems to me that someone needs to step forward 
> to officially submit such a proposal. Can PCC, or similar group, be persuaded 
> to promote this change?
>
> Mary L. Mastraccio, MLS
> Cataloging & Authorities Librarian
> MARCIVE, Inc.
> San Antonio Texas 78265
> 1-800-531-7678
> ma...@marcive.com
> www.marcive.com
>



-- 
Stephen Hearn, Metadata Strategist
Technical Services, University Libraries
University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN 55455
Ph: 612-625-2328
Fx: 612-625-3428


[RDA-L] Forest for the trees syndrome II : RDA

2011-04-25 Thread Gene Fieg
OCLC record: *690085810: Fixed field for index should be marked as "1".
It also should have 504 stating that it contains bibliographical references
and index.

I guess we are too busy adding fields 336-338 and forgetting what may be
truly useful to the patron.

-- 
Gene Fieg
Cataloger/Serials Librarian
Claremont School of Theology
gf...@cst.edu


Re: [RDA-L] Linked files

2011-04-25 Thread Mary Mastraccio
> My guess is there are other rules that I haven't spotted yet, 
> but these three--DCM Z1 008/32, NACO Heading Comparison, and 
> RDA/LCPS--would need to change to correct the current practice.

The desire to have the UndifPNA practice/records changed has been expressed 
repeatedly over the years. It seems to me that someone needs to step forward to 
officially submit such a proposal. Can PCC, or similar group, be persuaded to 
promote this change?

Mary L. Mastraccio, MLS
Cataloging & Authorities Librarian
MARCIVE, Inc.
San Antonio Texas 78265
1-800-531-7678
ma...@marcive.com
www.marcive.com 


Re: [RDA-L] Linked files

2011-04-25 Thread Stephen Hearn
I've been trying to identify the linchpins in our documentation that
hold the sorry UndifPNA practice together. One is the DCM instruction
cited earlier. Another is the revised NACO Heading Comparison Rules
which forbid identical 100s. All AACR2 says is that identical
"headings" should be used in bib records when heading forms can't be
distinguished. It does not require that a single authority record be
created for persons with undifferentiated headings, and as John and
Diane point out, there's no need to do so. If differentiation is
managed elsewhere, the 100s (or more precisely, the encoded heading
texts) could be identical. The heading comparison rules could take
into account additional data not meant for display in the heading
text, like a difference between LCCNs.

There's more about managing undifferentiated names in RDA than there
ever was in AACR2. RDA instructs that the Undiff indicator must be
used when the core elements are not sufficient to distinguish two
names (e.g., RDA 8.3); but there may be room to argue that multiple
PNAs could carry the Undiff indicator to acknowledge that their 100s
are undifferentiated, without requiring that all the persons who share
an undifferentiated heading also share a single authority record.
Maybe that could be done with an LC Policy Statement. The point of the
008/32=b code would be to warn systems not to do automatic matching on
certain records' heading text strings, which is the practical value it
has now. VIAF and other smart systems avoid matches involving
UndiffPNAs. However, if the relationship between an authority record's
ID and the person it represents were fixed and consistent, then
systems using the LCCN identifier (or some synonymous ID) to match
between bib headings and authority records could safely link to an
UndiffPNA and thereby inherit any later changes to that person's
heading or authority record.

My guess is there are other rules that I haven't spotted yet, but
these three--DCM Z1 008/32, NACO Heading Comparison, and
RDA/LCPS--would need to change to correct the current practice.
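
A minimal sketch of the matching idea above, under stated assumptions: two authority records carry identical heading text but different identifiers. The identifier values, the Authority class, and the function names are hypothetical and stand in for whatever a real system would use; nothing here reflects actual NACO or LC software. It only shows why matching on the heading string is ambiguous while matching on a stable identifier is not, and how an identifier link inherits a later heading change.

```python
from dataclasses import dataclass


@dataclass
class Authority:
    lccn: str     # hypothetical identifier, not a real LCCN
    heading: str  # the heading text (what a 100 field would display)


# Two different people whose headings cannot be told apart by text alone.
authorities = {
    "n0000001": Authority("n0000001", "Smith, John"),
    "n0000002": Authority("n0000002", "Smith, John"),
}


def match_by_heading(heading):
    """String matching: returns every record whose heading text is identical."""
    return sorted(a.lccn for a in authorities.values() if a.heading == heading)


def heading_for(lccn):
    """Identifier matching: one identifier, one record, whatever the text says."""
    return authorities[lccn].heading


# A bib record that stores the identifier instead of relying on the string.
bib_link = "n0000002"

ambiguous = match_by_heading("Smith, John")  # both records match the text

# Later the second John Smith acquires a distinguishing date.
authorities["n0000002"].heading = "Smith, John, 1950-"
updated = heading_for(bib_link)  # the bib link picks up the new heading
```

This only works if each identifier keeps pointing at the same person for its whole life, which is the condition stated above.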

Stephen

On Mon, Apr 25, 2011 at 1:23 PM, Diane I. Hillmann  wrote:
>  Just to point out a few things here:
>
> If we were not making the text of the name serve double duty, we would be
> providing an identifier to every newly established name, and the description
> would provide information on where that name appeared (a title page, for
> instance), which would thereby provide a distinction between it and another
> authority description based on a different resource, where the name that
> displayed was the same.  In this new world, there would NEVER be a need for
> an UndiffPNA (thanks, Stephen, for the unpronounceable shortened name for
> this!).  If we ultimately discovered that this John Smith really was the
> same as THAT John Smith, we could associate them, BUT NOT HAVE TO CHANGE THE
> IDENTIFIERS.
>
> Consider the amount of sheer human grunt work we could avoid (not to mention
> the actual bucks), with absolutely no loss of quality control, by moving
> on from our traditional practices.  And why can't we convince people that
> this is better, cheaper, and much more sensible?  ARRRGGH.
>
> Diane
>
> On 4/25/11 1:42 PM, Stephen Hearn wrote:
>>
>> Actually it doesn't remain the same. The current rules say that
>> identities can and should move on and off of an undifferentiated
>> personal name authority (UndiffPNA). When an UndiffPNA is reduced to
>> representing a single identity again, it is recoded as "unique"
>> (UniqPNA), until another person with the same name gets added to it,
>> it becomes an UndiffPNA again, and so on--all under the same LCCN. So,
>> over time, the rules will require that a single LCNAF authority record
>> represent a string of unique persons:
>>
>> UniqPNA Smith, John (1)
>>             Smith, John (2) appears, and cannot be given a unique heading
>> UndiffPNA Smith, John (1) and Smith, John (2)
>>             Smith, John (1) acquires a distinguishing bit of data and
>> is given a separate, new record.
>> UniqPNA Smith, John (2)
>>             Smith, John (3) appears, and cannot be given a unique heading
>> UndiffPNA Smith, John (2) and Smith, John (3)
>>             Smith, John (2) acquires a distinguishing bit of data and
>> is given a separate, new record.
>> UniqPNA  Smith, John (3)
>>
>> I agree with Jonathan that "persons" are slippery, UndiffPNAs are
>> pretty useless, and that they should never revert to UniqPNAs; but the
>> rules instruct us otherwise (specifically, LC's Descriptive Cataloging
>> Manual, Section Z1, 008/32, which NACO follows: "When an
>> undifferentiated personal name authority record is being revised to
>> delete all but one name, change value "b" to "a." ").
>>
>> Stephen
>>
>>
>>
>>
>> On Mon, Apr 25, 2011 at 11:55 AM, Jonathan Rochkind
>>  wrote:
>>>
>>> I'd interpret it differently, I'd say that an "undifferentiated name
>>> authority" always refers to the same thing -- a sort of fake person that
>>

Re: [RDA-L] Linked files

2011-04-25 Thread Diane I. Hillmann

 Just to point out a few things here:

If we were not making the text of the name serve double duty, we would 
be providing an identifier to every newly established name, and the 
description would provide information on where that name appeared (a 
title page, for instance), which would thereby provide a distinction 
between it and another authority description based on a different 
resource, where the name that displayed was the same.  In this new 
world, there would NEVER be a need for an UndiffPNA (thanks, Stephen, 
for the unpronounceable shortened name for this!).  If we ultimately 
discovered that this John Smith really was the same as THAT John Smith, 
we could associate them, BUT NOT HAVE TO CHANGE THE IDENTIFIERS.


Consider the amount of sheer human grunt work we could avoid (not to 
mention the actual bucks), with absolutely no loss of quality control, 
by moving on from our traditional practices.  And why can't we convince 
people that this is better, cheaper, and much more sensible?  ARRRGGH.


Diane

On 4/25/11 1:42 PM, Stephen Hearn wrote:

Actually it doesn't remain the same. The current rules say that
identities can and should move on and off of an undifferentiated
personal name authority (UndiffPNA). When an UndiffPNA is reduced to
representing a single identity again, it is recoded as "unique"
(UniqPNA), until another person with the same name gets added to it,
it becomes an UndiffPNA again, and so on--all under the same LCCN. So,
over time, the rules will require that a single LCNAF authority record
represent a string of unique persons:

UniqPNA Smith, John (1)
            Smith, John (2) appears, and cannot be given a unique heading
UndiffPNA Smith, John (1) and Smith, John (2)
            Smith, John (1) acquires a distinguishing bit of data and
            is given a separate, new record.
UniqPNA Smith, John (2)
            Smith, John (3) appears, and cannot be given a unique heading
UndiffPNA Smith, John (2) and Smith, John (3)
            Smith, John (2) acquires a distinguishing bit of data and
            is given a separate, new record.
UniqPNA Smith, John (3)

I agree with Jonathan that "persons" are slippery, UndiffPNAs are
pretty useless, and that they should never revert to UniqPNAs; but the
rules instruct us otherwise (specifically, LC's Descriptive Cataloging
Manual, Section Z1, 008/32, which NACO follows: "When an
undifferentiated personal name authority record is being revised to
delete all but one name, change value "b" to "a." ").
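
The sequence above can be replayed as a small Python sketch. This is an illustration only: the record layout, the identifier "n0000001", and the helper names are hypothetical. The one rule taken from the quoted DCM instruction is that 008/32 goes back from "b" to "a" when all but one name has been removed; setting it to "b" when a second name is added is assumed as the reverse step.

```python
# One authority record that keeps the same (hypothetical) LCCN throughout.
record = {
    "lccn": "n0000001",
    "code_008_32": "a",  # "a" = unique, "b" = undifferentiated
    "persons": ["Smith, John (1)"],
}

history = []  # (008/32 value, persons on the record) after each step


def snapshot():
    history.append((record["code_008_32"], tuple(record["persons"])))


def add_person(person):
    """A new person with the same heading is added to the record."""
    record["persons"].append(person)
    record["code_008_32"] = "b"
    snapshot()


def remove_person(person):
    """A person gets a distinguishing datum and moves to a new record."""
    record["persons"].remove(person)
    if len(record["persons"]) == 1:
        record["code_008_32"] = "a"  # the recoding the DCM calls for
    snapshot()


snapshot()
add_person("Smith, John (2)")
remove_person("Smith, John (1)")
add_person("Smith, John (3)")
remove_person("Smith, John (2)")

codes = [code for code, _ in history]
first_person = history[0][1]
last_person = history[-1][1]
```

The record starts and ends coded as unique under one LCCN, yet the person it identifies at the end is not the person it identified at the start.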

Stephen




On Mon, Apr 25, 2011 at 11:55 AM, Jonathan Rochkind  wrote:

I'd interpret it differently, I'd say that an "undifferentiated name
authority" always refers to the same thing -- a sort of fake person that
isn't really a known person at all. But this remains the same, it's just the
way  it is.


Re: [RDA-L] Linked files

2011-04-25 Thread Lasater, Mary Charles
WHY does it take a new set of rules to make this change? If a name has been 
undifferentiated under AACR2, there is every reason to believe we will need 
that undifferentiated heading again under AACR2, even if the last 670 is 
removed.  I have been 'begging' for a workable change to undifferentiated 
personal names for years and years and years. Isn't this something that could 
be implemented NOW and not wait any longer? 

Mary Charles Lasater
Authorities Coordinator
Vanderbilt University

-Original Message-
From: Resource Description and Access / Resource Description and Access 
[mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of Jonathan Rochkind
Sent: Monday, April 25, 2011 12:54 PM
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: Re: [RDA-L] Linked files

On 4/25/2011 1:42 PM, Stephen Hearn wrote:
> Actually it doesn't remain the same. The current rules say that
> identities can and should move on and off of an undifferentiated
> personal name authority (UndiffPNA). When an UndiffPNA is reduced to
> representing a single identity again, it is recoded as "unique"
> (UniqPNA), until another person with the same name gets added to it,

Yep, I agree that's a mistake, for good identifier management.  The 
UndiffPNA should ideally remain an UndiffPNA, never magically change to 
a UniqPNA and then back again.


Re: [RDA-L] Linked files

2011-04-25 Thread Jonathan Rochkind

On 4/25/2011 1:42 PM, Stephen Hearn wrote:

Actually it doesn't remain the same. The current rules say that
identities can and should move on and off of an undifferentiated
personal name authority (UndiffPNA). When an UndiffPNA is reduced to
representing a single identity again, it is recoded as "unique"
(UniqPNA), until another person with the same name gets added to it,


Yep, I agree that's a mistake, for good identifier management.  The 
UndiffPNA should ideally remain an UndiffPNA, never magically change to 
a UniqPNA and then back again.


Re: [RDA-L] Linked files

2011-04-25 Thread Jonathan Rochkind
Yep, that's exactly why using URIs has become conventional; you've 
actually got it.


Instead of just using "12345678" as an identifier for an authority record, 
and running into the problems you talk about, you use something like:


http://id.loc.gov/subjects/12345678

Or whatever. This is in fact exactly the key to using URIs as 
identifiers: taking what would be unique just within ONE 
system/database, and making it truly universally unique.  
http://id.loc.gov/subjects/12345678 as opposed to 
http://some.other.vendor.com/12345678


That use of URIs as identifiers comes before any question of whether 
they are resolvable at all; it is simply a convenient way to make a 
universally unique identifier. LC is in control of "id.loc.gov", so they 
can use it as a namespace/prefix for their identifiers.  By putting the 
number in a URI, you have combined the number and "where it comes from" 
into one string that is universally unique amongst all other URI 
identifiers.
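
To make the "number AND where it comes from" point concrete, here is a 
minimal sketch in Python. The class name AuthorityId and its fields are 
made up for illustration, and the URI patterns are just the example 
strings from this message, not a statement of how id.loc.gov actually 
lays out its URIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorityId:
    """Hypothetical pairing of an issuing namespace with a local number."""
    namespace: str  # prefix controlled by whoever issues the identifier
    local_id: str   # unique only within that issuer's own file

    def uri(self) -> str:
        # Combine "where it comes from" and the number into one string.
        return f"{self.namespace.rstrip('/')}/{self.local_id}"


# The same bare number issued by two different sources (example values only).
lc = AuthorityId("http://id.loc.gov/subjects", "12345678")
vendor = AuthorityId("http://some.other.vendor.com", "12345678")

print(lc.local_id == vendor.local_id)  # the bare numbers collide
print(lc.uri() == vendor.uri())        # the URIs do not
```

The bare numbers can be re-used freely by different files, because the 
prefix is what keeps the combined strings apart.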


Ideally LC and our other authority agencies would establish these URIs.  
Which LC is slowly moving toward with id.loc.gov and such. But yeah, 
then we should use those URIs as our identifiers in ALL of our systems, 
not just the bare "12345678", precisely because it's preparing for 
combining identifiers from different "authority files" in the same system.


Make sense?

On 4/25/2011 1:13 PM, Adger Williams wrote:
I see where we're going here, but it may not be quite as bad as you 
think.  Our monthly updates from Marcive are indeed based on 010, not 
based on unstable character strings and I guess others' are also.  I 
hadn't reckoned with authority numbers in bib records, but, (like 
Mac's long-lamented UTLAS).


I do have to ask though...
If we have bzillions of links to different authority files, do we 
really require that all these links be unique?  (We are going to run 
into some very big numbers very fast if we can't re-use them.)  That 
would mean each link would be a number AND an indication of which 
authority file (registry, or what have you) that number was valid for, 
wouldn't it?


On Mon, Apr 25, 2011 at 12:55 PM, Jonathan Rochkind wrote:


I'd interpret it differently, I'd say that an "undifferentiated
name authority" always refers to the same thing -- a sort of fake
person that isn't really a known person at all. But this remains
the same, it's just the way  it is.

It turns out that "person" is a slippery concept in the first
place.  For instance, some people might define "Mark Twain" as a
separate person from "Samuel Clemens", and some might say those
are the same person. It doesn't defy any fundamental rule of
identifiers to call them separate "people" -- as long as we're
clear we're using a special technical meaning of "person", a sort
of "bibliographic person" in our database. This is totally fine,
we can do what makes sense for our domain.

Likewise, we're defining a sort of special "bibliographic person"
in the case of an "undifferentiated name authority".  It's a sort
of fake person where we aren't sure what the person is at all.
(It's NEVER been clear to me what the purpose of this is -- why
have an authority at all for an "undifferentiated name"? Why not
just leave it out of the authorities entirely?  But if it serves a
purpose for us, I don't see a huge problem in doing it).

But here's the important thing -- when you later separate out the
actual people from that former "undifferentiated name authority"
-- it's important you create NEW identifiers for those new
"differentiated" authorities, and NOT re-use the identifier you
used for the "undifferentiated name authority."  Then the "thing
identified" has _not_ changed -- the identifier for the
undifferentiated name authority _still_ identifies the same thing
-- the weird "bibliographic person" that is undifferentiated.
 This is fine. I mean, it may or may not be _convenient_, but it's
not fundamentally unsound.

Jonathan


On 4/25/2011 12:49 PM, Stephen Hearn wrote:

Another fundamental rule of identifiers is that what is identified
should not change significantly. That generally holds true in LC
authority practice, but not in the case of undifferentiated personal
name authorities. By rule and standard procedure, an LCCN for an
authority of this kind can refer uniquely to one person now, and a
different person later, and yet another person later still.

This is another reason why a system which restricts the data elements
that can be used to make a unique heading to the point that making a
unique heading is not always possible is problematic. Good identifiers
can always be made unique to correspond to a unique entity. LC/NACO
personal name headings can't, which is another re

Re: [RDA-L] Linked files

2011-04-25 Thread Stephen Hearn
Actually it doesn't remain the same. The current rules say that
identities can and should move on and off of an undifferentiated
personal name authority (UndiffPNA). When an UndiffPNA is reduced to
representing a single identity again, it is recoded as "unique"
(UniqPNA), until another person with the same name gets added to it;
then it becomes an UndiffPNA again, and so on--all under the same LCCN. So,
over time, the rules will require that a single LCNAF authority record
represent a string of unique persons:

UniqPNA Smith, John (1)
    Smith, John (2) appears, and cannot be given a unique heading
UndiffPNA Smith, John (1) and Smith, John (2)
    Smith, John (1) acquires a distinguishing bit of data and
    is given a separate, new record.
UniqPNA Smith, John (2)
    Smith, John (3) appears, and cannot be given a unique heading
UndiffPNA Smith, John (2) and Smith, John (3)
    Smith, John (2) acquires a distinguishing bit of data and
    is given a separate, new record.
UniqPNA Smith, John (3)

I agree with Jonathan that "persons" are slippery, UndiffPNAs are
pretty useless, and that they should never revert to UniqPNAs; but the
rules instruct us otherwise (specifically, LC's Descriptive Cataloging
Manual, Section Z1, 008/32, which NACO follows: "When an
undifferentiated personal name authority record is being revised to
delete all but one name, change value "b" to "a." ").
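
The sequence above can be replayed as a small Python sketch, purely to 
show what one identifier ends up denoting over time. The class 
NameAuthority, its methods, and the LCCN value are hypothetical; the 
only thing taken from the rule quoted above is that 008/32 is "b" for an 
undifferentiated record and becomes "a" when one name is left.

```python
from dataclasses import dataclass, field


@dataclass
class NameAuthority:
    """Hypothetical model of one authority record under one LCCN."""
    lccn: str
    identities: list = field(default_factory=list)
    history: list = field(default_factory=list)

    def code(self) -> str:
        # 008/32 as quoted above: "a" when one name remains, "b" otherwise.
        # (An empty record is not expected in this sketch.)
        return "a" if len(self.identities) == 1 else "b"

    def _log(self) -> None:
        # Record what the LCCN denotes after each change.
        self.history.append((self.code(), tuple(self.identities)))

    def add(self, person: str) -> None:
        self.identities.append(person)
        self._log()

    def move_off(self, person: str) -> None:
        # The person gets a separate, new record elsewhere.
        self.identities.remove(person)
        self._log()


record = NameAuthority(lccn="n-hypothetical-0001")  # made-up identifier
record.add("Smith, John (1)")
record.add("Smith, John (2)")
record.move_off("Smith, John (1)")
record.add("Smith, John (3)")
record.move_off("Smith, John (2)")

# Everyone this one LCCN has "uniquely" identified at some point.
uniquely_identified = [ids[0] for code, ids in record.history if code == "a"]
print(uniquely_identified)
# ['Smith, John (1)', 'Smith, John (2)', 'Smith, John (3)']
```

Anyone who stored that LCCN as a link to Smith, John (1) is silently 
pointed at someone else two steps later.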

Stephen




On Mon, Apr 25, 2011 at 11:55 AM, Jonathan Rochkind wrote:
> I'd interpret it differently, I'd say that an "undifferentiated name
> authority" always refers to the same thing -- a sort of fake person that
> isn't really a known person at all. But this remains the same, it's just the
> way  it is.

-- 
Stephen Hearn, Metadata Strategist
Technical Services, University Libraries
University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN 55455
Ph: 612-625-2328
Fx: 612-625-3428


Re: [RDA-L] Linked files

2011-04-25 Thread Adger Williams
I see where we're going here, but it may not be quite as bad as you think.
Our monthly updates from Marcive are indeed based on 010, not based on
unstable character strings and I guess others' are also.  I hadn't reckoned
with authority numbers in bib records, but, (like Mac's long-lamented
UTLAS).

I do have to ask though...
If we have bzillions of links to different authority files, do we really
require that all these links be unique?  (We are going to run into some very
big numbers very fast if we can't re-use them.)  That would mean each link
would be a number AND an indication of which authority file (registry, or
what have you) that number was valid for, wouldn't it?

On Mon, Apr 25, 2011 at 12:55 PM, Jonathan Rochkind wrote:

> I'd interpret it differently, I'd say that an "undifferentiated name
> authority" always refers to the same thing -- a sort of fake person that
> isn't really a known person at all. But this remains the same, it's just the
> way  it is.
>
> It turns out that "person" is a slippery concept in the first place.  For
> instance, some people might define "Mark Twain" as a separate person from
> "Samuel Clemens", and some might say those are the same person. It doesn't
> defy any fundamental rule of identifiers to call them separate "people" --
> as long as we're clear we're using a special technical meaning of "person",
> a sort of "bibliographic person" in our database. This is totally fine, we
> can do what makes sense for our domain.
>
> Likewise, we're defining a sort of special "bibliographic person" in the
> case of an "undifferentiated name authority".  It's a sort of fake person
> where we aren't sure what the person is at all. (It's NEVER been clear to me
> what the purpose of this is -- why have an authority at all for an
> "undifferentiated name"? Why not just leave it out of the authorities
> entirely?  But if it serves a purpose for us, I don't see a huge problem in
> doing it).
>
> But here's the important thing -- when you later separate out the actual
> people from that former "undifferentiated name authority" -- it's important
> you create NEW identifiers for those new "differentiated" authorities, and
> NOT re-use the identifier you used for the "undifferentiated name
> authority."  Then the "thing identified" has _not_ changed -- the identifier
> for the undifferentiated name authority _still_ identifies the same thing --
> the weird "bibliographic person" that is undifferentiated.  This is fine. I
> mean, it may or may not be _convenient_, but it's not fundamentally unsound.
>
> Jonathan
>
>
> On 4/25/2011 12:49 PM, Stephen Hearn wrote:
>
>> Another fundamental rule of identifiers is that what is identified
>> should not change significantly. That generally holds true in LC
>> authority practice, but not in the case of undifferentiated personal
>> name authorities. By rule and standard procedure, an LCCN for an
>> authority of this kind can refer uniquely to one person now, and a
>> different person later, and yet another person later still.
>>
>> This is another reason why a system which restricts the data elements
>> that can be used to make a unique heading to the point that making a
>> unique heading is not always possible is problematic. Good identifiers
>> can always be made unique to correspond to a unique entity. LC/NACO
>> personal name headings can't, which is another reason why they're not
>> good identifiers under the current rules.
>>
>> Stephen
>>
>> On Mon, Apr 25, 2011 at 11:20 AM, Jonathan Rochkind wrote:
>>
>>> I am talking about our library-community database as "the database
>>> [someone] is linking to."
>>>
>>> If we're always changing our identifiers (considering our authority 1xx
>>> "preferred display forms" to be identifiers), that makes it very hard for
>>> anyone to link to things in our database.
>>>
>>> Even just for our own database with its internal links, always changing
>>> the effective 'identifiers' (auth 1xx) makes our own housekeeping much
>>> more expensive for ourselves.
>>>
>>> Again, this is because of using the very same string (auth 1xx) as both a
>>> functional "identifier" and a functional "preferred display term".  A
>>> practice that is highly discouraged in actual contemporary
>>> software/metadata engineering, although it worked fine 100 years ago.
>>>
>>> Seriously, it is a fundamental idea in identifier management, decades
>>> old, that you should not change your identifiers, and for this reason you
>>> should not use strings you will be displaying to users as identifiers. One
>>> way this idea is expressed, for instance, is that you should not use a
>>> 'natural key' as a 'primary key' in a relational database. You can google
>>> on those terms if you want. In the sense that an rdbms pk serves as a kind
>>> of identifier, that is just one expression of the fundamental guideline
>>> not to change your identifiers, and thus not to use things you might want to 

Re: [RDA-L] Linked files

2011-04-25 Thread John Hostage
Undifferentiated personal name authority records exist only because we use the 
preferred label (the heading) as our identifiers.  They made sense in a card 
file, but not in a computer-based system.  It's surprising that RDA carried 
them over.  We should be creating a separate record for each bibliographic 
identity that we determine, even if they have the same name.  The identifier 
should be something unique and unchanging, like the LCCN.  Adding more things 
to the headings to make them unique is not the solution, because that would 
still be treating the heading as the identifier.

--
John Hostage
Authorities and Database Integrity Librarian
Langdell Hall
Harvard Law School Library
Cambridge, MA 02138
host...@law.harvard.edu
+(1)(617) 495-3974 (voice)
+(1)(617) 496-4409 (fax)
http://www.law.harvard.edu/library/

-Original Message-
From: Resource Description and Access / Resource Description and Access 
[mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of Stephen Hearn
Sent: Monday, April 25, 2011 12:49
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: Re: [RDA-L] Linked files

Another fundamental rule of identifiers is that what is identified
should not change significantly. That generally holds true in LC
authority practice, but not in the case of undifferentiated personal
name authorities. By rule and standard procedure, an LCCN for an
authority of this kind can refer uniquely to one person now, and a
different person later, and yet another person later still.

This is another reason why a system which restricts the data elements
that can be used to make a unique heading to the point that making a
unique heading is not always possible is problematic. Good identifiers
can always be made unique to correspond to a unique entity. LC/NACO
personal name headings can't, which is another reason why they're not
good identifiers under the current rules.


Re: [RDA-L] Linked files

2011-04-25 Thread James Weinheimer

On 04/25/2011 06:20 PM, Jonathan Rochkind wrote:

Seriously, it is a fundamental idea in identifier management, decades 
old, that you should not change your identifiers, and for this reason 
you should not use strings you will be displaying to users as 
identifiers. One way this idea is expressed, for instance, is that you 
should not use a 'natural key' as a 'primary key' in a relational 
database. You can google on those terms if you want. In the sense that 
an rdbms pk serves as a kind of identifier, that is just one 
expression of the fundamental guideline not to change your 
identifiers, and thus not to use things you might want to change as 
identifiers.


I am seriously not sure why you are arguing this, James.  This is a 
pretty fundamental concept of data design accepted by every single 
contemporary era data/database/metadata designer. This is probably my 
last post in this thread, this is getting frustrating to me.  Perhaps 
it's my fault in not being able to explain this concept adequately, in 
which case I don't think I can personally do any better than I've 
done.  Otherwise, I am not sure why you are insisting on arguing with 
a basic principle accepted by everyone else doing computer-era 
data/database/metadata design -- which has been proven in practice to 
be a really good principle. It's not a controversial principle.  At 
all.  Anywhere except among library catalogers, apparently.



The reason I am arguing this point is that it is something that can be 
done now, relatively inexpensively; otherwise, *nothing gets done at 
all*. For example, for all this discussion about RDA and how it promises 
the New Atlantis, and so on, for the foreseeable future the public will 
notice absolutely nothing.


I confess that these kinds of discussions get frustrating for me as 
well. Instituting URIs would give the entire community a library tool 
that it could use and might actually find useful--perhaps 
extremely useful. All the powers-that-be would have to do is ensure 
unambiguous access to current and earlier forms of headings. I *know* 
that that could be done easily enough, and I'm sure you do too. But no, 
everybody in the world has to be expected to add and change their 
headings to the identifiers of the powers-that-be, because otherwise 
things don't conform to the way they are supposed to work. How much 
incredible labor and expense does this entail? It is simply unrealistic 
to expect this to happen in the current situation and possible future, 
so the consequence is: nothing will get done. And who gets hurt? The 
patrons, and by extension, us, because we are seen as dinosaurs.


Of course I understand how identifiers are supposed to work, but *I 
don't care* how the system is "supposed" to work. I am by training a 
historian, and when I see that something "should never change" I just 
smile. Of course things will change and this must be built into *any 
system*, otherwise it is guaranteed to fail.


Right now, we need something that functions and that will make a 
substantive difference to our patrons. The cataloging profession sorely 
needs some successes, and that means coming up with creative solutions 
that people will see and--hopefully--appreciate. 70% or 80% today is 
certainly better than what we have now. It is frustrating to see some 
solutions, even temporary ones, and not have them.


It's a h*** of a way to run a business!

--
James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/


Re: [RDA-L] Linked files

2011-04-25 Thread Jonathan Rochkind
I'd interpret it differently, I'd say that an "undifferentiated name 
authority" always refers to the same thing -- a sort of fake person that 
isn't really a known person at all. But this remains the same, it's just 
the way  it is.


It turns out that "person" is a slippery concept in the first place.  
For instance, some people might define "Mark Twain" as a separate person 
from "Samuel Clemens", and some might say those are the same person. It 
doesn't defy any fundamental rule of identifiers to call them separate 
"people" -- as long as we're clear we're using a special technical 
meaning of "person", a sort of "bibliographic person" in our database. 
This is totally fine, we can do what makes sense for our domain.


Likewise, we're defining a sort of special "bibliographic person" in the 
case of an "undifferentiated name authority".  It's a sort of fake 
person where we aren't sure what the person is at all. (It's NEVER been 
clear to me what the purpose of this is -- why have an authority at all 
for an "undifferentiated name"? Why not just leave it out of the 
authorities entirely?  But if it serves a purpose for us, I don't see a 
huge problem in doing it).


But here's the important thing -- when you later separate out the actual 
people from that former "undifferentiated name authority" -- it's 
important you create NEW identifiers for those new "differentiated" 
authorities, and NOT re-use the identifier you used for the 
"undifferentiated name authority."  Then the "thing identified" has 
_not_ changed -- the identifier for the undifferentiated name authority 
_still_ identifies the same thing -- the weird "bibliographic person" 
that is undifferentiated.  This is fine. I mean, it may or may not be 
_convenient_, but it's not fundamentally unsound.
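
As a rough sketch of what "create NEW identifiers, never re-use the old 
one" could look like, here is a toy registry in Python. Everything in it 
(the Registry class, the "auth:N" identifier pattern, the sample 
headings) is invented for illustration and is not how any real authority 
file mints its numbers.

```python
import itertools


class Registry:
    """Hypothetical authority registry that never re-uses an identifier."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self.records = {}  # identifier -> {"label": ..., "kind": ...}

    def mint(self, label: str, kind: str) -> str:
        # Every new record gets a fresh identifier from the counter.
        identifier = f"auth:{next(self._counter)}"
        self.records[identifier] = {"label": label, "kind": kind}
        return identifier

    def split_out(self, undiff_id: str, label: str) -> str:
        # Separate one person out of an undifferentiated record.
        if self.records[undiff_id]["kind"] != "undifferentiated":
            raise ValueError(f"{undiff_id} is not undifferentiated")
        # The undifferentiated record is left exactly as it was.
        return self.mint(label, "differentiated")


registry = Registry()
undiff = registry.mint("Smith, John", "undifferentiated")
person = registry.split_out(undiff, "Smith, John, 1950-")  # made-up heading

print(undiff, registry.records[undiff]["kind"])
print(person, registry.records[person]["kind"])
```

The old identifier keeps meaning "the undifferentiated Smith, John" for 
as long as anyone still links to it, and the newly separated person gets 
an identifier that has never meant anything else.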


Jonathan

On 4/25/2011 12:49 PM, Stephen Hearn wrote:

Another fundamental rule of identifiers is that what is identified
should not change significantly. That generally holds true in LC
authority practice, but not in the case of undifferentiated personal
name authorities. By rule and standard procedure, an LCCN for an
authority of this kind can refer uniquely to one person now, and a
different person later, and yet another person later still.

This is another reason why a system which restricts the data elements
that can be used to make a unique heading to the point that making a
unique heading is not always possible is problematic. Good identifiers
can always be made unique to correspond to a unique entity. LC/NACO
personal name headings can't, which is another reason why they're not
good identifiers under the current rules.

Stephen

On Mon, Apr 25, 2011 at 11:20 AM, Jonathan Rochkind wrote:

I am talking about our library-community database as "the database [someone]
is linking to."

If we're always changing our identifiers (considering our authority 1xx
"preferred display forms" to be identifiers), that makes it very hard for
anyone to link to things in our database.

Even just for our own database with its internal links, always changing
the effective 'identifiers' (auth 1xx) makes our own housekeeping much more
expensive for ourselves.

Again, this is because of using the very same string (auth 1xx) as both a
functional "identifier" and a functional "preferred display term".  A
practice that is highly discouraged in actual contemporary software/metadata
engineering, although it worked fine 100 years ago.

Seriously, it is a fundamental idea in identifier management, decades old,
that you should not change your identifiers, and for this reason you should
not use strings you will be displaying to users as identifiers. One way this
idea is expressed, for instance, is that you should not use a 'natural key'
as a 'primary key' in a relational database. You can google on those terms
if you want. In the sense that an rdbms pk serves as a kind of identifier,
that is just one expression of the fundamental guideline not to change your
identifiers, and thus not to use things you might want to change as
identifiers.

I am seriously not sure why you are arguing this, James.  This is a pretty
fundamental concept of data design accepted by every single contemporary era
data/database/metadata designer. This is probably my last post in this
thread, this is getting frustrating to me.  Perhaps it's my fault in not
being able to explain this concept adequately, in which case I don't think I
can personally do any better than I've done.  Otherwise, I am not sure why
you are insisting on arguing with a basic principle accepted by everyone
else doing computer-era data/database/metadata design -- which has been
proven in practice to be a really good principle. It's not a controversial
principle.  At all.  Anywhere except among library catalogers, apparently.

Jonathan

On 4/25/2011 12:12 PM, James Weinheimer wrote:

On 04/25/2011 05:56 PM, Jonathan Rochkind wrote:


If you maintain the "preferred display form" as your _identifier_, then
whenever the preferr

Re: [RDA-L] Linked files

2011-04-25 Thread Mark Ehlert
Jonathan Rochkind wrote:
> This [database linkages and identifiers] is a pretty
> fundamental concept of data design accepted by every single contemporary era
> data/database/metadata designer. ... It's not a controversial
> principle.  At all.  Anywhere except among library catalogers, apparently.

This cataloger knows to a limited extent what you're talking about and
understands its utility, though I'm no network administrator nor
database manager.  At the same time, please don't tar all of us with
the same brush.

-- 
Mark K. Ehlert                 Minitex
Coordinator                    University of Minnesota
Bibliographic & Technical      15 Andersen Library
  Services (BATS) Unit        222 21st Avenue South
Phone: 612-624-0805            Minneapolis, MN 55455-0439



Re: [RDA-L] Linked files

2011-04-25 Thread Stephen Hearn
Another fundamental rule of identifiers is that what is identified
should not change significantly. That generally holds true in LC
authority practice, but not in the case of undifferentiated personal
name authorities. By rule and standard procedure, an LCCN for an
authority of this kind can refer uniquely to one person now, and a
different person later, and yet another person later still.

This is another reason why a system which restricts the data elements
that can be used to make a unique heading to the point that making a
unique heading is not always possible is problematic. Good identifiers
can always be made unique to correspond to a unique entity. LC/NACO
personal name headings can't, which is another reason why they're not
good identifiers under the current rules.

Stephen

On Mon, Apr 25, 2011 at 11:20 AM, Jonathan Rochkind wrote:
> I am talking about our library-community database as "the database [someone]
> is linking to."
>
> If we're always changing our identifiers (considering our authority 1xx
> "preferred display forms" to be identifiers), that makes it very hard for
> anyone to link to things in our database.
>
> Even just for our own database with its internal links, always changing
> the effective 'identifiers' (auth 1xx) makes our own housekeeping much more
> expensive for ourselves.
>
> Again, this is because of using the very same string (auth 1xx) as both a
> functional "identifier" and a functional "preferred display term".  A
> practice that is highly discouraged in actual contemporary software/metadata
> engineering, although it worked fine 100 years ago.
>
> Seriously, it is a fundamental idea in identifier management, decades old,
> that you should not change your identifiers, and for this reason you should
> not use strings you will be displaying to users as identifiers. One way this
> idea is expressed, for instance, is that you should not use a 'natural key'
> as a 'primary key' in a relational database. You can google on those terms
> if you want. In the sense that an rdbms pk serves as a kind of identifier,
> that is just one expression of the fundamental guideline not to change your
> identifiers, and thus not to use things you might want to change as
> identifiers.
>
> I am seriously not sure why you are arguing this, James.  This is a pretty
> fundamental concept of data design accepted by every single contemporary era
> data/database/metadata designer. This is probably my last post in this
> thread, this is getting frustrating to me.  Perhaps it's my fault in not
> being able to explain this concept adequately, in which case I don't think I
> can personally do any better than I've done.  Otherwise, I am not sure why
> you are insisting on arguing with a basic principle accepted by everyone
> else doing computer-era data/database/metadata design -- which has been
> proven in practice to be a really good principle. It's not a controversial
> principle.  At all.  Anywhere except among library catalogers, apparently.
>
> Jonathan
>
> On 4/25/2011 12:12 PM, James Weinheimer wrote:
>>
>> On 04/25/2011 05:56 PM, Jonathan Rochkind wrote:
>> 
>>>
>>> If you maintain the "preferred display form" as your _identifier_, then
>>> whenever the preferred display form changes, all those links will need to be
>>> changed.
>>>
>>> This is why contemporary computer-era identifier practice does NOT use
>>> "preferred display form" as an identifier. Because preferred display forms
>>> change, but identifiers ought not to. The identifier should be a
>>> _persistent_ link into your database for the identified record.
>>
>> 
>>
>> So long as the link from your database links unambiguously to the resource
>> you want to link to, that is all that matters. There are different ways of
>> allowing that. This function is most efficiently handled by the database you
>> are linking into, instead of the single database expecting everybody in the
>> world to change their own databases to add their URIs. For example, I could
>> add a link for the NAF form of Leo Tolstoy to dbpedia to interoperate with
>> it. If they had a special search for exact NAF form, like in the VIAF, it
>> would definitely be unambiguous.
>>
>> My point is: this is something that is achievable. Probably through a
>> relatively simple API, it could be implemented in every catalog pretty
>> easily. There is just no hope that each catalog will add URIs within any
>> reasonable amount of time.
>>
>> Certainly, if we were creating things from scratch, we could redo
>> everything in a way that would be better for us (there is no doubt in my mind that
>> future information specialists/catalogers 80 years from now will be
>> complaining about whatever we make), but you must play the cards you are
>> dealt and be creative with what you have.  Perhaps it wouldn't be perfect,
>> or maybe it would, I don't know, but in any case, it would be vastly better
>> than what we have now and people could start discovering and using our
>> records in new wa

Re: [RDA-L] Linked files

2011-04-25 Thread Jonathan Rochkind
I am talking about our library-community database as "the database 
[someone] is linking to."


If we're always changing our identifiers (considering our authority 1xx 
"preferred display forms" to be identifiers), that makes it very hard 
for anyone to link to things in our database.


Even just for our own database with its internal links, always 
changing the effective 'identifiers' (auth 1xx) makes our own 
housekeeping much more expensive for ourselves.


Again, this is because of using the very same string (auth 1xx) as both 
a functional "identifier" and a functional "preferred display term".  A 
practice that is highly discouraged in actual contemporary 
software/metadata engineering, although it worked fine 100 years ago.


Seriously, it is a fundamental idea in identifier management, decades 
old, that you should not change your identifiers, and for this reason 
you should not use strings you will be displaying to users as 
identifiers. One way this idea is expressed, for instance, is that you 
should not use a 'natural key' as a 'primary key' in a relational 
database. You can google on those terms if you want. In the sense that 
an rdbms pk serves as a kind of identifier, that is just one expression 
of the fundamental guideline not to change your identifiers, and thus 
not to use things you might want to change as identifiers.
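
For anyone who would rather see the natural key / primary key point than 
google it, here is a small sketch using Python's built-in sqlite3 
module. The table and column names are made up, and the headings are 
only illustrative strings; the point is that the bib record links to an 
arbitrary number, so the display form can change without touching any 
links.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'id' is a surrogate key; 'heading' is the preferred display form.
    CREATE TABLE authority (
        id      INTEGER PRIMARY KEY,
        heading TEXT NOT NULL
    );
    -- Bib records link to the authority by id, never by heading text.
    CREATE TABLE bib (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authority(id)
    );
""")

conn.execute(
    "INSERT INTO authority (id, heading) VALUES (1, 'Twain, Mark, 1835-1910')"
)
conn.execute(
    "INSERT INTO bib (title, author_id) "
    "VALUES ('Adventures of Huckleberry Finn', 1)"
)

# The preferred display form changes: one row is updated, no links are.
conn.execute(
    "UPDATE authority SET heading = 'Clemens, Samuel Langhorne, 1835-1910' "
    "WHERE id = 1"
)

row = conn.execute(
    "SELECT b.title, a.heading FROM bib b "
    "JOIN authority a ON a.id = b.author_id"
).fetchone()
print(row)
```

If the heading string itself had been the key, that one UPDATE would 
instead have to be repeated in every bib record, in every database that 
had copied it.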


I am seriously not sure why you are arguing this, James.  This is a 
pretty fundamental concept of data design accepted by every single 
contemporary era data/database/metadata designer. This is probably my 
last post in this thread, this is getting frustrating to me.  Perhaps 
it's my fault in not being able to explain this concept adequately, in 
which case I don't think I can personally do any better than I've done.  
Otherwise, I am not sure why you are insisting on arguing with a basic 
principle accepted by everyone else doing computer-era 
data/database/metadata design -- which has been proven in practice to be 
a really good principle. It's not a controversial principle.  At all.  
Anywhere except among library catalogers, apparently.


Jonathan

On 4/25/2011 12:12 PM, James Weinheimer wrote:

On 04/25/2011 05:56 PM, Jonathan Rochkind wrote:

If you maintain the "preferred display form" as your _identifier_, 
then whenever the preferred display form changes, all those links 
will need to be changed.


This is why contemporary computer-era identifier practice does NOT 
use "preferred display form" as an identifier. Because preferred 
display forms change, but identifiers ought not to. The identifier 
should be a _persistent_ link into your database for the identified 
record.



So long as the link from your database links unambiguously to the 
resource you want to link to, that is all that matters. There are 
different ways of allowing that. This function is most efficiently 
handled by the database you are linking into, instead of the single 
database expecting everybody in the world to change their own 
databases to add their URIs. For example, I could add a link for the 
NAF form of Leo Tolstoy to dbpedia to interoperate with it. If they 
had a special search for exact NAF form, like in the VIAF, it would 
definitely be unambiguous.


My point is: this is something that is achievable. Probably through a 
relatively simple API, it could be implemented in every catalog pretty 
easily. There is just no hope that each catalog will add URIs within 
any reasonable amount of time.


Certainly, if we were creating things from scratch, we could redo 
everything in a way that would be better for us (there is no doubt in my mind 
that future information specialists/catalogers 80 years from now will 
be complaining about whatever we make), but you must play the cards 
you are dealt and be creative with what you have.  Perhaps it wouldn't 
be perfect, or maybe it would, I don't know, but in any case, it would 
be vastly better than what we have now and people could start 
discovering and using our records in new ways.




Re: [RDA-L] Linked files

2011-04-25 Thread James Weinheimer

On 04/25/2011 05:56 PM, Jonathan Rochkind wrote:

If you maintain the "preferred display form" as your _identifier_, 
then whenever the preferred display form changes, all those links will 
need to be changed.


This is why contemporary computer-era identifier practice does NOT use 
"preferred display form" as an identifier. Because preferred display 
forms change, but identifiers ought not to. The identifier should be a 
_persistent_ link into your database for the identified record.



So long as the link from your database links unambiguously to the 
resource you want to link to, that is all that matters. There are 
different ways of allowing that. This function is most efficiently 
handled by the database you are linking into, instead of the single 
database expecting everybody in the world to change their own databases 
to add their URIs. For example, I could add a link for the NAF form of 
Leo Tolstoy to dbpedia to interoperate with it. If they had a special 
search for exact NAF form, like in the VIAF, it would definitely be 
unambiguous.


My point is: this is something that is achievable. Probably through a 
relatively simple API, it could be implemented in every catalog pretty 
easily. There is just no hope that each catalog will add URIs within any 
reasonable amount of time.


Certainly, if we were creating things from scratch, we could redo 
everything that would be better for us (there is no doubt in my mind 
that future information specialists/catalogers 80 years from now will be 
complaining about whatever we make), but you must play the cards you are 
dealt and be creative with what you have.  Perhaps it wouldn't be 
perfect, or maybe it would, I don't know, but in any case, it would be 
vastly better than what we have now and people could start discovering 
and using our records in new ways.


--
James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/


[RDA-L] Forest for the trees syndrome (aka RDA)

2011-04-25 Thread Gene Fieg
Book in hand: God's empire / Hilary M. Carey.

OCLC record: *656771606

I won't change the record in OCLC, but for our library here, we will
transform it back to AACR2.
While we are spelling out pages and illustrations, etc., I guess we forgot
to include the information that the book includes maps.  That tidbit of info
could be important to an historian reading this work, but such info is
neither in the fixed field nor in the variable one.

So, at least for this record, we will use "ill.", and "p."  When and if RDA
is approved, I hope we will not get in the habit of leaving out important
bits of info that the patron might want to know about.

-- 
Gene Fieg
Cataloger/Serials Librarian
Claremont School of Theology
gf...@cst.edu


Re: [RDA-L] Linked files

2011-04-25 Thread Jonathan Rochkind
If you maintain the "preferred display form" as your _identifier_, then 
whenever the preferred display form changes, all those links will need 
to be changed.


This is why contemporary computer-era identifier practice does NOT use 
"preferred display form" as an identifier. Because preferred display 
forms change, but identifiers ought not to. The identifier should be a 
_persistent_ link into your database for the identified record.


On 4/25/2011 11:39 AM, James Weinheimer wrote:

On 04/25/2011 04:27 PM, Jonathan Rochkind wrote:

I agree entirely, controlled headings from authority files ARE a sort 
of archaic version of identifiers and should be considered as such.


The thing is, that they aren't all that successful as identifiers in 
the modern environment.  For instance, just as the most obvious 
example, you NEVER want to _change_ an identifier.  Yet, our 
authority file headings sometimes get changed (from a rename of an 
LCSH heading, to adding a death date to an author).  Violates pretty 
much the first most basic rule of modern identifiers.


It's no surprise that an identifier system our community invented 
nearly a hundred years ago before computers really existed does not 
perform very well as identifiers in the present environment. But it's 
still the truth. I think you're absolutely right that we should 
understand these legacy controlled headings as a sort of identifier 
-- that will help us understand better how to use them and convert 
them in the modern environment. But important to remember they are a 
sort of ancient identifier system, which is ill-suited in several 
ways for the contemporary environment.



So long as the URI links unambiguously to the correct concept, it 
should not matter. In the new environment, it only makes sense that 
one conceptual resource could have many URIs. With the VIAF for 
example, we see how each heading is unambiguously linked in a variety 
of ways based on their own forms. In a correct system, the label that 
people see will be controlled by the searcher him or herself.


To see how it could work, look at dbpedia for Leo Tolstoy 
http://dbpedia.org/page/Leo_Tolstoy with all of the redirects. That is 
the dbpedia URI. So long as our superseded forms are handled in an 
unambiguous way, as they are now (with very few exceptions I think, 
essentially for forms that take on qualifiers, but I am sure these 
could be handled unambiguously too), the system should still work.


I think it is important to get something to demonstrate ASAP. If we 
expect that everyone is supposed to add URIs in their own databases, 
then that will take a very, very long time and is not realistic. It 
will not happen, or at any rate, by the time it does happen, the 
public will have moved even farther away from anything we make. Doing 
something now that is "quick and dirty" (and not so dirty, I suspect), 
plus relatively inexpensively, to provide the public with something 
that they may actually find useful, even though it may not be perfect 
and need some kind of updating, would certainly be far more practical 
than expecting the public to just wait until everybody adds the URIs.


Because it is clear that people will move on, farther away from us, 
and just ignore our tools.




Re: [RDA-L] Linked files

2011-04-25 Thread James Weinheimer

On 04/25/2011 04:27 PM, Jonathan Rochkind wrote:

I agree entirely, controlled headings from authority files ARE a sort 
of archaic version of identifiers and should be considered as such.


The thing is, that they aren't all that successful as identifiers in 
the modern environment.  For instance, just as the most obvious 
example, you NEVER want to _change_ an identifier.  Yet, our authority 
file headings sometimes get changed (from a rename of an LCSH heading, 
to adding a death date to an author).  Violates pretty much the first 
most basic rule of modern identifiers.


It's no surprise that an identifier system our community invented 
nearly a hundred years ago before computers really existed does not 
perform very well as identifiers in the present environment. But it's 
still the truth. I think you're absolutely right that we should 
understand these legacy controlled headings as a sort of identifier -- 
that will help us understand better how to use them and convert them 
in the modern environment. But important to remember they are a sort 
of ancient identifier system, which is ill-suited in several ways for 
the contemporary environment.



So long as the URI links unambiguously to the correct concept, it should 
not matter. In the new environment, it only makes sense that one 
conceptual resource could have many URIs. With the VIAF for example, we 
see how each heading is unambiguously linked in a variety of ways based 
on their own forms. In a correct system, the label that people see will 
be controlled by the searcher him or herself.


To see how it could work, look at dbpedia for Leo Tolstoy 
http://dbpedia.org/page/Leo_Tolstoy with all of the redirects. That is 
the dbpedia URI. So long as our superseded forms are handled in an 
unambiguous way, as they are now (with very few exceptions I think, 
essentially for forms that take on qualifiers, but I am sure these could 
be handled unambiguously too), the system should still work.
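
As a rough illustration of the redirect idea, here is a minimal Python
sketch. It assumes a lookup table in which every known heading string,
current or superseded, points at one record key. The headings, the key
"n0001" and the function name are hypothetical, and the sketch assumes
a heading string is never reused for a different entity, which is the
exception noted above for forms that take on qualifiers.

```python
# Hypothetical index: every heading form, current or superseded,
# redirects to a single record key. All values are made up.
heading_index = {
    "Doe, Jane, 1900-": "n0001",      # superseded form
    "Doe, Jane, 1900-1980": "n0001",  # current form
}

def resolve_heading(heading):
    """Return the record key for an exact heading match, or None."""
    return heading_index.get(heading)

old_form_key = resolve_heading("Doe, Jane, 1900-")
new_form_key = resolve_heading("Doe, Jane, 1900-1980")
unknown_key = resolve_heading("Doe, John")
```

Both forms resolve to the same key, so a database that still carries
the older heading would keep linking to the right record.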


I think it is important to get something to demonstrate ASAP. If we 
expect that everyone is supposed to add URIs in their own databases, 
then that will take a very, very long time and is not realistic. It will 
not happen, or at any rate, by the time it does happen, the public will 
have moved even farther away from anything we make. Doing something now 
that is "quick and dirty" (and not so dirty, I suspect), plus relatively 
inexpensively, to provide the public with something that they may 
actually find useful, even though it may not be perfect and need some 
kind of updating, would certainly be far more practical than expecting 
the public to just wait until everybody adds the URIs.


Because it is clear that people will move on, farther away from us, and 
just ignore our tools.


--
James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/


Re: [RDA-L] Linked files

2011-04-25 Thread Jonathan Rochkind
"We've got identifiers" -- do you mean the 1xx "preferred display forms" 
are identifiers?  That is what James was suggesting, and I was agreeing 
that they were a pre-computer-era form of identifier, and it is helpful 
to understand them as being that.


But they aren't good identifiers, at least for the computer era (which 
began decades ago) precisely because they can change -- I agree that as 
"preferred display forms" they need to be able to change. But as 
"identifiers" they ought not to change. Which is why the 
pre-computer-era approach of using the same string as both a "preferred 
display form" AND an "identifier" is no longer sufficient.  In the 
computer era, nobody ever (or very very rarely) uses the same string as 
both an "identifier" and a "preferred display form" for this reason.
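
To make the distinction concrete, here is a minimal Python sketch. It
assumes bibliographic records store only a record identifier and look
up the display form at display time. The class, the field names, the
identifier "n0001" and the headings are all hypothetical and are not
taken from any real authority file.

```python
from dataclasses import dataclass

@dataclass
class AuthorityRecord:
    record_id: str       # persistent identifier; should never change
    preferred_form: str  # display form; free to change

# Hypothetical authority file keyed by identifier.
authority_file = {
    "n0001": AuthorityRecord("n0001", "Doe, Jane, 1900-"),
}

# Bibliographic records carry the identifier, not the heading string.
bib_records = [
    {"title": "Example title A", "creator_id": "n0001"},
    {"title": "Example title B", "creator_id": "n0001"},
]

def display_creator(bib):
    """Look up the current preferred form at display time."""
    return authority_file[bib["creator_id"]].preferred_form

# Adding a death date touches one authority record and no bib records.
authority_file["n0001"].preferred_form = "Doe, Jane, 1900-1980"
labels = [display_creator(bib) for bib in bib_records]
```

After the change, both bibliographic records display the new form
while their stored identifiers are untouched.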


Jonathan

On 4/25/2011 11:09 AM, Adger Williams wrote:

I don't think I understand.
We've got identifiers.

We all do our authority updates by authority record numbers, which (by 
and large) don't change.


We do change 1xx forms, which one should perhaps think of as 
"preferred display forms", and I think it would be unwise to think the 
desire to change preferred display forms will go away.


So, I'm not sure what the "new" part of the new world of linked data 
would be here.


On Mon, Apr 25, 2011 at 10:27 AM, Jonathan Rochkind wrote:


On 4/22/2011 1:13 PM, James Weinheimer wrote:


There is another way of looking at our headings than solely as
textual strings, which is not entirely correct, but rather as
identifying something *unambiguously*. This is exactly what
our headings are designed to do. An identifier does not have
to be composed only of numbers, but any string. This is why I
have suggested reconsidering our headings *as* identifiers,
since catalogers have worked very, very hard for a long time
to keep them unique, or unambiguous.


I agree entirely, controlled headings from authority files ARE a
sort of archaic version of identifiers and should be considered as
such.

The thing is, that they aren't all that successful as identifiers
in the modern environment.  For instance, just as the most obvious
example, you NEVER want to _change_ an identifier.  Yet, our
authority file headings sometimes get changed (from a rename of an
LCSH heading, to adding a death date to an author).  Violates
pretty much the first most basic rule of modern identifiers.

It's no surprise that an identifier system our community invented
nearly a hundred years ago before computers really existed does not
perform very well as identifiers in the present environment. But
it's still the truth. I think you're absolutely right that we
should understand these legacy controlled headings as a sort of
identifier -- that will help us understand better how to use them
and convert them in the modern environment. But important to
remember they are a sort of ancient identifier system, which is
ill-suited in several ways for the contemporary environment.

Jonathan




--
Adger Williams
Colgate University Library
315-228-7310
awilli...@colgate.edu 


Re: [RDA-L] Linked files

2011-04-25 Thread Adger Williams
I don't think I understand.
We've got identifiers.

We all do our authority updates by authority record numbers, which (by and
large) don't change.

We do change 1xx forms, which one should perhaps think of as "preferred
display forms", and I think it would be unwise to think the desire to change
preferred display forms will go away.

So, I'm not sure what the "new" part of the new world of linked data would
be here.

On Mon, Apr 25, 2011 at 10:27 AM, Jonathan Rochkind wrote:

> On 4/22/2011 1:13 PM, James Weinheimer wrote:
>
>>
>> There is another way of looking at our headings than solely as textual
>> strings, which is not entirely correct, but rather as identifying something
>> *unambiguously*. This is exactly what our headings are designed to do. An
>> identifier does not have to be composed only of numbers, but any string.
>> This is why I have suggested reconsidering our headings *as* identifiers,
>> since catalogers have worked very, very hard for a long time to keep them
>> unique, or unambiguous.
>>
>
> I agree entirely, controlled headings from authority files ARE a sort of
> archaic version of identifiers and should be considered as such.
>
> The thing is, that they aren't all that successful as identifiers in the
> modern environment.  For instance, just as the most obvious example, you
> NEVER want to _change_ an identifier.  Yet, our authority file headings
> sometimes get changed (from a rename of an LCSH heading, to adding a death
> date to an author).  Violates pretty much the first most basic rule of
> modern identifiers.
>
> It's no surprise that an identifier system our community invented nearly a
> hundred years ago before computers really existed does not perform very well
> as identifiers in the present environment. But it's still the truth. I think
> you're absolutely right that we should understand these legacy controlled
> headings as a sort of identifier -- that will help us understand better how
> to use them and convert them in the modern environment. But important to
> remember they are a sort of ancient identifier system, which is ill-suited
> in several ways for the contemporary environment.
>
> Jonathan
>



-- 
Adger Williams
Colgate University Library
315-228-7310
awilli...@colgate.edu


Re: [RDA-L] Cataloging playaways

2011-04-25 Thread Mark Ehlert
Jonathan Rochkind  wrote:
> One idea is if perhaps the matching algorithm could use the new 3xx fields
> instead of the 300 "type of unit" free text.  Of course, that relies on the
> new 3xx fields using only controlled terms, which I'm not sure is the case
> (but should be!).

Assuming 3xx is limited to the 336-338, then the CMC types are already
set up as controlled terms.  Hence the prescribed "other" and
"unspecified" terms provided for all three lists.

-- 
Mark K. Ehlert                 Minitex
Coordinator                    University of Minnesota
Bibliographic & Technical      15 Andersen Library
  Services (BATS) Unit        222 21st Avenue South
Phone: 612-624-0805            Minneapolis, MN 55455-0439



Re: [RDA-L] Cataloging playaways

2011-04-25 Thread Jonathan Rochkind

On 4/22/2011 3:30 PM, Deborah Fritz wrote:

People *will* be entering free text as this RDA element, so I would like to
know whether anyone has figured out some way that matching algorithms will
be able to reliably match descriptions without the use of consistent terms
in this element.


No, and nobody will. It's essentially impossible, as I suspect you knew 
when you asked :), to "reliably match descriptions without the use of 
consistent terms in this element."


One idea is if perhaps the matching algorithm could use the new 3xx 
fields instead of the 300 "type of unit" free text.  Of course, that 
relies on the new 3xx fields using only controlled terms, which I'm not 
sure is the case (but should be!).


I had sort of been thinking (perhaps incorrectly?), that the new 3xx 
fields will be from controlled terms, and serve the purposes of machine 
matching and collocation, while the old 300 will be for user display 
only. It is not an unusual pattern in our data to have controlled fields 
corresponding with transcribed fields (although in this case it's not 
transcribed exactly, but still perhaps a display field rather than a 
controlled/collocating field).
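
A small Python sketch of that pattern, under the assumption that each
record carries a free-text display field and a separate controlled
field. The dictionary keys, the sample values and the short term list
are illustrative only; they are not real MARC data or the full RDA
carrier list.

```python
# Illustrative subset of a controlled list; "other" and "unspecified"
# stand in for the catch-all terms such lists provide.
CONTROLLED_CARRIERS = {"audio disc", "other", "unspecified"}

# Two made-up descriptions of the same kind of item.
record_a = {"300_display": "1 sound disc", "338_carrier": "audio disc"}
record_b = {"300_display": "1 CD", "338_carrier": "audio disc"}

def is_controlled(record):
    """True if the carrier term comes from the controlled list."""
    return record["338_carrier"] in CONTROLLED_CARRIERS

def match_on_display(a, b):
    """Compare the free-text display strings exactly."""
    return a["300_display"] == b["300_display"]

def match_on_carrier(a, b):
    """Compare controlled terms; only valid terms may match."""
    return (is_controlled(a) and is_controlled(b)
            and a["338_carrier"] == b["338_carrier"])

display_match = match_on_display(record_a, record_b)
carrier_match = match_on_carrier(record_a, record_b)
```

The free-text comparison fails on wording alone, while the controlled
comparison matches, which is the split between display and collocation
described above.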


Does that make any sense?

Jonathan


Re: [RDA-L] Linked files

2011-04-25 Thread Jonathan Rochkind

On 4/22/2011 1:13 PM, James Weinheimer wrote:


There is another way of looking at our headings than solely as textual 
strings, which is not entirely correct, but rather as identifying 
something *unambiguously*. This is exactly what our headings are 
designed to do. An identifier does not have to be composed only of 
numbers, but any string. This is why I have suggested reconsidering 
our headings *as* identifiers, since catalogers have worked very, very 
hard for a long time to keep them unique, or unambiguous. 


I agree entirely, controlled headings from authority files ARE a sort of 
archaic version of identifiers and should be considered as such.


The thing is, that they aren't all that successful as identifiers in the 
modern environment.  For instance, just as the most obvious example, you 
NEVER want to _change_ an identifier.  Yet, our authority file headings 
sometimes get changed (from a rename of an LCSH heading, to adding a 
death date to an author).  Violates pretty much the first most basic 
rule of modern identifiers.


It's no surprise that an identifier system our community invented nearly 
a hundred years ago before computers really existed does not perform very 
well as identifiers in the present environment. But it's still the 
truth. I think you're absolutely right that we should understand these 
legacy controlled headings as a sort of identifier -- that will help us 
understand better how to use them and convert them in the modern 
environment. But important to remember they are a sort of ancient 
identifier system, which is ill-suited in several ways for the 
contemporary environment.


Jonathan


Re: [RDA-L] Linked files

2011-04-25 Thread Jonathan Rochkind

On 4/21/2011 7:27 PM, J. McRee Elrod wrote:

Karen Coyle said:


Linking is not the same as using identifiers rather than text strings
for entities, although both are considered "best practices" and
linking depends greatly on clear identification.

So these identifiers link to *inhouse* files? "Shakespeare" once
rather than repeated as author, added entry, and subject, in multiple
bibliographic records?


Both. The identifiers are _global identifiers_, which link to a file 
that would be known by the same name by anyone sharing the same 
authority file.  However, you would very likely want to mirror those 
files locally for performance and reliability.
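
One way to picture "global identifier plus local mirror" is the Python
sketch below. It assumes a resolver that checks a local copy first and
only falls back to the shared file when the record is missing. The
example.org URI, the function names and the stand-in fetch function
are hypothetical; no real authority service is being called.

```python
def make_resolver(local_mirror, fetch_remote):
    """Build a resolver that prefers the local mirror."""
    def resolve(identifier):
        if identifier in local_mirror:
            return local_mirror[identifier]
        record = fetch_remote(identifier)  # fall back to shared file
        local_mirror[identifier] = record  # keep a local copy
        return record
    return resolve

# Stand-in for a network lookup; it records each call it receives.
remote_calls = []

def fake_fetch(identifier):
    remote_calls.append(identifier)
    return {"id": identifier, "preferred_form": "Example heading"}

mirror = {}
resolve = make_resolver(mirror, fake_fetch)
uri = "http://example.org/authority/n0001"
first = resolve(uri)   # fetched from the shared file
second = resolve(uri)  # served from the local mirror
```

The identifier stays the same for everyone sharing the file; only the
place the record is read from differs.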