Re: [RDA-L] Apocrypha

2011-11-11 Thread Jim Weinheimer
On Thu, Nov 10, 2011 at 3:22 PM, Armin Stephan wrote:
snip

  The work Genesis is the work genesis. I see no need for any
 qualifier at all.

 (AACR cataloguers tend to qualify everything. German cataloging tradition
 shows that it is possible to use fewer qualifiers.)

/snip

I would just like to point out the Wiki disambiguation page for Genesis:
http://en.wikipedia.org/wiki/Genesis

As I have pointed out before, the disambiguation pages of Wikipedia are one
area where we can see a huge improvement over our traditional library
tools. I can't imagine anybody preferring our methods to a page like this.

Still, even they add several qualifiers.

-- 

James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules


Re: [RDA-L] Offlist reactions to the LC Bibliographic Framework statement

2011-11-09 Thread Jim Weinheimer
 On 08/11/2011 22:15, Jonathan Rochkind wrote:
snip

Kind of off topic, but curious why you don't think relator codes are the
right thing to do. If we're listing 3 or 5 or 10 people or entities
'responsible' for an artistic work, why wouldn't we want to be able to say
the nature/role of each entity's responsibility? Or, if we do, but relator
codes are a poor device for this, why?

/snip

I answered this in another posting, which can be found here:
http://catalogingmatters.blogspot.com/2011/03/re-question-about-rda-title.html

While I have nothing against the relator codes *in theory*, I think there
are serious practical barriers. Entering the relator codes entails
additional work for catalogers, and some cases will not be simple. More
important, there is the serious problem of legacy data. If catalogers had
been adding the relator codes all along, that would be one thing, but the
decision was made back then not to add them. We must admit that those
records will not be updated.

Therefore, looking at the situation from the *patron's point of view*,
patrons will still--always--have to check and recheck every single citation
generated from a library catalog, because there may be editors, compilers
and others who must be cited as such. I see this leading to tremendous
confusion and anger. Remember, these are the same people who are not
supposed to be able to understand abbreviations such as p. and et al.
(except in citations, of course!).

I don't think it is wise to promise more than we can deliver.

-- 
James Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/


Re: [RDA-L] Offlist reactions to the LC Bibliographic Framework statement

2011-11-09 Thread Jim Weinheimer
 On 08/11/2011 22:21, J. McRee Elrod wrote:
snip

See Chicago Manual of Style 14th ed. 16.35-38. Up to three authors may be
given, but only the first is given in inverted order. Sounds like a main
entry to me. One has to choose one to invert. Beyond three, only the first
is given. (Entry under first of more than three is closer to RDA than
AACR2, but like AACR2 in substituting et al. for additional authors.) Am
I the only one old enough to remember more than one author at the top of
the unit card? But *one* was first.

/snip

Well, I beg to differ, since I don't see mere inversion of the name that
happens to be first on an item as equivalent to the selection of a main
entry. Everyone on this list is fully aware that the rules for a single
main entry are terribly complex. The same thing happens when you have
four, five, or more names.

Certainly, *in a bibliographic citation* a single one of all the authors
has to come first, but not in a computerized catalog where displays are (or
can be) much more fluid. Articles can get wild, e.g.
http://www.sciencemag.org/content/291/5507/1304.short
Who wants to trace all of them?! Yet, in the bibliographic citation entry
for this item, it would be the first three to seven authors, with the first
one inverted. Who can maintain that the first person here is equivalent to
a *single main entry*? In the future, I would predict that monographs
(whatever form they take) could very possibly approach this level of
complexity.

In any case, there is no reason why Johnson should be treated subordinately
to Masters, except to maintain our old practice of a single main entry.
Many bibliographic databases do just fine without the concept of a single
main entry. Look at Amazon's record with three authors:
http://www.amazon.com/Masters-Johnson-Sex-Human-Loving/dp/0316501603/ref=sr_1_1?ie=UTF8&qid=1320790524&sr=8-1
If you look at the cover in "Look Inside" (I can't see the t.p.),
Masters is first, but in the citation Kolodny is first. In the CIP,
Masters retains main entry. Dublin Core also avoids a single main entry.

Why continue this practice when there are three equal authors or more? I
freely agree that the practice made sense in a card or printed catalog,
but in a database, matters are completely different.

If we could get rid of those complex rules, cataloging would become
simplified a bit and access would remain the same if not improved.

Still, I realize that I cannot convince you of this, so we can agree to
disagree. Yet, wouldn't it be great to at least allow the possibility of
something like this? In ISO2709, allowing for such a possibility would be
terribly difficult, but in XML, as I tried to show, it is almost child's
play.

-- 
James Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/


Re: [RDA-L] Offlist reactions to the LC Bibliographic Framework statement

2011-11-08 Thread Jim Weinheimer
On Tue, Nov 8, 2011 at 7:01 AM, Hal Cain wrote:
snip

 However, once I began to see how competent systems handled MARC, it became
 plain that what they were doing was basically to create a matrix and
 populate it with the tag values, the indicator values, and the subfield
 data prefixed by the subfield code.  Then the indexing routines read the
 matrix (not the raw MARC ISO2709 data) and distributed the data into the
 appropriate areas of the system's internal table structure.  From those
 tables, I was able, when required, to obtain what I wanted by direct query
 on the appropriate part of the database. When it was necessary to export a
 single MARC record, a group of them, or indeed the whole database, the
 system had routines which reversed the process (and, last of all, counted
 the number of characters in order to fill in the record length element of
 the MARC leader). This was extremely burdensome to programmers who came to
 the game in the 1990s and had no background in early data processing,
 chiefly of text rather than numbers, but in its time it was pure genius.
 Nowadays it's a very special niche, and the foreignness to programmers and
 designers of the processes involved probably plays a part in keeping us
 from having really good cataloguing modules and public catalogues; and I
 can understand the frustration entailed for those who expect to interrogate
 a database directly.

 Bear in mind, though, that using a modern cataloguing module (Horizon is
 the one I'm most familiar with), I can search for a record on a remote
 system, e.g. the LC catalog, through Z39.50, and have the record on my
 screen, in editable form, in a second or two, indistinguishable from a
 record in the local database. The system's internal routines download the
 record in MARC format (ISO 2709, hated by Jim) and build the matrix which
 feeds the screen display.

/snip

This is really a nice description of the problems of ISO2709, Hal. Thanks a
lot.
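
To make that description concrete, here is a minimal Python sketch of the
round trip: reading an ISO2709 record into a "matrix" of tag, indicators and
subfields, and writing it back out, filling in the record length last. It is
only an illustration under stated assumptions: the function names
(parse_record, build_record) and the default leader are invented for the
example, and it is not taken from Horizon or any other real system.

```python
# Illustrative sketch only: ISO 2709 bytes -> "matrix" rows -> ISO 2709 bytes.
# parse_record / build_record are hypothetical names, not a real library API.

FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
SF = b"\x1f"  # subfield delimiter


def parse_record(raw):
    """Return rows of (tag, indicators, [(subfield code, data), ...])."""
    base = int(raw[12:17])          # leader 12-16: base address of data
    directory = raw[24:base - 1]    # the directory ends with a field terminator
    rows = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])    # field length, terminator included
        start = int(entry[7:12])    # offset from the base address
        field = raw[base + start:base + start + length - 1]
        if tag < "010":             # control field: no indicators or subfields
            rows.append((tag, "", [("", field.decode("utf-8"))]))
        else:
            indicators = field[0:2].decode("ascii")
            subfields = [
                (chunk[0:1].decode("ascii"), chunk[1:].decode("utf-8"))
                for chunk in field[2:].split(SF)[1:]
            ]
            rows.append((tag, indicators, subfields))
    return rows


def build_record(rows, leader_template=b"00000nam a2200000 a 4500"):
    """Reverse the process; the record length is computed last of all."""
    directory = b""
    data = b""
    for tag, indicators, subfields in rows:
        if tag < "010":
            body = subfields[0][1].encode("utf-8")
        else:
            body = indicators.encode("ascii") + b"".join(
                SF + code.encode("ascii") + value.encode("utf-8")
                for code, value in subfields
            )
        body += FT
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base = 24 + len(directory) + 1
    total = base + len(data) + 1
    leader = (
        f"{total:05d}".encode("ascii") + leader_template[5:12]
        + f"{base:05d}".encode("ascii") + leader_template[17:24]
    )
    return leader + directory + FT + data + RT
```

Every offset in the directory depends on the byte length of everything before
it, which is why the format is awkward to edit in place and why systems build
the matrix first.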

I would like to clarify one point however: do I hate ISO2709 format? I can
answer that honestly: no. It served its purpose well for the environment it
was born into. That environment changed however, and we need to face up to
that. If our modern systems (i.e. modern web browsers) worked with the
ISO2709 format, i.e. the files that the machine actually receives, then I
would be all for it.

Yet, this is not the reality of the situation. Browsers work with a variety
of formats, and XML is one of them, which gives us some options. Browsers
do not work with ISO2709, and I don't believe they ever will. Therefore,
the only systems that can work with ISO2709 records (which is how libraries
exchange their cataloging information) are other catalogs, and that
automatically restrains us from participating in the wider information
universe. As a result, in my own opinion, hanging on to ISO2709 borders on
the irrational since we automatically limit the utility of our records,
thereby limiting ourselves.

MARCXML has many limitations that I won't discuss here, but *at least* it
is in XML which *can* be used in the new environment. It is much more
flexible than ISO2709. For instance, I have mentioned before that I believe
we should get away from a *single* main entry--that while a single main
entry made sense in the card catalog, it makes no sense in a computerized
catalog. Others disagree, but no matter what, I think it is vital that we
should have that kind of flexibility.

Getting rid of a *single* main entry would be the equivalent of DC's
creator and contributor where creator is repeatable, thereby creating
multiple main entries. It turns out this is much more difficult than merely
making 1xx repeatable, since you also have to allow it in the 6xx, 7xx and
8xx, for books *by* Masters and Johnson, for books *about the books*
written by Masters and Johnson, for analytical and series treatments as
well.

You could do this without too much difficulty in XML, even in MARCXML, but
in ISO2709, it would be a relative nightmare because you would have to
rework the entire structure, from the directory on down. (This is why the
MARCXML principle of roundtripability--what a word!--needs to be dropped.
Otherwise, we remain trapped in the ISO2709 format anyway!) And while
it may be possible to rework ISO2709 to such an extent, would it be
worthwhile to do it on such an old format?
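
As a rough illustration of why this is easy in XML, here is a minimal Python
sketch of a record with a repeatable creator in the Dublin Core style. The
record wrapper element and the creators() helper are assumptions made for the
example, and the title is only inferred from the Amazon link in the earlier
message; the point is simply that no author is structurally subordinate to
another.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical record: the <record> wrapper is invented for this sketch.
record_xml = f"""<record xmlns:dc="{DC}">
  <dc:title>Masters and Johnson on Sex and Human Loving</dc:title>
  <dc:creator>Masters</dc:creator>
  <dc:creator>Johnson</dc:creator>
  <dc:creator>Kolodny</dc:creator>
</record>"""


def creators(xml_text):
    """Return every creator, in document order, with no single 'main' one."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(f"{{{DC}}}creator")]


print(creators(record_xml))
```

Adding a fourth or fortieth creator means adding one more element; nothing
like a directory has to be recomputed.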

This is just one example of the relative inflexibility of ISO2709, but
there are many more.

Still, I don't hate ISO2709. It served its purpose admirably, but it's like
the horse and buggy. I'm sure nobody hated horses and buggies after the
automobile came out, but eventually, if it turned out that Dad and Grandpa
refused to get a car when everybody else had one and the advantages were
plain for all to see, Junior very possibly would have wound up hating the
horse and buggy he was forced to use.
-- 

James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/

Re: [RDA-L] Offlist reactions to the LC Bibliographic Framework statement

2011-11-07 Thread Jim Weinheimer
On Mon, Nov 7, 2011 at 8:00 AM, Bernhard Eversberg wrote:
snip

 Jim, ISO2709 is a nuisance, agreed. And I dislike it no less than you
 do because I'm a real programmer and know what it feels like.
 But don't let's get carried away and rush to premature conclusions with
 inappropriate metaphors. Rather, consider this:
 Would you tear down your house and rebuild it from the ground up
 if the old wallpaper gives you the creeps?

 For that's what ISO2709 is: mere wallpaper. Easily replaced or painted
 over. Nothing serious, nothing that affects any qualities of the building.

/snip

I wish that were true. ISO2709 is the standard way libraries exchange their
records, and this means that anybody who wants library information must
work with ISO2709. ISO2709 was designed to make catalog cards, and that is
what it still does today, only the cards are not printed on card stock,
they are printed on the computer screen. Certainly, they can be searched in
ways different from a card catalog, but this is because of the mere fact
that they reside in the computer--not because the format is any more
amenable to searching.

Today, most web developers I know do not want to copy, reformat and
maintain duplicates of records that live on different systems. They would
much rather interoperate with them, and they can do this through
various APIs. For instance, I can add a Google Books API that will
search--in the background--Google Books in all kinds of ways and return one
record, or multiple records. It does not give me the entire Google metadata
record (as ISO2709 does--by definition), nor do I want it. I want to work
with the Google metadata *on the fly* so that I do not have the
responsibility to keep the record current, reformat it and do all
kinds of additional work. Keeping the record current is Google's
responsibility--not mine--and I shouldn't have to do it.

With ISO2709, it is designed to transfer a complete catalog record from one
catalog into another catalog. It is not designed for interactivity. Here is
a practical example. LC has lots of sets of records that you can interact
with: http://memory.loc.gov/ammem/oamh/oai_request.html. So, I
could have a local catalog on e.g. dance, and I could search behind the
scenes--if I set the machine correctly--the records for LC's dance
instruction manuals. I can display these records as I wish because they are
in XML. I would not have to download all the records in ISO2709, convert
them in MARCEdit, and put them into my own database, where I would then have
to keep up with URLs and other information that may change in the future.
Maintaining records for materials on the web is potentially a ton of work.
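
To give a flavor of what working "on the fly" looks like, here is a minimal
Python sketch, assuming the service speaks OAI-PMH (as the LC page suggests)
and returns simple Dublin Core. The base URL, the set name "dance" and the
sample response are all invented for the example; only the request parameters
(verb, metadataPrefix, set) and the namespaces come from the OAI-PMH and
Dublin Core standards.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, set_spec, prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request for one set of records."""
    query = urlencode(
        {"verb": "ListRecords", "metadataPrefix": prefix, "set": set_spec}
    )
    return f"{base_url}?{query}"


def titles(response_xml):
    """Pull every dc:title out of a response, wherever it is nested."""
    root = ET.fromstring(response_xml)
    return [element.text for element in root.iter(f"{DC}title")]


# Hand-written stand-in for a server response, so the sketch runs offline.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical dance instruction manual</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.org/oai", "dance"))
print(titles(SAMPLE))
```

Nothing is copied into the local database here: the remote system stays
responsible for the record, and the local system only decides how to display
what comes back.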

Another example is the Worldcat Search API
http://www.worldcat.org/affiliate/tools?atype=wcapi. There is no mention of
ISO2709 there. Plus, I implemented the Worldcat Citations API when I was at
AUR:
http://www.oclc.org/developer/documentation/worldcat-search-api/formatted-citations
and an example:
http://www.galileo.aur.it/cgi-bin/koha/opac-detail.pl?bib=24135. In the
right-hand column, you will see "Get a Citation". When you click it, you
will see citation formats (in XML, not ISO2709) taken on the fly from
Worldcat and reformatted by the system I created. This is a simple example,
and matters could become much more complex if someone desired.

The fact is, most developers want to work with APIs in these kinds of ways
instead of having to download, convert (mostly an extremely difficult job
to come out with anything coherent), upload into your own system, and then
maintain those records. That is horribly inefficient, and unnecessary,
today.

Why don't more developers work with library metadata? To me, the answer is
absolutely obvious. We are not making APIs that developers want to work
with, and one reason is that we keep maintaining that if somebody wants our
information badly enough, it is easy to work with ISO2709 records by
downloading, reformatting, etc. That is wrong. Working with APIs is
what is easy, and if you use ISO2709 you absolutely cannot do that.

Developers don't want--or need--to jump through all of those hoops when
they don't have to, and they prefer to work with other systems. So they
don't use our records and prefer, e.g. Amazon, which has all kinds of APIs.

Unfortunate. But perhaps it is something that the Bibliographic Framework
will address and our metadata will be more usable in the information
universe.

-- 

James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules


Re: [RDA-L] Offlist reactions to the LC Bibliographic Framework statement

2011-11-07 Thread Jim Weinheimer
On Mon, Nov 7, 2011 at 10:21 AM, Bernhard Eversberg wrote:
snip

 Jim, my point is, in nuce:
   Yes, MARC is horrible, but ISO is not the reason.

 You wrote:


 With ISO2709, it is designed to transfer a complete catalog record
 from one catalog into another catalog.



  Yes, but Web services on any MARC based catalog need not suffer
 from that, Web services can be constructed without paying any attention
 to the ISO structure. I said that much in my post. It is regrettable
 that up until now we still have not many useful web services as part
 of library OPACs. But the reason for this is certainly not ISO2709.

/snip

Have you ever seen or heard of a web service based on ISO2709? What then
will be the purpose of ISO2709 except one: to transfer a catalog record
from one library catalog to another?

But this now appears to be the second aspect of MARC, which is what most of
the discussion is about: not ISO2709 itself, but the coding, e.g.
100b, 300c and so on. In one sense, this is much less of a problem because
we are talking about mere computer codes, and those codes can display
however someone wants them to display.

So, when developers say that they don't like MARCXML, this is a lot of what
they are talking about, since they want and expect the coding to say "title"
and "date of publication", and they don't want to look up what 245a or 300c
means. (There are also the codes that must be dug out of the fixed fields,
such as the type of dates and dates in the 008, the language code, etc., but
that is yet another matter.)
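
As an illustration of what "digging out" means, here is a minimal Python
sketch that slices a few values out of an 008 field by character position.
The positions follow the MARC 21 bibliographic format as I understand it
(06 type of date, 07-10 and 11-14 the two dates, 15-17 place, 35-37
language); the function name, the dictionary keys and the sample field are
invented for the example.

```python
# Only two of the date-type codes are listed here, for illustration.
DATE_TYPES = {
    "s": "single known date/probable date",
    "m": "multiple dates",
}


def decode_008(field):
    """Slice a 40-character 008 field into a few named values."""
    if len(field) != 40:
        raise ValueError("an 008 field must be exactly 40 characters long")
    return {
        "date_entered": field[0:6],
        "date_type": DATE_TYPES.get(field[6], field[6]),
        "date1": field[7:11],
        "date2": field[11:15],
        "place": field[15:18],
        "language": field[35:38],
    }


# Hypothetical 008, assembled piece by piece so the positions are visible.
sample_008 = (
    "111107"      # 00-05 date entered on file
    + "s"         # 06    type of date
    + "2011"      # 07-10 date 1
    + "    "      # 11-14 date 2 (blank)
    + "xxu"       # 15-17 place of publication
    + " " * 17    # 18-34 material-specific positions, left blank here
    + "eng"       # 35-37 language
    + " "         # 38    modified record
    + "d"         # 39    cataloging source
)

print(decode_008(sample_008))
```

A developer who expects an element called "date of publication" has to know
all of these positions, and what a code like "s" means, before getting a
usable date.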

Of course, we run into the problem of library jargon here, since 245a is
not "title" but "title proper"; not only that, it includes the
alternative title, plus it includes individual titles when an item lacks a
collective title. There may be some more nuances as well. Therefore, 245a
implies separate access to a lot of other types of titles. Non-cataloger
developers cannot be expected to know or understand any of this. So, if the
format codes it as "title", that is misleading, while if it is coded as
"titleProper", developers will just think it's a weird name for a title.

This is complicated and at the moment I don't know how it can be solved.
Perhaps our traditional library distinctions will disappear in the new
environment, but I hope not.

-- 

James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules


Re: [RDA-L] Offlist reactions to the LC Bibliographic Framework statement

2011-11-07 Thread Jim Weinheimer
On Mon, Nov 7, 2011 at 11:05 AM, Bernhard Eversberg wrote:
snip

 But be that as it may, my point is that
 even for this function, it is no longer technically necessary.
 For all intents and purposes, MARC may live on forever without
 the need to deal with ISO2709. It is technically obsolete, but we
 need not care.

/snip


Perhaps it will live on as one developer described, when last week at lunch
we were discussing the old days of the ISO2709 format for AGRIN3 data
that he (and I and everybody) had to work with before we all changed it to
XML.

He mentioned that he keeps the specifications in a drawer of his desk as a
memento mori. Once in a while he takes them out just to gaze upon and to
remind himself of other realities!

-- 

James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules


Re: [RDA-L] NISO offers itself as the standards body for future format

2011-11-03 Thread Jim Weinheimer
On Thu, Nov 3, 2011 at 8:45 AM, Bernhard Eversberg wrote:
snip

 Help with the creation of a new format would be great. What the library
 world needs here is, of course, an indefinite term commitment.
 And what we also need is a free and open standard, or else we can
 forget everything about opening up to other communities and freeing
 our data in the web for everybody to use. Libraries are there to
 make recorded knowledge universally available and useful. To assist
 this, today, they have to make their data universally available
 and useful, and with that huge body of data, the conventions that
 constitute its foundation. What we have instead is one not universally
 open entity in control of the data and another one in possession of the
 rules. Now, the format is to go into custody of a third?

/snip

Good point. I had simply assumed that what they make would be free. It
appears that they do make their standards available for free, e.g. the
Digital Talking Book Standard at
http://www.niso.org/workrooms/daisy/Z39-86-2005.html. They also say
explicitly that they are available at no cost: "All NISO standards are
protected by copyright. NISO standards can be downloaded and reproduced for
noncommercial purposes only. NISO standards cannot be translated, modified,
redistributed, sold or repackaged in any form without the prior permission
of NISO." http://www.niso.org/standards

Still, this needs to be made very clear. For instance, I can imagine
libraries--and individual libraries--wanting to add their own namespaces to
whatever NISO would make, so the word "modified" would have to be
considered carefully. Plus, the restriction on translation makes me
hesitant, although I understand it.

-- 

James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules


Re: [RDA-L] Radical proposal for RDA inclusions

2011-10-28 Thread Jim Weinheimer
On Fri, Oct 28, 2011 at 7:41 AM, Bernhard Eversberg wrote:
snip

 I see two big issues here (among many more lesser ones) that should not be
 taken too lightly:

 1. MARC as input standard has made sure that it was (more or less) the
   same everywhere. Someone trained at X could go to work at Y
   immediately without a lot of retraining.

 2. Dealing with raw data at the person-machine interface of data
   input has at least two advantages:
   -- Directness: What you see is what you get, no layers of
  transformation and interpretation between you and the data.
   -- Ease of human communication: The format became the very language
  of catalogers' talk about the data; precise, succinct,
  unambiguous, international (numbers, not words!). Just listen
  in on any AUTOCAT discussion.

 For all the flaws of MARC, these are great advantages.
 Considering what modern systems can do, there could be any number of
 highly convenient but widely different input systems. As soon as two
 different ones are adopted at X and Y, points 1. and 2. are both lost.
 And then, modern input systems will evolve, they will change over
 time, get refined, modified, replaced by new designs. What will that
 mean for the productivity of the cataloging workforce? And how are
 they going to talk on AUTOCAT, for instance?

/snip

As always, you ask some great questions and I certainly don't have any
answers.

Even catalogers don't work with the raw data format of MARC (don't worry, I
won't begin my ISO2709 diatribe again!); they are looking at a formatted
display. Taking this further, the display catalogers work with could easily
show human-language explanations instead of numbers, as many catalogs do
now, since they often show the field/subfield along with the description.
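
A minimal Python sketch of such a display follows. The label table, the
function name and the sample rows are assumptions made for the example (the
wording of the labels in particular is only illustrative); the idea is that
the same stored codes can be shown with or without the numbers.

```python
# Illustrative label table: (tag, subfield code) -> human-language label.
LABELS = {
    ("100", "a"): "Personal name",
    ("245", "a"): "Title proper",
    ("245", "c"): "Statement of responsibility",
    ("260", "c"): "Date of publication",
}


def labelled_display(rows, show_codes=True):
    """Format (tag, code, value) rows, optionally keeping the MARC codes."""
    lines = []
    for tag, code, value in rows:
        label = LABELS.get((tag, code), "Unlabelled")
        prefix = f"{tag}${code} " if show_codes else ""
        lines.append(f"{prefix}{label}: {value}")
    return lines


rows = [("245", "a", "Genesis"), ("260", "c", "2011")]
for line in labelled_display(rows):
    print(line)
```

With show_codes=True the cataloger keeps the precise numbers alongside the
words; with show_codes=False the same data reads as plain labels.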

Still, the numbers for the fields and subfields allow a degree of almost
scientific accuracy when discussing catalog issues that I don't think can be
easily replicated in human language.

*Perhaps* the new RDF coding could be the solution, but at least to me, the
very idea of catalogers speaking in RDF triples somehow brings to mind
images from some of the wilder scenes of sci-fi/terror movies or the show
The X Files. It's enough to give me the shivering horrors!

-- 

James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules


Re: [RDA-L] Super MARC to code RDA?

2011-09-30 Thread Jim Weinheimer
On Fri, Sep 30, 2011 at 12:42 AM, Kevin M Randall wrote:
snip

 By one catalog, are you referring to that little thing I keep bringing
 up, the Ex Libris Voyager system?  That is one product, but many thousands
 of catalogs around the world.  (Including a catalog at this quaint place
 you may have heard of, the Library of Congress.)

 Why the records are stored (and used) in Voyager in that format, I don't
 know for sure.  But I can only assume it is because that happens to be the
 most efficient way of using system resources.  Yes, there are many other
 tables to support indexing, but the full bib record exists ONLY in the MARC
 21 ISO2709 format (albeit placed within a field of an Oracle table, and
 sometimes broken up into multiple rows of the table, depending on length of
 the ISO2709 string).  When exporting, the records are not recompiled but
 are rather copied directly from the ISO2709 strings.

 Actually, I would be rather surprised if it turns out that Voyager is alone
 among the major players.

/snip

I have no special love or hatred for the ISO2709 version of the record. I
honestly couldn't care less. We could retain it for our communications
format IF we could show that it is as flexible as XML and also, widely
utilized by the different software developers out there. I don't see that
happening with ISO2709. Plus, how a specific database/catalog wants to store
its information internally is a matter of practically no concern to
catalogers but is a concern to the database designers. Catalogers should
care that the system stores and retrieves everything reliably.

snip
While I also agree that numbers and other language-neutral tags have their
advantages, I really don't think it's necessary to have them in a new
metadata carrier.  If things are done right this time around, catalogers
will NOT, NOT, NOT be working with records in the native language of the
metadata carrier.  Just as there is absolutely no excuse for requiring
catalogers in this day and age to have to work with MARC tags, indicators,
and subfield codes, there should be absolutely no excuse for requiring them
to work with constructs such as (to quote an example from Diane Hillmann's
Getting Real with RDA presentation):

   <rdarole:author>http://lcnaf.info/79062641</rdarole:author>

That is how it might look behind the scenes, but the cataloger should NEVER
have to see this unless it's explicitly asked for!  But if that's what
catalogers end up being given to work with, then I will really be convinced
that systems vendors really do have the utmost contempt for catalogers...
/snip

I don't know if I agree with this. With codes and numbers, everybody knows
exactly what it all means. With words, it gets messy. For instance, I
discovered that in iTunes U people can add metadata:
http://tinyurl.com/5v3gz5p. If you look at it, you find a table of
suggested uses for the fields, and one is highly interesting:

Name: Track title, for example, Easter Island and Darwin or Digital
Storytelling

Name???!!! And then it immediately says "Track title". And if this makes
little sense to us, imagine someone with very little English trying to
figure it out! Standards demand rigor, and the reality is that much of it
winds up being communicated in what seems, to an untrained person, to be
gibberish, since the purpose is to communicate very precisely.

So, while I don't care about the coding, be it in words or numbers or
musical notes--it's just computer codes, after all, and literally the same to
the computer!--I do care very much about how people interpret those codes.
Someone who sees 245$a will be forced to look it up and find
"Title proper", which they will not understand, and then they will have to
look up what a title proper is, on the way learning about alternative
titles, uniform titles and all kinds of other titles that the non-librarian
does not know about. After iTunes U, how will people interpret "Name"?

Yet, as I said, I fought the good fight to try to get people to retain the
numbered fields and subfields, but gave up. The method of communication will
be in words. One of my concerns with words is a vision I have had that
each language group will eventually want to free itself from the English
language and very logically demand the equality of its own language.
Then, all these versions will be made, and the final situation will be just
as bad as or worse than all the versions of MARC.

Oh well, I lost that one.


-- 

James L. Weinheimer  weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules


Re: [RDA-L] libraries, society and RDA

2008-11-03 Thread Jim Weinheimer
Casey Mullin wrote:

I am encouraged at where this thread has turned this evening...

Shawne's comments about continuing to create catalogs are apt. What I've
come to realize in the past few years is that it's not the fundamental
intellectual activity of catalogers which is in danger of obsolescence.
Rather, the real liability to the continued relevance of our profession
(IMHO) is the gross duplication of effort which goes on in the library
community. Publishers and vendors maintain large databases, and even users
are creating robust metadata. Yet, with all of this redundant data in the
universe, catalogers still spend so much time transcribing and following
meticulous rules for doing so. I believe RDA is an attempt to move away from
this redundancy. Indeed, in the information universe now and into the
future, we should focus less on our abbreviation and punctuation, and free
our minds and our energies for the *real* work (again, IMHO) of cataloging:
creating *relationships* and putting resources and collections in context.
--



If the final product from RDA is that redundancy is greatly reduced and
that publishers and vendors provide us with more of the basic
information, then I am 100% in favor. The problem is, if publishers are
reluctant to provide AACR2-quality metadata, I don't see at all how or why
they will be more willing to provide the library community with RDA-quality
metadata. RDA is not simpler--in fact, there are not that many major changes
that I have seen--and its language is still complex: although it may not be
library-cataloger jargon, it is still the dense jargon of information
technologists.



Catalogs will always exist, although we may not recognize them as such. They
may be poor, incomplete and difficult to work with, but so long as you have
a collection, there must be some kind of guide into it. Google is a catalog
in this sense. It works differently from a traditional library catalog, but
it does allow people to find some materials. It doesn't cover 100% of the
web and it is very quirky, but without something like Google, the web would
be completely useless--just as useless as a huge library without a catalog.
There have been some strange catalogs in the past as well (one ancient one
was written in verse, if memory serves), including some produced by
libraries. I remember examining one old catalog where, if there was no clear
author to assign main entry, the cataloger wrote the book record into a
separate section, "Anonymous works", which had thousands of records in
random order! Totally useless! I guess the cataloger couldn't figure out
title entry or corporate entry (until Panizzi, I believe).



But the final sentence above captures the essence very well: we must focus
on the real work, which is to accurately describe and organize the materials
in the collection in all sorts of ways. The concept of the collection has
changed already for our users and it must change for us as well.



James Weinheimer  [EMAIL PROTECTED]
Director of Library and Information Services
The American University of Rome
via Pietro Roselli, 4
00153 Rome, Italy
voice- 011 39 06 58330919 ext. 258
fax-011 39 06 58330992





Re: [RDA-L] FRBR user tasks (was: Alternatives to AACR2/MARC21?)

2008-10-24 Thread Jim Weinheimer
-Original Message-
From: Resource Description and Access / Resource Description and Access
[mailto:[EMAIL PROTECTED] On Behalf Of Rhonda Marker
Sent: Thursday, October 23, 2008 4:29 PM
To: RDA-L@INFOSERV.NLC-BNC.CA
Subject: Re: [RDA-L] FRBR user tasks (was: Alternatives to AACR2/MARC21?)

I've been lurking since the list began, but will dart out into the open
just this once.

You ask what difference it will make to try to bring both precision and
recall to searches (my vocabulary, not yours). For some tasks, such as
finding enough to write an undergraduate essay, perhaps it adds very
little beyond what a simple keyword search would accomplish. For other
tasks, a more comprehensive result is vitally important-- for graduate
level research, for many STM (science-technology-medicine) topics, and
I'm sure others could give many more categories of things and even
specific instances. I guess my point is that the General Searcher is not
our only user. Having a model like FRBR helps us organize our efforts so
that whatever our resources allow us to do, we can do in a purposeful,
concerted way.
---

I haven't read that FRBR or RDA will make it easier for people to find
things, except in the sense that the FRBR displays may provide more useful
collocation of similar records (by work/expression/manifestation/item).

And when we bring precision and recall into the equation, I don't know if
this has anything to do with RDA or even with human cataloging.
Traditionally, precision and recall have had to do with evaluating the
results of automated keyword searching, where they have been seen as
trading off against one another, i.e. the greater the precision, the lower
the recall, or the greater the recall, the lower the precision. A good
discussion of this is at:
http://www.tbray.org/ongoing/When/200x/2003/06/22/PandR. Essentially, a
keyword search in a full-text database for "baths of Titus" could bring up
information on the Baths of Titus here in Rome (which is what I would want)
but it could--who knows?--bring up items about growing fish in India or
pornographic sites. Google has gotten around this with its PageRank
system.
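
For anyone who wants the two measures spelled out, here is a minimal Python
sketch using the usual definitions: precision is the share of retrieved items
that are relevant, and recall is the share of relevant items that were
retrieved. The document identifiers and the two searches are invented for the
example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four items really are about the Baths of Titus.
relevant = {"baths-1", "baths-2", "baths-3", "baths-4"}

# A narrow search finds only two items, both of them relevant.
narrow = ["baths-1", "baths-2"]

# A broad search finds all four, plus four unrelated items.
broad = ["baths-1", "baths-2", "baths-3", "baths-4",
         "fish-1", "fish-2", "other-1", "other-2"]

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 1.0)
```

The narrow search is precise but misses half of what exists; the broad search
finds everything but buries it in noise, which is the trade-off described
above.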

The traditional method for evaluating human indexing uses different
measures: specificity and exhaustivity. If you search human-created indexing
terms you would never get the completely out-of-bound results mentioned
above (unless the human could not read the text at all), but there can be
other problems. The example I use (from my own practice when I was still
learning) is a book I had to catalog about the legal rights and
responsibilities of pregnant women and new mothers in the Soviet Union. I
found copy from a well-known law library that will remain nameless, and
found the single subject: Women--Soviet Union. While it is not as out of
bounds as what you might get in a full-text keyword search, it is still
wrong from a human indexing point of view.

So, we can make it a bit more specific, but even if we put in Pregnant
women--Soviet Union, that would not have been specific enough because we
have to add the legal aspects. But if we left it there, it still would not
be sufficiently exhaustive because we need something for new mothers.

I have seen several people mix up the evaluation of human and computer
indexing, and the page I gave above appears to do just that. Or perhaps the
official definitions have changed and I'm just behind the times, I don't
know. But I still don't believe that instituting FRBR or RDA will have any
effect on either precision/recall or specificity/exhaustivity.

James Weinheimer