Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Thank you Roy and Simon for the info. As for your second point, I suppose one advantage of using the WorldCat API at this experimental stage is that the returned bib records are already FRBR-ized.

Ross - thanks for the link to the Open Library data dump. The WorldCat collection is two orders of magnitude larger than Open Library, which makes a significant difference considering the skewness and sparsity of bib records classified according to library taxonomies, e.g., DDC and LCC (for more info, see: http://cdm15003.contentdm.oclc.org/cdm/singleitem/collection/p267701coll27/id/277/rec/28).

Thanks,
Arash

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Simon Spero
Sent: 22 May 2012 19:47
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

Arash - you might not want to use a straight dump of WorldCat catalog records - at least not without the associated holdings information.* There are a lot of quasi-duplicate records that are sufficiently broken that the WorldCat de-duplication algorithm refuses to merge them. These records will usually only be used by a handful of institutions; the better records will tend to have more associated holdings. The holdings count should be used to weight the strength of association between class numbers and features.

Also, since classification/categorization is usually considered to be a property of works rather than manifestations, one might get better results by using Work sets for training. I would suggest, er, contacting Thom Hickey.

Simon

* Well, not precisely holdings - you just need the number of distinct institutions with at least one copy. I call them 'hasings'.

On Sat, May 19, 2012 at 8:42 PM, Roy Tennant wrote:
> Arash,
> Yes, we have made WorldCat available to researchers under a special
> license agreement. I suggest contacting Thom Hickey about such an
> arrangement. Thanks,
> Roy
>
> On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi wrote:
> > Dear Karen,
> >
> > I am conducting a research experiment on automatic text classification
> > and I am trying to retrieve the top matching bib records (which include
> > DDC fields) for a set of keyphrases extracted from a given document. So,
> > I suppose this is a rather exceptional use case. In fact, the right
> > approach for this experiment is to process the full dump of the WorldCat
> > database directly rather than sending a limited number of queries via
> > the API.
> >
> > I read here:
> > http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/
> > that WorldCat might become available as open linked data in the future,
> > which would solve my problem and help similar text mining projects.
> > However, I wonder if it is currently available to researchers under a
> > research/non-commercial use license agreement.
> >
> > Regards,
> > Arash
> >
> > -----Original Message-----
> > From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs
> > Sent: 17 May 2012 08:37
> > To: CODE4LIB@LISTSERV.ND.EDU
> > Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
> >
> > I forwarded this thread to the Product Manager for the WorldCat Search
> > API. She responded that unfortunately this query is not possible using
> > the API at this time.
> >
> > FYI, the SRU interface to the WorldCat Search API doesn't currently
> > support any scan-type searches either.
> >
> > Is there a particular use case you're trying to support? Knowing that
> > would help us document this as a possible enhancement.
> >
> > Karen
> >
> > Karen Coombs
> > Senior Product Analyst
> > Web Services
> > OCLC
> > coom...@oclc.org
> >
> > On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi wrote:
> >> Hi Andy,
> >>
> >> I am an SRU newbie myself, so I don't know how this could be achieved
> >> using scan operations and could not find much info on the SRU website
> >> (http://www.loc.gov/standards/sru/).
> >>
> >> As for the wildcards, according to this guide:
> >> http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf
> >> the symbols should be preceded by at least 3 characters, and therefore
> >> clauses like:
> >>
> >> ... AND srw.dd=*
> >> ... AND srw.dd=?.*
> >> ... AND srw.dd=###.*
> >> ... AND srw.dd=?3.*
> >>
> >> do not work and result in the following error:
> >>
> >> Diagnostics
> >> Identifier: info:srw/diagnostic/1/9
> >> Meaning:
> >> Details:
> >> Message: Not enough chars in truncated term: Truncated words too short (9)
> >>
> >> Thanks,
> >> Arash
> >>
> >> From: Houghton,Andrew [mailto:hough...@oclc.org]
> >> Sent: 16 May 2012 11:58
> >> To: Arash.Joorabchi
> >> Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
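[Editor's note] Since the API cannot exclude records without a DDC number server-side, one workaround raised later in this thread is to retrieve candidate records and filter them on the client side. Below is a minimal sketch of that idea in Python. The SRU endpoint URL, the wskey parameter, and the example query string are assumptions based on OCLC's WorldCat Search API documentation of the period and should be checked against the current docs; the only thread-grounded part is the filtering logic, which treats MARC field 082 as the marker for a Dewey number.

```python
"""Query a WorldCat-style SRU endpoint and keep only records with a DDC number.

Endpoint, wskey, and query below are placeholders (assumptions, not from the thread).
"""
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ENDPOINT = "http://www.worldcat.org/webservices/catalog/search/sru"  # assumed endpoint
WSKEY = "YOUR_WSKEY_HERE"                                             # hypothetical API key
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def search(query, max_records=50):
    """Run one SRU searchRetrieve request and return the parsed XML root."""
    params = {
        "query": query,
        "maximumRecords": str(max_records),
        "recordSchema": "marcxml",
        "wskey": WSKEY,
    }
    url = ENDPOINT + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return ET.fromstring(resp.read())


def records_with_ddc(root):
    """Yield (record, DDC numbers) for MARCXML records that carry an 082 field."""
    for rec in root.iter("{http://www.loc.gov/MARC21/slim}record"):
        ddc_fields = rec.findall('marc:datafield[@tag="082"]', MARC_NS)
        if ddc_fields:
            numbers = [sf.text for f in ddc_fields
                       for sf in f.findall('marc:subfield[@code="a"]', MARC_NS)]
            yield rec, numbers


if __name__ == "__main__":
    root = search('srw.kw="text classification"')  # placeholder query
    for rec, ddc_numbers in records_with_ddc(root):
        print(ddc_numbers)
```

The trade-off, as noted in the thread, is that the filtering happens after retrieval, so records without a Dewey number still count against the query's result limit.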
Re: [CODE4LIB] archiving a wiki
Yup, something like that! But for JSPwiki :) JSPwiki has an extension to export as PDF, but it doesn't do multiple pages without some extra work each time. We're hoping to find something quick and automated so we can archive quickly and move on!

>>> Dave Caroline 5/22/2012 4:08 PM >>>
On Tue, May 22, 2012 at 10:04 PM, Carol Hassler wrote:
> My organization would like to archive/export our internal wiki in some
> kind of end-user friendly format. The concept is to copy the wiki
> contents annually to a format that can be used on any standard computer
> in case of an emergency (i.e. saved as an HTML web-style archive, saved
> as PDF files, saved as Word files).

something like ?
http://www.mediawiki.org/wiki/Extension:DumpHTML

Dave Caroline
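[Editor's note] For the "quick and automated" HTML case, a generic mirroring tool can be pointed at the wiki even without JSPwiki-specific export support. A minimal sketch, assuming wget is installed and using a placeholder wiki URL (not from the thread):

```python
"""Mirror a wiki to a folder of static HTML by shelling out to wget.

The base URL is a placeholder; the wget flags are standard mirroring options,
but behaviour should be tested against your own JSPwiki instance.
"""
import subprocess
from datetime import date

WIKI_URL = "https://intranet.example.org/wiki/"          # hypothetical wiki root
OUT_DIR = f"wiki-archive-{date.today():%Y-%m-%d}"

subprocess.run(
    [
        "wget",
        "--mirror",            # recurse and honour timestamps
        "--convert-links",     # rewrite links so the copy works offline
        "--page-requisites",   # grab CSS, images, attachments
        "--adjust-extension",  # save pages with .html extensions
        "--no-parent",         # stay inside the wiki path
        "--directory-prefix", OUT_DIR,
        WIKI_URL,
    ],
    check=True,
)
print(f"Static copy written to {OUT_DIR}/")
```

Run annually (e.g., from cron or Task Scheduler), this produces a browsable offline snapshot that any standard computer can open without the wiki software.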
[CODE4LIB] archiving a wiki
My organization would like to archive/export our internal wiki in some kind of end-user friendly format. The concept is to copy the wiki contents annually to a format that can be used on any standard computer in case of an emergency (i.e. saved as an HTML web-style archive, saved as PDF files, saved as Word files). Another way to put it is that we are looking for a way to export the contents of the wiki into a printer-friendly format - a document that maintains some organization and formatting and can be used on any standard computer.

Is anybody aware of a tool out there that would allow for this sort of automated, multi-page export? Our wiki is large and we would prefer not to do this type of backup one page at a time. We are using JSPwiki, but I'm open to any option you think might work. Could any of the web harvesting products be adapted to do the job? Has anyone else backed up a wiki to an alternate format?

Thanks!

Carol Hassler
Webmaster / Cataloger
Wisconsin State Law Library
(608) 261-7558
http://wilawlibrary.gov/
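[Editor's note] For the printer-friendly/PDF side of the question, an already-harvested folder of HTML pages can be merged into a single PDF with an external converter. A minimal sketch, assuming wkhtmltopdf is installed and the archive folder name is a placeholder (both are assumptions, not from the thread):

```python
"""Turn an archived folder of HTML pages into one printable PDF via wkhtmltopdf.

Paths are placeholders; very large wikis may need to be split into several PDFs.
"""
import pathlib
import subprocess

ARCHIVE_DIR = pathlib.Path("wiki-archive-2012-05-23")  # hypothetical mirrored folder
OUTPUT_PDF = "wiki-archive.pdf"

# Sort pages so the PDF has a predictable order.
pages = sorted(str(p) for p in ARCHIVE_DIR.rglob("*.html"))
if not pages:
    raise SystemExit(f"No HTML pages found under {ARCHIVE_DIR}")

# wkhtmltopdf accepts multiple input pages followed by one output file.
subprocess.run(["wkhtmltopdf", *pages, OUTPUT_PDF], check=True)
print(f"Wrote {OUTPUT_PDF} from {len(pages)} wiki pages")
```

The result keeps basic formatting and page organization, which fits the "usable on any standard computer" requirement without depending on the wiki software itself.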
[CODE4LIB] Job: NC LIVE Web & User Experience Development Librarian at NC LIVE
NC LIVE Web & User Experience Development Librarian Vacancy Announcement

NC LIVE seeks candidates with a passion for helping people find, use, and create information that improves their communities and their lives through digital library collections and services. We need your enthusiasm, ideas, and unique skills to expand the digital possibilities of the state's academic and public libraries. If you are looking to join an organization with a track record of success that will use your ideas to help build the next generation of libraries in North Carolina, come join us at NC LIVE.

Known for its leadership in collaborative online library success, NC LIVE seeks an innovative, curious, and flexible colleague to join a seven-member team of librarians and information technology professionals serving the state from NCSU Libraries on the campus of NC State University. This newly created position reports to the NC LIVE Executive Director.

Responsibilities

The Web/UXD Librarian will have primary responsibility to design, develop, and maintain the web and mobile interfaces of NC LIVE's digital library services and collections.

Web and Application Development
* Provide hands-on leadership and vision in the development, support, integration, and administration of NC LIVE's digital library websites, portals, and discovery systems
* Conduct ongoing research into the development of new digital library interface capabilities, enhancements, and user-centered design trends
* Partner with member library staff to maximize access to and use of NC LIVE's digital collections through creative web design that improves the patron's discovery experience
* Provide frontline support for digital library services through NC LIVE's content management systems and library relations systems
* Participate in consortial planning by serving on committees, task forces, and teams

Member Relations and Outreach Support
* Track and identify trends in use and user behavior to assist colleagues in building orientation, awareness, and training initiatives for member libraries
* Provide digital library consulting services to member libraries
* Build relationships with member librarians and digital library service vendors to ensure the best match of service to organizational needs

Qualifications

Required:
* ALA-accredited MLS, or equivalent degree in library or information science
* Relevant experience, including design and development of digital applications and library services in a public, academic, or special library environment
* Knowledge of and experience with current and emerging web development technologies as they contribute to digital library services and the user experience of students, library patrons, or researchers
* Demonstrated commitment to creative, high-quality digital library services
* Evidence of ability for ongoing professional development and contribution
* Knowledge of data standards prevalent in libraries
* Relevant customer service experience in a library, educational institution, or other knowledge-based organization
* Ability to work and excel in both individual and team environments
* Valid driver's license

Preferred:
* Previous digital library development experience in a library, educational institution, or other knowledge-based organization
* Experience with search engine technologies in a library or university environment
* Experience addressing usability issues and with user-centered design in library environments
* Demonstrated experience retrieving data from open web APIs

Overview of NC LIVE

NC LIVE is North Carolina's statewide online library service. Founded in 1997 by representatives from the NC Community Colleges, the NC Independent Colleges and Universities, the NC Public Library Directors Association, the University of North Carolina, and the State Library of North Carolina, NC LIVE serves nearly 200 member libraries across North Carolina and is dedicated to helping its member libraries provide North Carolinians with resources that support education, enhance statewide economic development, and increase quality of life. Designed for at-home use, NC LIVE eBooks, magazines, newspapers, journals, media, and other online materials are available from any Internet connection via library websites and through www.nclive.org. NC LIVE offers free electronic access to resources for all ages on topics ranging from careers, business, and investing, to auto repair, health, history, and genealogy. NC LIVE resources are available to all North Carolinians through their local public, community college, or academic library. More information about NC LIVE can be found at: http://www.nclive.org/about

Salary and Benefits

Salary is very competitive, commensurate with education and experience. Position is non-tenure track faculty at the rank of Librarian. Benefits include: 24 days vacation, 12 days sick leave; State of NC comprehensive major medical insurance, and state,
Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
Arash - you might not want to use a straight dump of WorldCat catalog records - at least not without the associated holdings information.* There are a lot of quasi-duplicate records that are sufficiently broken that the WorldCat de-duplication algorithm refuses to merge them. These records will usually only be used by a handful of institutions; the better records will tend to have more associated holdings. The holdings count should be used to weight the strength of association between class numbers and features.

Also, since classification/categorization is usually considered to be a property of works rather than manifestations, one might get better results by using Work sets for training. I would suggest, er, contacting Thom Hickey.

Simon

* Well, not precisely holdings - you just need the number of distinct institutions with at least one copy. I call them 'hasings'.

On Sat, May 19, 2012 at 8:42 PM, Roy Tennant wrote:
> Arash,
> Yes, we have made WorldCat available to researchers under a special
> license agreement. I suggest contacting Thom Hickey about such an
> arrangement. Thanks,
> Roy
>
> On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi wrote:
> > Dear Karen,
> >
> > I am conducting a research experiment on automatic text classification
> > and I am trying to retrieve the top matching bib records (which include
> > DDC fields) for a set of keyphrases extracted from a given document. So,
> > I suppose this is a rather exceptional use case. In fact, the right
> > approach for this experiment is to process the full dump of the WorldCat
> > database directly rather than sending a limited number of queries via
> > the API.
> >
> > I read here:
> > http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/
> > that WorldCat might become available as open linked data in the future,
> > which would solve my problem and help similar text mining projects.
> > However, I wonder if it is currently available to researchers under a
> > research/non-commercial use license agreement.
> >
> > Regards,
> > Arash
> >
> > -----Original Message-----
> > From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs
> > Sent: 17 May 2012 08:37
> > To: CODE4LIB@LISTSERV.ND.EDU
> > Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
> >
> > I forwarded this thread to the Product Manager for the WorldCat Search
> > API. She responded that unfortunately this query is not possible using
> > the API at this time.
> >
> > FYI, the SRU interface to the WorldCat Search API doesn't currently
> > support any scan-type searches either.
> >
> > Is there a particular use case you're trying to support? Knowing that
> > would help us document this as a possible enhancement.
> >
> > Karen
> >
> > Karen Coombs
> > Senior Product Analyst
> > Web Services
> > OCLC
> > coom...@oclc.org
> >
> > On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi wrote:
> >> Hi Andy,
> >>
> >> I am an SRU newbie myself, so I don't know how this could be achieved
> >> using scan operations and could not find much info on the SRU website
> >> (http://www.loc.gov/standards/sru/).
> >>
> >> As for the wildcards, according to this guide:
> >> http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf
> >> the symbols should be preceded by at least 3 characters, and therefore
> >> clauses like:
> >>
> >> ... AND srw.dd=*
> >> ... AND srw.dd=?.*
> >> ... AND srw.dd=###.*
> >> ... AND srw.dd=?3.*
> >>
> >> do not work and result in the following error:
> >>
> >> Diagnostics
> >> Identifier: info:srw/diagnostic/1/9
> >> Meaning:
> >> Details:
> >> Message: Not enough chars in truncated term: Truncated words too short (9)
> >>
> >> Thanks,
> >> Arash
> >>
> >> From: Houghton,Andrew [mailto:hough...@oclc.org]
> >> Sent: 16 May 2012 11:58
> >> To: Arash.Joorabchi
> >> Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records
> >> without a DDC no from the result set
> >>
> >> I'm not an SRU guru, but is it possible to do a scan and look for
> >> postings of zero?
> >>
> >> Andy.
> >>
> >> On May 16, 2012, at 6:39, "Arash.Joorabchi" wrote:
> >>
> >>    Hi Mark,
> >>
> >>    Srw.dd=* does not work either:
> >>
> >>    Identifier: info:srw/diagnostic/1/27
> >>    Meaning:
> >>    Details: srw.dd
> >>    Message: The index [srw.dd] did not include a searchable value
> >>
> >>    I suppose the only option left is to retrieve everything and
> >>    filter the results on the client side.
> >>
> >>    Thanks for your quick reply.
> >>    Arash
> >>
> >>    -----Original Message-----
> >>    From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On
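[Editor's note] Simon's suggestion to weight the association between class numbers and features by holdings counts could look roughly like the sketch below. The record structure (a DDC class, a bag of feature terms, and a count of distinct holding institutions) is invented for illustration; the thread only supplies the weighting idea itself.

```python
"""Holdings-weighted association between DDC classes and text features.

The input records are hypothetical examples; only the weighting scheme
follows the suggestion in the thread.
"""
from collections import defaultdict

records = [
    {"ddc": "025.04", "features": ["digital", "libraries"], "holdings": 412},
    {"ddc": "025.04", "features": ["metadata"], "holdings": 3},
    {"ddc": "006.35", "features": ["text", "classification"], "holdings": 57},
]

# weights[feature][ddc_class] accumulates holdings-weighted evidence, so a
# widely held record counts for more than a near-duplicate held by only a
# handful of institutions.
weights = defaultdict(lambda: defaultdict(float))
for rec in records:
    for feature in rec["features"]:
        weights[feature][rec["ddc"]] += rec["holdings"]


def best_class(feature):
    """Return the DDC class most strongly associated with a feature, if any."""
    classes = weights.get(feature)
    return max(classes, key=classes.get) if classes else None


print(best_class("libraries"))  # -> 025.04
```

Grouping the records into Work sets before weighting, as Simon also suggests, would simply mean summing the holdings of all manifestations of a work before the per-feature accumulation.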
[CODE4LIB] FW: Job Posting (Library Technician) South Bay (Los Angeles County)
Apologies for the cross postings . . .

LAC Group seeks a Library Technician for a part-time, temporary, 3-month position at a corporate library in the South Bay (Los Angeles County). This position reports to the library's Technical Services Manager.

Responsibilities:
* Add online access to company reports in the library catalog to the company's Knowledge Management System
* Load, define metadata, and add links to newly scanned internally-generated technical reports
* File maintenance in the digital library
* Descriptive cataloging
* Database clean-up

Qualifications:
* Previous technical library experience
* Excellent attention to detail
* Experience using an integrated library system
* Experience using a corporate document management system

To apply, please visit http://goo.gl/wVK4v

LAC Group is an Equal Opportunity / Affirmative Action Employer who values diversity in the workplace.