[CODE4LIB] Job: Digital Humanist at University of Cologne

2013-10-16 Thread jobs
The Cologne Center for eHumanities (CCeH) at the University of Cologne is seeking a Digital Humanist for a position as research associate (50%) at the earliest date possible, initially for a period of 14 months. Applicants should ideally possess most of the following skills and competences or, if

[CODE4LIB] Hackathon in Philly, January 24th

2013-10-16 Thread Chris Strauber
Hey all, I'd like to invite you to a hackathon in Philadelphia on January 24th. It's being sponsored by ALA's Library Code Year IG (with help from LITA and ALCTS) at the Penn Special Collections center. We'll be hacking on several Worldcat APIs and the DPLA API, and there will be coders from both

Re: [CODE4LIB] local APIs atop III's Sierra DB

2013-10-16 Thread Dave Menninger
I've written a decent amount of code against Sierra, but I don't know if any of it amounts to an API. * I've built some utility code that has grown into a handful of Perl modules that I use regularly in creating new reports. Most of these are special-purpose for applications we have in-house but

[CODE4LIB] Job: PHP developer at Comperio srl

2013-10-16 Thread jobs
**Who we are:** Comperio srl is a company that focuses on the world of libraries, implementing IT solutions. Since 2004 it has developed [ClavisNG](http://www.comperio.it/solutions/clavisng-en-US/an-open-source-ils-for-libraries-networks/), a web-based Integrated Library System for library

[CODE4LIB] NISO Releases Draft Recommended Practice on Indexed Discovery Service for Comments

2013-10-16 Thread Ken Varnum
Of possible interest to this group. I was one of the members of the NISO ODI group that put together this draft recommendation. Your comments are welcome at http://www.niso.org/publications/rp/rp-19-201x Ken Varnum For release: 16 Oct 2013 NISO Releases Draft Recommended Practice on Indexed

Re: [CODE4LIB] local APIs atop III's Sierra DB

2013-10-16 Thread Thomale, Jason
Everyone: You guys are fantastic. Thanks to those who have responded thus far for being so willing to share. I will be contacting y'all off-list, if you don't mind. :-) Just wanted to tag onto Dave's response here... I've written a decent amount of code against Sierra, but I don't know if

Re: [CODE4LIB] pdf2txt

2013-10-16 Thread Robert Haschart
On 10/15/2013 12:25 PM, Eric Lease Morgan wrote: On Oct 14, 2013, at 4:49 PM, Robert Haschart rh...@virginia.edu wrote: For a limited period of time I am making publicly available a Web-based program called PDF2TXT -- http://bit.ly/1bJRyh8 Although based on some subsequent messages where you

[CODE4LIB] ALCTS Metadata Interest Group - Call for Proposals at ALA Midwinter 2014

2013-10-16 Thread Glendon, Ivey (img7u)
The ALCTS Metadata Interest Group invites speakers to present at the ALA Midwinter meeting in Philadelphia on Sunday, January 26, 2014 from 8:30 to 10am. Presentations will be approximately 30 minutes, including Q&A. Our charge is to provide a broad framework for information exchange on current

Re: [CODE4LIB] pdf2txt

2013-10-16 Thread Kevin Hawkins
On 10/15/13 11:45 AM, Eric Lease Morgan wrote: On Oct 14, 2013, at 7:56 AM, Nicolas Franck nicolas.fra...@ugent.be wrote: Could this also be done by Apache Tika? Or do I miss a crucial point? http://tika.apache.org/1.4/gettingstarted.html Nicolas, this looks VERY promising! It seemingly

[CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
Anybody have data for the average length of specific MARC fields in some reasonably representative database? I mainly need 100, 245, 6xx. Thanks, kc -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Sean Hannan
That sounds like a request for Roy to fire up the ole OCLC Hadoop. -Sean On 10/16/13 1:06 PM, Karen Coyle li...@kcoyle.net wrote: Anybody have data for the average length of specific MARC fields in some reasonably representative database? I mainly need 100, 245, 6xx. Thanks, kc -- Karen

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Bill Dueber
I'm running it against the HathiTrust catalog right now. It'll just take a while, given that I don't have access to Roy's Hadoop cluster :-) On Wed, Oct 16, 2013 at 1:38 PM, Sean Hannan shan...@jhu.edu wrote: That sounds like a request for Roy to fire up the ole OCLC Hadoop. -Sean On

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Roy Tennant
I don't even have to fire it up. That's a statistic that we generate quarterly (albeit via Hadoop). Here you go:
100 - 30.3
245 - 103.1
600 - 41
610 - 48.8
611 - 61.4
630 - 40.8
648 - 23.8
650 - 35.1
651 - 39.6
653 - 33.3
654 - 38.1
655 - 22.5
656 - 30.6
657 - 27.4
658 - 30.7
662 - 41.7
Roy On

Re: [CODE4LIB] local APIs atop III's Sierra DB

2013-10-16 Thread Rob Casson
i've done some very ugly, preliminary hacking at getting MARC records out: https://gist.github.com/roblivian/7012077 generally works, but still need to account for more invalid MARC tags, on-the-fly records (non-MARC records, i.e. reserve items, ordered bibs, etc) On Wed, Oct 16, 2013 at

[CODE4LIB] Tool for feedback on document

2013-10-16 Thread Walker, David
Hi all, We're looking to put together a large policy document, and would like to be able to solicit feedback on the text from librarians and staff across two dozen institutions. We could just do that via email, of course. But I thought it might be better to have something web-based. A wiki

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
This squares with what I'm seeing. Data for all holdings of the Orbis Cascade Alliance is: 100: 30.1 245: 114.1 6XX: 36.1 My values include indicators (2 characters) as well as any delimiters but not the tag number itself. I breaking up 6XX up as Roy has as 6XX's are far from created equal and
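The counting convention described in this message (indicators and delimiters counted, tag number not counted) can be sketched roughly as follows. This is only an illustration, not the code anyone on the thread ran: the function name `average_lengths`, the `group_6xx` flag, and the sample records are all made up, and each field is assumed to arrive as a (tag, raw field data) pair where the raw data starts with the two indicators and uses the MARC subfield delimiter byte 0x1F.

```python
from collections import defaultdict

# Hypothetical sample data: (tag, raw field data) pairs.
# Raw data = 2 indicator characters + subfields, each introduced by "\x1f" + code.
SAMPLE_FIELDS = [
    ("100", "1 \x1faDoe, Jane."),
    ("650", " 0\x1faCataloging."),
    ("651", " 0\x1faOregon."),
]

def average_lengths(fields, group_6xx=False):
    """Average field length per tag, counting indicators and delimiters
    but not the tag itself. With group_6xx=True, all 6XX tags are lumped
    into a single "6XX" bucket instead of being reported separately."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of lengths, field count]
    for tag, data in fields:
        key = "6XX" if group_6xx and tag.startswith("6") else tag
        totals[key][0] += len(data)
        totals[key][1] += 1
    return {key: round(total / count, 1) for key, (total, count) in totals.items()}

print(average_lengths(SAMPLE_FIELDS))
print(average_lengths(SAMPLE_FIELDS, group_6xx=True))
```

Leaving `group_6xx` off gives a per-tag breakdown like the one Roy posted; turning it on gives a single 6XX figure like the one in this message.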

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
Argh. Must learn to write at third grade level. I wanted to say I like breaking up 6XX as Roy has done because 6XX fields vary in purpose and tag frequency varies considerably. On Wed, Oct 16, 2013 at 11:08 AM, Kyle Banerjee kyle.baner...@gmail.com wrote: This squares with what I'm seeing.

Re: [CODE4LIB] Tool for feedback on document

2013-10-16 Thread Michael J. Giarlo
Hi David, Google Drive (née Docs) will allow you to share your document with other users so that they can view and comment (and not edit), FWIW. There may be more elegant solutions that allow, say, nested/threaded comments. I know there is blog software out there that does this, but it's been

Re: [CODE4LIB] Tool for feedback on document

2013-10-16 Thread Ken Varnum
Commentpress and digress.it are two WordPress variants that offer paragraph-by-paragraph threaded commenting. Commentpress is quite old (we used it here: http://www.lib.umich.edu/islamic/ in a collaborative cataloging project sponsored by CLIR and funded by Mellon). -- Ken Varnum | Web Systems

Re: [CODE4LIB] Tool for feedback on document

2013-10-16 Thread Mark A. Matienzo
Hi David, In the past, I've used Digress.it http://digress.it/ with WordPress for this - I've set this up for the Society of American Archivists Reappraisal and Deaccessioning Development and Review Team: http://rddrt.forens.es/. Mark -- Mark A. Matienzo m...@matienzo.org Digital Archivist,

Re: [CODE4LIB] Tool for feedback on document

2013-10-16 Thread Erik Hetzner
At Wed, 16 Oct 2013 11:06:02 -0700, Walker, David wrote: Hi all, We're looking to put together a large policy document, and would like to be able to solicit feedback on the text from librarians and staff across two dozen institutions. We could just do that via email, of course. But I

Re: [CODE4LIB] Tool for feedback on document

2013-10-16 Thread McCanna, Terran
I've used http://a.nnotate.com/ for this several times. You can leave comments in line with the text, respond to other comments, display/print the comments in different ways, and one of my favorite things is that the people you send the link to don't have to create an account. Terran McCanna

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
Thanks, Roy (and others!) It looks like the 245 is including the $c - dang! I should have been more specific. I'm mainly interested in the title, which is $a $b -- I'm looking at the gains and losses of bytes should one implement FRBR. As a hedge, could I ask what you've got for the 240? that

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
245 not including $c, indicators, or delimiters, |h (which occurs before |b), |n, |p, with trailing slash preceding |c stripped for about 9 million records for Orbis Cascade collections is 70.1 kyle On Wed, Oct 16, 2013 at 12:00 PM, Karen Coyle li...@kcoyle.net wrote: Thanks, Roy (and
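A rough sketch of the "title only" measurement described here: keep $a and $b, drop the indicators, delimiters and every other subfield, and strip the trailing slash that precedes $c. The function name `title_length` and the sample fields are hypothetical, and the raw field is assumed to be two indicator characters followed by subfields delimited by the byte 0x1F (written as $ or | in the thread). How Kyle actually handled spacing between $a and $b is not stated, so this version simply concatenates them.

```python
SUBFIELD_DELIM = "\x1f"  # MARC subfield delimiter (shown as $ or | in the thread)

def title_length(field_data):
    """Length of the 245 title proper: $a and $b text only, without
    indicators, delimiters, subfield codes, or the trailing "/" before $c."""
    # Drop the 2 indicators, then split; the first chunk is empty because
    # the data begins with a delimiter.
    subfields = field_data[2:].split(SUBFIELD_DELIM)[1:]
    # Each chunk is a one-character subfield code followed by its text.
    kept = [sf[1:] for sf in subfields if sf[:1] in ("a", "b")]
    text = "".join(kept).rstrip()
    if text.endswith("/"):
        text = text[:-1].rstrip()
    return len(text)

# Hypothetical sample 245 fields.
with_statement = "10\x1faMoby Dick :\x1fbor, The whale /\x1fcHerman Melville."
title_only = "10\x1faDune."

print(title_length(with_statement))
print(title_length(title_only))
```

Averaging `title_length` over every 245 in a file would give a figure comparable in spirit to the one quoted in this message, though differences in punctuation handling will shift the result a little.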

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Nicolas Franck
Are you familiar with the OAI-PMH protocol? We have almost 2 million records available over this protocol: http://search.ugent.be/meercat/x/oai?verb=ListRecords&metadataPrefix=marcxml From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coyle
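For anyone who wants to pull records this way, an OAI-PMH ListRecords request is just a base URL plus a few query parameters. The sketch below only builds the URLs and does not fetch anything; the helper name `list_records_url` is made up for illustration. Per the OAI-PMH specification, a follow-up request for the next batch carries only the verb and the resumptionToken returned by the previous response.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="marcxml", resumption_token=None):
    """Build an OAI-PMH ListRecords URL.

    The first request names a metadataPrefix; later requests pass only the
    resumptionToken from the previous response (it is an exclusive argument)."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

BASE = "http://search.ugent.be/meercat/x/oai"  # endpoint mentioned in the message
print(list_records_url(BASE))
print(list_records_url(BASE, resumption_token="example-token"))  # placeholder token
```

The MARCXML that comes back could then be fed into whatever field-length counting is being done, without needing a local copy of the catalog.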

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
BTW, I don't think 240 is a good substitute as the content is very different than in the regular title. That's where you'll find music, laws, selections, translations and it's totally littered with subfields. The 70.1 figure from the stripped 245 is probably closer to the mark IMO, what you stand

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
On 10/16/13 12:33 PM, Kyle Banerjee wrote: BTW, I don't think 240 is a good substitute as the content is very different than in the regular title. That's where you'll find music, laws, selections, translations and it's totally littered with subfields. The 70.1 figure from the stripped 245 is

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Bill Dueber
For the HathiTrust catalog's 6,046,746 bibs and looking at only the lengths of the subfields $a and $b in 245s, I get an average length of 62.0 On Wed, Oct 16, 2013 at 3:24 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: 245 not including $c, indicators, or delimiters, |h (which occurs before

Re: [CODE4LIB] local APIs atop III's Sierra DB

2013-10-16 Thread Joshua Welker
Thought I'd share this work put together by the folks in charge of our consortium: https://github.com/mcoia/sierra_marc_tools It's a Perl implementation. I haven't used it myself, but I know it can generate MARC records. Josh Welker -Original Message- From: Code for Libraries

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
Yes, that's my take as well, but I think it's worth quantifying if possible. There is the usual trade-off between time and space -- and I'd be interested in hearing whether anyone here thinks that there is any concern about traversing the WEM structure for each search and display. Does it

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Kyle Banerjee
Depends on how many requests the service has to accommodate. Up to a point, it's no big deal. After a certain point, servicing lots of calls gets expensive and bang for the buck is brought into question. My bigger concern would be getting data encoded/structured consistently. Even though FRBR has

Re: [CODE4LIB] MARC field lengths

2013-10-16 Thread Karen Coyle
On 10/16/13 4:22 PM, Kyle Banerjee wrote: In some ways, FRBR strikes me as the catalogers' answer to the miserable seven layer OSI model which often confuses rather than clarifies -- largely because it doesn't reflect reality very well. Agreed. I am having trouble seeing FRBR as being