[CODE4LIB] Job: Head, Digital Scholarship (Search Extended) at Clemson University
Clemson University Libraries seeks an innovative and motivated professional to work with a vibrant library faculty and staff to envision and implement a digital scholarship initiative that creatively engages all members of the campus community. The Head of Digital Scholarship, reporting to the Head of the Office of Library Technology, will play a key leadership role in shaping the creation, delivery, and preservation of original digital scholarship produced at Clemson University, with specific responsibilities for scholarly communications, rights management, and digital production. Incumbent will advocate for digital scholarship initiatives at Clemson, such as open-access publishing and the institutional repository, will raise awareness at Clemson about the emerging trends in scholarly communications and their impact on the university, and serve as a resource on intellectual property concerns. The Head of Digital Scholarship will also supervise the production of unique digital material and metadata at Clemson's Digital Imaging Lab. This is a 12-month tenure-track position with faculty rank and status.

Job Responsibilities

Scholarly Communications (50%)
Serves as an advocate for new practices in scholarly communications at the University. Assesses faculty and student scholarly communication needs and makes recommendations for funding and support in the library. Advises library on issues related to intellectual property, open access publishing, and fair use. Facilitates the deposit of faculty scholarly output into the University's institutional repository. Provides guidance to library staff regarding scholarly communication and coordinates with subject specialists to broaden their understanding of scholarly communication.

Rights Management (20%)
Advises library on issues related to copyright. Maintains an awareness of copyright and fair use legal interpretations in higher education and communicates to faculty and students. Creates digital publishing and copyright information resources and workshops for the campus community. Provides guidance on rights management as it relates to digital projects. Represents the interests of the University Libraries in the development of University policy related to copyright and user privacy issues.

Digital Production and Metadata (20%)
Works with the Digital Projects Manager to identify new collections for digitization and facilitate working relationships with partners. Supervises the integration of metadata across a variety of library applications, following standards and best practices for the description of digital objects. Facilitates the use of digitized material in digital humanities projects and exhibitions. Participates in grant development and ensures compliance with current grant commitments. Helps to ensure that assessment plans are developed as part of any new projects.

Library and University Affairs (10%)
Remains current with advances in information technology, scholarly communications, and rights management and the impact of those advances on libraries and digital scholarship. Serves on library, university, and professional committees, elected and assigned. Undertakes research and/or professional development related to professional and scholarly interests. Serves as part of the leadership team that develops policies and standards within the Office of Library Technology.

Required Qualifications
MLIS or equivalent degree from an ALA-accredited school or university. Familiarity or some experience with issues related to scholarly communications, rights management, and digital production/metadata description. Experience with library technologies and applications. Experience with project management. Evidence of, or potential for, professional and/or scholarly activity.

Preferred
Strong background in rights management or scholarly communications. Strong background in digitization/metadata best practices and digital project workflows. Direct experience in scholarly communications or copyright field. Ability to collaborate with diverse groups and communicate ideas effectively. Supervisory experience.

Position offers a highly competitive salary and faculty rank based on the successful candidate's demonstrated qualifications and experience.

About Clemson
Clemson University is a major, land-grant, science and engineering-oriented research university in a college-town setting along a dynamic Southeastern corridor. Ranked as one of America's Top Universities by U.S. News & World Report, Clemson is an inclusive, student-centered community characterized by high academic standards, culture of collaboration, school spirit, and competitive drive to excel. Nestled in the beautiful foothills of the Blue Ridge mountains, Clemson is located in the fastest-growing area of South Carolina and a short drive to major destination cities
[CODE4LIB] Avanti Nova version 0.3
Avanti Nova version 0.3 has been released. This version focuses on further developing the Nova scripting language among other things, including running Nova script files. I have also implemented the concept of object sets, which may have many possibilities down the road.

Right now Nova is still a prototype and an idea for a system that manages linked data. There is still a lot of boilerplate underneath, as I am further developing and fleshing out the scripting language. I am looking into the ability to embed Nova code in documents or HTML files. Also thinking about flow control methods. Soon I plan to replace the boilerplate with an interface to a database system (MySQL?) that can scale for real applications, at which point the practical uses for Nova will become more apparent.

For more information and to download the software, go to http://www.avantilibrarysystems.com

Peter Schlumpf
Avanti Library Systems
pschlu...@gmail.com
[CODE4LIB] Fwd: HydraCamp 2013 April 8-12, hosted by TCD and DRI
Code4Lib'bers who make it to HydraCamp should know that I have a guest room 2.5 hours down the road in Galway. :)

-Jodi

Begin forwarded message:

From: Jimmy Tang jt...@tchpc.tcd.ie
Subject: HydraCamp 2013 April 8-12, hosted by TCD and DRI
Date: 27 February 2013 12:20:53 GMT
To: dri-stran...@listserv.heanet.ie
Reply-To: Jimmy Tang jt...@tchpc.tcd.ie

Dear Colleagues,

Trinity College Dublin and the Digital Repository of Ireland host HydraCamp 2013

Trinity College Dublin, as part of the Digital Repository of Ireland, will be hosting HydraCamp 2013, on April 8th-12th, 2013. The Digital Repository of Ireland (DRI) is the national digital repository for the humanities and social sciences and uses the Hydra framework.

This will be a week-long training course aimed at developers who are building, or are interested in building, repositories for preservation and archiving. This training course will be delivered by Data Curation Experts.

DRI and TCD are bringing HydraCamp to Europe to provide developers and engineers with an opportunity to familiarise themselves with the Hydra framework. Participants will be introduced to elements of agile development processes, Ruby on Rails and the Hydra stack. Developers planning to use Fedora for archiving and preservation needs will benefit from the comprehensive Hydra training provided by the camp.

Hydra is a repository solution that is being used by institutions on both sides of the North Atlantic to provide access to their digital content. Hydra provides a versatile and feature-rich environment for end-users and repository administrators alike. Hydra is a community, a technical framework and an open source software solution.

Information about the camp will be released regularly on the camp’s website at: https://www.tchpc.tcd.ie/hydracamp2013

Please feel free to circulate the details above, and if you have any queries or questions please contact hydrac...@tchpc.tcd.ie

Resources:
HydraCamp 2013 - https://www.tchpc.tcd.ie/hydracamp2013
Digital Repository of Ireland - http://www.dri.ie/
Project Hydra - http://projecthydra.org/
Data Curation Experts - http://curationexperts.wordpress.com/
Trinity College Dublin - http://www.tcd.ie/

Kind regards,
Jimmy Tang

--
Senior Software Engineer, Digital Repository of Ireland (DRI)
High Performance Research Computing, IS Services
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | jt...@tchpc.tcd.ie
Tel: +353-1-896-3847
Re: [CODE4LIB] back to minorities question, seeking guidance
I think math is essential, but what they teach in schools these days isn't math. It's arithmetic. Some intro philosophy courses teach math. I'll stop before I start ranting.

On Wed, Feb 27, 2013 at 12:04 AM, Kelly Lucas klu...@isovera.com wrote:

On Sat, Feb 23, 2013 at 2:57 AM, Thomas Krichel kric...@openlib.org wrote:

Wilhelmina Randtke writes

Pretty much the whole entire entry level programming class for the average class covers using code to do things that you can do much more easily without code.

Probably it was the wrong course. I think coding should start with building web pages. A calculator can't do that.

Cheers,

Thomas Krichel
http://openlib.org/home/krichel
http://authorprofile.org/pkr1
skype: thomaskrichel

--
Kelly R. Lucas
Senior Developer
Isovera, Inc.
klu...@isovera.com
http://www.isovera.com
http://drupal.org/user/271780
twitter: @bp1101
Re: [CODE4LIB] back to minorities question, seeking guidance
What both Kelly and David say is true here:

David: programming needs math, not arithmetic.
Kelly: computers are good at arithmetic on their own.

To which I'll add: the related skill that I see as necessary here is quantitative reasoning - not the crunching of numbers but the correct assembly of the formulae, articulating the systematization of the problem. What I'm less certain of is what sort of training tends to lead to that sort of conceptual skill.

Ken

On Feb 27, 2013, at 8:44 AM, David Faler dfa...@tlcdelivers.com wrote:

I think math is essential, but what they teach in schools these days isn't math. It's arithmetic. Some intro philosophy courses teach math. I'll stop before I start ranting.
[CODE4LIB] Math or the other math?
You mean discrete mathematics?

http://en.wikipedia.org/wiki/Discrete_mathematics

I always kicked myself for not taking that course at high school (UK readers, I mean secondary school) but at least I picked up the basics during my physics MSci (a lot of physics these days is coding).

Cheers,

m

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ken Irwin
Sent: 27 February 2013 13:53
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] back to minorities question, seeking guidance

What both Kelly and David say is true here:

David: programming needs math, not arithmetic.
Kelly: computers are good at arithmetic on their own.

To which I'll add: the related skill that I see as necessary here is quantitative reasoning - not the crunching of numbers but the correct assembly of the formulae, articulating the systematization of the problem. What I'm less certain of is what sort of training tends to lead to that sort of conceptual skill.

Ken
Re: [CODE4LIB] Math or the other math?
+1 mostly to the thread

Programming seems to me -- just me here -- stratified like any other profession, in particular by access or lack of access to computer science within software dev. There are other factors. But computer science seems now heavily invested in math.

--
Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 2/27/13 9:17 AM, Michael Hopwood mich...@editeur.org wrote:

You mean discrete mathematics?

http://en.wikipedia.org/wiki/Discrete_mathematics

I always kicked myself for not taking that course at high school (UK readers, I mean secondary school) but at least I picked up the basics during my physics MSci (a lot of physics these days is coding).

Cheers,

m

- ** The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. ** IronMail scanned this email for viruses, vandals and malicious content. ** **
Re: [CODE4LIB] Math or the other math?
From a physics point of view, computer science looks about 50% discrete math, and 50% engineering (since computers, fancy as they may be, are simply machines, and have specific physical constraints that it may be helpful to understand).

Actual coding nowadays, I assume, may sometimes actually have a lot in common with language arts (UK readers: we don't study language arts. Sorry...) but it's worth noting that logic is the common factor between language arts and math.

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Al Matthews
Sent: 27 February 2013 14:28
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Math or the other math?

+1 mostly to the thread

Programming seems to me -- just me here -- stratified like any other profession, in particular by access or lack of access to computer science within software dev. There are other factors. But computer science seems now heavily invested in math.

--
Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057
Re: [CODE4LIB] Math or the other math?
As a math person who later studied some grad-level computer science, my personal experience was that the stuff I found easy in CS was exactly what the CS students got hung up on. So the aspects of CS involving higher math (Hi Cary!) can definitely be challenging for those who don't already have a math background; and contrariwise having that background can make grad-level CS stuff go much easier. (This statement applies more to theoretical CS, but I think trickles down a bit to coding as well.)

The extent to which this applies to the more engineering-y aspects of programming isn't clear, but I feel like I called on my basic math understanding a lot when I was learning to code. Knowledge of boolean algebra and set theory was definitely helpful in learning SQL, for instance, if only to provide me with a language I was already familiar with and in which I could frame otherwise new concepts related to querying.

I think if there's one thing that a genuine math background gives a coder, it's a vocabulary and a conceptual framework that they can apply to the concepts from programming to make them more familiar. The quantitative reasoning aspect is big too, of course, and that tends to come with the study of math; but I think there are other places it can be got (for instance, philosophical logic [1], rhetoric, hard engineering disciplines, the natural sciences, some of the social sciences).

[1] ...which is just different enough from mathematical logic to be a bit Alice-in-Wonderlandy for us math types.

On Wed, Feb 27, 2013 at 9:27 AM, Al Matthews amatth...@auctr.edu wrote:

+1 mostly to the thread

Programming seems to me -- just me here -- stratified like any other profession, in particular by access or lack of access to computer science within software dev. There are other factors. But computer science seems now heavily invested in math.

--
Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057
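
For anyone who wants to see the boolean algebra and set theory point from this message in concrete terms, here is a rough sketch. It is only an illustration: the table names (checked_out, on_reserve) and the titles are made up, and it uses Python's built-in sqlite3 module simply because it needs no setup. It shows SQL's UNION, INTERSECT and EXCEPT matching the set operations of the same name, and a WHERE clause rewritten with De Morgan's law.

```python
import sqlite3


def run_set_demo():
    """Compare SQL set operators with Python set operations on toy data."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables and rows, invented for this illustration only.
    conn.executescript(
        """
        CREATE TABLE checked_out (title TEXT);
        CREATE TABLE on_reserve (title TEXT);
        INSERT INTO checked_out VALUES ('Dune'), ('Emma'), ('Ulysses');
        INSERT INTO on_reserve VALUES ('Emma'), ('Walden');
        """
    )

    def titles(sql):
        # Collect the first column of every result row into a Python set.
        return {row[0] for row in conn.execute(sql)}

    a = titles("SELECT title FROM checked_out")
    b = titles("SELECT title FROM on_reserve")

    results = {
        # Set union, intersection and difference, written in SQL.
        "union": titles(
            "SELECT title FROM checked_out UNION SELECT title FROM on_reserve"
        ),
        "intersection": titles(
            "SELECT title FROM checked_out INTERSECT SELECT title FROM on_reserve"
        ),
        "difference": titles(
            "SELECT title FROM checked_out EXCEPT SELECT title FROM on_reserve"
        ),
        # De Morgan: NOT (p OR q) is the same condition as (NOT p) AND (NOT q).
        "not_or": titles(
            "SELECT title FROM checked_out "
            "WHERE NOT (title = 'Dune' OR title = 'Emma')"
        ),
        "and_not": titles(
            "SELECT title FROM checked_out "
            "WHERE title <> 'Dune' AND title <> 'Emma'"
        ),
    }
    conn.close()
    return a, b, results


if __name__ == "__main__":
    a, b, results = run_set_demo()
    print(results["union"] == a | b)
    print(results["intersection"] == a & b)
    print(results["difference"] == a - b)
    print(results["not_or"] == results["and_not"])
```

The point is only the vocabulary: someone who already thinks in unions, intersections and negations has names for what these queries do before learning any SQL syntax.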
Re: [CODE4LIB] back to minorities question, seeking guidance
Probably it was the wrong course. I think coding should start with building web pages. A calculator can't do that.

HTML is called a markup language, but does anyone here really think it's a programming language? Even though it gets more complicated over time, it pretty much doesn't have variables or do interactive things, and is for displaying things, not manipulating things.

My point about math and programming is that the curriculum for the average intro programming class appears to have been developed circa 1972 and never tweaked. I'm in Programming for Engineers right now, which is the prerequisite for the classes that looked useful. So far we have written lots of small programs to add numbers, find modulos, make a simple loop. All this would have been exciting before calculators. But, yeah, we have calculators now. And, actually, we had calculators before we had widespread access to affordable computers. Writing a page-long program to add some numbers makes no sense. It's probably the least efficient way to solve the problem. Nothing about the coursework shows computers as useful at solving problems. Everything about the coursework shows computers as clunky, inefficient, difficult-to-use calculators.

And... here is something we haven't done... We have not yet called a function from inside a function. So, the whole object-oriented thing has not yet appeared, and it's past midterm time.

From having looked at a bunch of syllabi online for different intro-level programming classes, I think my experiences are the norm. The intro classes cover things you can do more easily without coding. This type of curriculum is off-putting to at least some people. It also isn't necessary. I think it's possible to design a curriculum where students could have something to show that would be worthwhile now, as opposed to worthwhile in 1972 when adding many numbers at once was a big deal.

-Wilhelmina Randtke

On Sat, Feb 23, 2013 at 1:57 AM, Thomas Krichel kric...@openlib.org wrote:

Wilhelmina Randtke writes

Pretty much the whole entire entry level programming class for the average class covers using code to do things that you can do much more easily without code.

Probably it was the wrong course. I think coding should start with building web pages. A calculator can't do that.

Cheers,

Thomas Krichel
http://openlib.org/home/krichel
http://authorprofile.org/pkr1
skype: thomaskrichel
[CODE4LIB] githubs for poetry, legal docs
Given the discussion of how GitHub is not really so accessible to non-coders, I thought I'd mention these attempts to put version control into the mainstream.

GitHub for writers: It sounds like that's what Blaine Cook is doing with Poetica.com.

GitHub for legal agreements: We've started using Docracy.com to help us manage legal agreements.

Eric

Eric Hellman
President, Gluejar.Inc.
Founder, Unglue.it https://unglue.it/
http://go-to-hellman.blogspot.com/
twitter: @gluejar
Re: [CODE4LIB] back to minorities question, seeking guidance
I think Wilhelmina has touched on a very important point: for some, in order to learn--or want to learn--something, the material has to be relevant to them. Some folks can get through the boring, "calculators can do this" parts because they anticipate the long-term benefit, while others learn more effectively if the material helps them achieve a goal they already have or a goal that is within their area of expertise or interest.

Christina George
(Hi! I'm new to this listserv)

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Wilhelmina Randtke
Sent: Wednesday, February 27, 2013 8:47 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] back to minorities question, seeking guidance

Probably it was the wrong course. I think coding should start with building web pages. A calculator can't do that.

HTML is called a markup language, but does anyone here really think it's a programming language? Even though it gets more complicated over time, it pretty much doesn't have variables or do interactive things, and is for displaying things, not manipulating things.

My point about math and programming is that the curriculum for the average intro programming class appears to have been developed circa 1972 and never tweaked. I'm in Programming for Engineers right now, which is the prerequisite for the classes that looked useful. So far we have written lots of small programs to add numbers, find modulos, make a simple loop. All this would have been exciting before calculators. But, yeah, we have calculators now. And, actually, we had calculators before we had widespread access to affordable computers. Writing a page-long program to add some numbers makes no sense. It's probably the least efficient way to solve the problem. Nothing about the coursework shows computers as useful at solving problems. Everything about the coursework shows computers as clunky, inefficient, difficult-to-use calculators.

And... here is something we haven't done... We have not yet called a function from inside a function. So, the whole object-oriented thing has not yet appeared, and it's past midterm time.

From having looked at a bunch of syllabi online for different intro-level programming classes, I think my experiences are the norm. The intro classes cover things you can do more easily without coding. This type of curriculum is off-putting to at least some people. It also isn't necessary. I think it's possible to design a curriculum where students could have something to show that would be worthwhile now, as opposed to worthwhile in 1972 when adding many numbers at once was a big deal.

-Wilhelmina Randtke
Re: [CODE4LIB] back to minorities question, seeking guidance
Christina George, hello! and welcome.

WR, idly, I wonder whether this intro to programming but-not-for-programmers course might be taught by an underqualified or overworked adjunct or grad student slave, or if not, whether instead by a bored research professor. It doesn't sound like fun. Sympathy.

Greetings to all 2292 recipients.

--
Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 2/27/13 11:11 AM, George, Christina Rose georg...@umsystem.edu wrote:

I think Wilhelmina has touched on a very important point: for some, in order to learn--or want to learn--something, the material has to be relevant to them. Some folks can get through the boring, "calculators can do this" parts because they anticipate the long-term benefit, while others learn more effectively if the material helps them achieve a goal they already have or a goal that is within their area of expertise or interest.

Christina George
(Hi! I'm new to this listserv)
[CODE4LIB] Slicing/dicing/combining large amounts of data efficiently
I'm involved in a migration project that requires identification of local information in millions of MARC records.

The master records I need to compare with are 14GB total. I don't know what the others will be, but since the masters are deduped and the source files aren't (plus they contain loads of other garbage), there will be considerably more. Roughly speaking, if I compare 1000 master records per second, it would take about 2 1/2 hours to cut through the file. I need to be able to ask the file whatever questions the librarians might have (i.e. many), so speed is important.

For reasons I won't go into right now, I'm stuck doing this on my laptop in cygwin, and that affects my range of motion.

I'm trying to figure out the best way to proceed. Currently, I'm extracting specific fields for comparison. Each field tag gets a single line keyed by OCLC number (repeated fields are catted together with a delimiter). The idea is that if I deal with only one field at a time, I can slurp the master info in memory and retrieve it via hash (OCLC control number) as I loop through the comparison data. Local data will either be stored in special files that are loaded separately from the bibs or recorded in reports for maintenance projects.

This process is clunky because a special comparison file has to be created for each question, but it does seem to work (generating preprocess files and then doing the compare is measured in minutes rather than hours). I didn't use a DB because there's no way I could store the reference data in memory and I figured I'd just thrash my drive.

Is this a reasonable approach, and whether or not it is, what tools should I be thinking of using for this?

Thanks,
kyle
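The one-field-at-a-time comparison described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual script: the tab-separated "OCLC number, then field value" line layout, the `|` delimiter for repeated fields, and the function names `load_master` and `find_local_values` are all made up for the example.

```python
def load_master(lines, key_sep="\t"):
    """Slurp one extracted field into memory as {oclc_number: field_value}.

    Assumed layout: one line per record, OCLC number, a tab, then the value.
    """
    master = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        oclc, _, value = line.partition(key_sep)
        master[oclc] = value
    return master


def find_local_values(master, source_lines, key_sep="\t", repeat_sep="|"):
    """Yield (oclc, values) for source values that the master record lacks.

    Repeated fields are assumed to be joined with `repeat_sep` on one line.
    A source record with no master counterpart reports all of its values.
    """
    for line in source_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        oclc, _, value = line.partition(key_sep)
        if oclc in master:
            master_values = set(master[oclc].split(repeat_sep))
        else:
            master_values = set()
        local = [v for v in value.split(repeat_sep) if v and v not in master_values]
        if local:
            yield oclc, local


# Tiny in-memory stand-ins for the extracted master and source files.
master = load_master(["1\tA|B\n", "2\tC\n"])
differences = list(find_local_values(master, ["1\tA|X\n", "3\tZ\n", "2\tC\n"]))
```

Because the lookup is a dictionary hit per source line, the comparison itself is a single sequential pass; the cost that remains is regenerating the extract for each new question.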
[CODE4LIB] Job: Data Database Administrator at Yale University Art Gallery
Reporting to the Director of Information Technology at the Yale University Art Gallery, the Data and Database Specialist develops, implements, and maintains policies and procedures for ensuring the security and integrity of the Gallery's collection database while actively ensuring adherence to data standards across the Gallery environment. Using highly developed SQL skills, implements data models and database designs and resolves database performance and capacity issues. Serves as primary resource for a wide range of museum staff members and departments that use TMS (The Museum System), the collections management database used by the Gallery. Manages the technical and procedural responsibilities to advance the strategic plan for development and future use, and support the ongoing use and maintenance of TMS.

Principal Responsibilities
1. Coordinates the planning and development of new, complex relational databases.
2. Ensures adherence to data standards across the University environment.
3. Develops and documents database procedures and policies and resolves complex protocol deviations in existing databases.
4. Maintains and administers database systems to ensure effective implementation and use of databases including software testing, debugging, and data quality assurance projects.
5. Implements technical solutions including installation, configuration, and resolution of issues with multiple layers of products and technologies.
6. Works directly with University departments to troubleshoot problems, create reports, identify and resolve technical issues. Communicates directly with staff to understand goals and solve user problems.
7. Documents end user's experience, and develops methodologies and assessment techniques that address usability goals.
8. Provides training on database, appropriate use and facilitates user groups.
9. Performs various coding, debugging and unit testing tasks in support of assigned projects.
10. Assists in evaluating University business and administrative processes and needs and develops solutions to technical problems.
11. Assists in developing and implementing database security procedures, including access authorization, logins, and permissions.
12. Applies current programming standards and methodologies to all relevant projects and activities.
13. May perform other duties as assigned.

Required Education and Experience: Bachelor's Degree in a related field and five years of related work experience or an equivalent combination of education and experience.
Required Skill/Ability 1: Excellent interpersonal skills, with the ability to work independently as well as a member of a team.
Required Skill/Ability 2: Advanced knowledge and proven ability in database management (Microsoft SQL and SQL based tools).
Required Skill/Ability 3: Knowledge of data standards and formats for description, presentation, and transmission.
Required Skill/Ability 4: Ability to manage and prioritize multiple projects simultaneously.
Preferred Education, Experience and Skills: TMS (Gallery Systems) experience and Crystal Reports or similar report writing tool; Experience working in Museum setting; Experience with HTML/XML and Java programming.

Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6511/
Re: [CODE4LIB] Slicing/dicing/combining large amounts of data efficiently
Kyle -- if this was me -- I'd break the file into a database. You have a lot of different options, but the last time I had to do something like this -- I broke the data into 10 tables -- a control table with a primary key and OCLC number, a table for 0xx fields, a table for 1xx, 2xx, etc. including OCLC number and key that they relate to.

You can actually do this with MarcEdit (if you have mysql installed) -- but on a laptop -- I'm not going to guarantee speed with the process. Plus, the process to generate the SQL data will be significant. It might take 15 hours to generate the database, but then you'd have it and could create indexes on it. But you could use it to create the database and then prep the files for later work.

--TR

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kyle Banerjee
Sent: Wednesday, February 27, 2013 9:45 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Slicing/dicing/combining large amounts of data efficiently
Re: [CODE4LIB] back to minorities question, seeking guidance
OMG. I used to tell everyone that arithmetic is not math. Amazingly nobody (who is not into math) cares. Just ask my wife.

Cary

On Wed, Feb 27, 2013 at 5:43 AM, David Faler dfa...@tlcdelivers.com wrote:

I think math is essential, but what they teach in schools these days isn't math. It's arithmetic. Some intro philosophy courses teach math. I'll stop before I start ranting.

On Wed, Feb 27, 2013 at 12:04 AM, Kelly Lucas klu...@isovera.com wrote:

On Sat, Feb 23, 2013 at 2:57 AM, Thomas Krichel kric...@openlib.org wrote:

Wilhelmina Randtke writes

Pretty much the whole entire entry level programming class for the average class covers using code to do things that you can do much more easily without code.

Probably it was the wrong course. I think coding should start with building web pages. A calculator can't do that.

Cheers,
Thomas Krichel
http://openlib.org/home/krichel
http://authorprofile.org/pkr1
skype: thomaskrichel

--
Kelly R. Lucas
Senior Developer
Isovera, Inc.
klu...@isovera.com
http://www.isovera.com
http://drupal.org/user/271780
twitter: @bp1101

--
Cary Gordon
The Cherry Hill Company
http://chillco.com
Re: [CODE4LIB] back to minorities question, seeking guidance
I think that the programming / scripting / markup language discussion is not helpful. Any time you key in something, run it on a computer, and something else comes out (hopefully what is expected), to me, that qualifies as programming. Why not?

Cary

--
Cary Gordon
The Cherry Hill Company
http://chillco.com

On Wed, Feb 27, 2013 at 6:47 AM, Wilhelmina Randtke rand...@gmail.com wrote:
[CODE4LIB] Adding Twitter to Google Analytics
Hi Everyone,

I am running a WordPress-powered web site and use Google Analytics to measure our traffic. I would like to set up Analytics so that it measures traffic coming from Twitter and Facebook. I see that Google Analytics can be set up to track social interactions, but I'm feeling overwhelmed by the instructions on how to set it up (https://developers.google.com/analytics/devguides/collection/gajs/gaTrackingSocial?hl=en_GB).

Have any of you successfully set this up? Care to share your tips? It would be wonderful if there was a simple edit to the WP code, or a plugin - am I dreaming?

Many thanks,
Kim

-
Kimberly Silk, MLS
Data Librarian, Martin Prosperity Institute
Rotman School of Management at the University of Toronto
105 St. George Street, Suite 9000
Toronto, ON M5S 3E6
President, SLA Toronto Chapter
Office: 416-946-7032 -- New!!!
Mobile: 416-721-8955
kimberly.s...@martinprosperity.org
@kimberlysilk
www.martinprosperity.org
Twitter: @MartinProsperit
[CODE4LIB] Management System to Digital Preservation
*Sorry for cross-posting*

Dear Colleagues,

We would like to know which software or system you are using to manage the digital preservation of your digitized content, and whether you are satisfied with it. Here at the University of Sao Paulo (Brazil) we have a huge digitization project, and we are studying the different options for management systems (Ex Libris Rosetta, EMC Documentum, LOCKSS, etc.).

Thanks for all contributions.

Sincerely,
Anderson de Santana
Technical Department
Libraries Integration Service
University of Sao Paulo
http://www.usp.br/sibi
E-mail: algal...@usp.br
Phone: (5511) 3091-4439
Skype: andesantana
Re: [CODE4LIB] back to minorities question, seeking guidance
I'm forced to agree that arithmetic isn't math. In fact, I'd go further and say that arithmetic isn't even arithmetic. At best it's accounting. (Accounting, on the other hand, is way more than accounting, so please don't take offense if you're an accountant.)

On Wed, Feb 27, 2013 at 12:57 PM, Cary Gordon listu...@chillco.com wrote:
Re: [CODE4LIB] back to minorities question, seeking guidance
Salve! I'm forced to agree that arithmetic isn't math. In fact, I'd go further and say that arithmetic isn't even arithmetic. At best it's accounting. (Accounting, on the other hand, is way more than accounting, so please don't take offense if you're an accountant.) http://xkcd.com/899/ That is all. Cheers, Brooke
[CODE4LIB] Imaging Hosting Services
Hello All,

We are considering an image host for a special collection. The collection would be private and only viewable via links added to and searched through our online catalog (InMagic).

Has anyone used any of the following and had a positive experience:
- ImageShack
- Flickr Pro or Nonprofits (Flickr/Yahoo)
- WebLife Photo (Earthlink)

These are the ones recommended to me thus far. We're looking to upload around 6 GB of files.

I'd appreciate any insight any of you might have.

Peace
Desiree Yael Vester
Caretaker, OPAC Coordinator
Lesbian Herstory Archives
http://lesbianherstoryarchives.org/
Re: [CODE4LIB] Imaging Hosting Services
We've only used archive.org for this type of service. We've had a good experience with them.

Genny Engel
Sonoma County Library
gen...@sonoma.lib.ca.us
707 545-0831 x1581
www.sonomalibrary.org

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of DYV
Sent: Wednesday, February 27, 2013 12:40 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Imaging Hosting Services
Re: [CODE4LIB] Adding Twitter to Google Analytics
You shouldn't have to do any setup for basic counts to show up. Traffic coming directly from Twitter or Facebook ought to be appearing under Traffic Sources > Social > Overview.

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Kimberly Silk [kimberly.s...@rotman.utoronto.ca]
Sent: Wednesday, February 27, 2013 11:13 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Adding Twitter to Google Analytics
Re: [CODE4LIB] Slicing/dicing/combining large amounts of data efficiently
I agree with Terry: use a database. Since you're doing multiple queries, invest the time up front to import your data in a queryable format, with indexes, instead of repeatedly building comparison files.

But of course, it depends... dealing with large amounts of data efficiently is often best done with lots of memory. But if you can run mysql and the lengthy up-front parsing/loading/indexing of the records is acceptable, go for it.

For what it's worth, I have done something similar for many years, where I build a database with all of our MARC records, parsed down to the subfield level. It's great for queries like "find me all the records with XYZ in one subfield and ABC in another" or "find all of the duplicate OCLC numbers". It's not so great if you need to output the original field in a report (though it can be rebuilt from the subfields).

Here's the Oracle table I use:

    CREATE TABLE bib_subfield
    (record_id INT NOT NULL
    ,field_seq INT NOT NULL
    ,subfield_seq INT NOT NULL
    ,indicators CHAR(2) NULL
    ,tag CHAR(4) NOT NULL
    ,subfield NVARCHAR2(4000) NULL
    )
    ;

Our MARC data is Unicode, thus the NVARCHAR. Super-long subfields like some 5xx notes do get truncated, but that's a tiny fraction of a percentage of data lost, a fair tradeoff for our needs.

field_seq and subfield_seq are numbers tracking the ordinal position of each field within the record, and each subfield within a field, for those occasional queries wanting data from the first 650 field, or subfields which aren't in the correct order per catalogers. You may not need that level of detail.

Another, completely unrelated, possible solution depending on your needs: run the records through solrmarc and do your queries via solr?

Good luck... let us know what you eventually decide to do.

--Andy

On Wed, Feb 27, 2013 at 9:53 AM, Reese, Terry terry.re...@oregonstate.edu wrote:
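The two example queries Andy mentions can be tried out on a small scale with something like the sketch below. It uses SQLite from Python's standard library rather than Oracle, so the column types differ from his table. It also assumes that the four-character `tag` column holds the field tag plus the subfield code (for example `035a`) and that OCLC numbers live in 035 $a; neither is stated in the thread, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bib_subfield (
        record_id    INTEGER NOT NULL,
        field_seq    INTEGER NOT NULL,
        subfield_seq INTEGER NOT NULL,
        indicators   TEXT,
        tag          TEXT NOT NULL,  -- assumption: tag + subfield code, e.g. '035a'
        subfield     TEXT
    )
""")
# An index on (tag, subfield) is what keeps repeated ad hoc questions fast.
conn.execute("CREATE INDEX ix_tag_value ON bib_subfield (tag, subfield)")

# Invented sample data: records 1 and 2 share an OCLC number.
rows = [
    (1, 1, 1, "  ", "035a", "(OCoLC)100"),
    (1, 2, 1, "10", "245a", "XYZ"),
    (1, 3, 1, " 0", "650a", "ABC"),
    (2, 1, 1, "  ", "035a", "(OCoLC)100"),
    (2, 2, 1, "10", "245a", "XYZ"),
    (3, 1, 1, "  ", "035a", "(OCoLC)200"),
]
conn.executemany("INSERT INTO bib_subfield VALUES (?, ?, ?, ?, ?, ?)", rows)

# "Find all of the duplicate OCLC numbers."
duplicate_oclc = conn.execute("""
    SELECT subfield, COUNT(DISTINCT record_id)
    FROM bib_subfield
    WHERE tag = '035a'
    GROUP BY subfield
    HAVING COUNT(DISTINCT record_id) > 1
""").fetchall()

# "Find me all the records with XYZ in one subfield and ABC in another."
xyz_and_abc = conn.execute("""
    SELECT DISTINCT a.record_id
    FROM bib_subfield a
    JOIN bib_subfield b ON b.record_id = a.record_id
    WHERE a.tag = '245a' AND a.subfield = 'XYZ'
      AND b.tag = '650a' AND b.subfield = 'ABC'
""").fetchall()
```

The self-join in the second query is the price of the one-row-per-subfield layout: each extra condition on a different subfield adds another join against the same table.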
Re: [CODE4LIB] Adding Twitter to Google Analytics
Hi Kim,

If I am reading your question correctly, just setting up an advanced/custom segment in Google Analytics that tracks traffic coming from fb and twitter should do the trick.

Create an advanced segment, name it, and then set it to include "source", then use "Matching RegExp" and fill in the form with:

facebook|m.facebook.com|hootsuite.com|ow.ly|t.co|tweetdeck|twitter

(or whatever you want; that's just some of the sources I use, including ow.ly and hootsuite because we use Hootsuite here at SI)

You can then put that custom segment on your dashboard in a widget if you want. Or not.

Hth,
Keri

Keri Thompson
Head, Web Services Department
Smithsonian Institution Libraries
e. thomps...@si.edu
t. 202.633.1716
@DigiKeri_SIL
library.si.edu || blog.library.si.edu || biodiversitylibrary.org

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Genny Engel
Sent: Wednesday, February 27, 2013 4:33 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Adding Twitter to Google Analytics
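Before pasting a pattern like Keri's into a segment, it can help to test it against a few source strings. The sketch below does that with Python's `re` module, assuming the segment matches the pattern anywhere in the source string (which `re.search` mimics) and that Python's regex syntax is close enough to Google Analytics' for a pattern this simple. The stricter `ESCAPED` variant and the sample hostnames are illustrative only, not part of Keri's setup.

```python
import re

# The pattern exactly as given in the thread. An unescaped "." matches any
# character, and "m.facebook.com" is already covered by "facebook".
PATTERN = re.compile(
    r"facebook|m.facebook.com|hootsuite.com|ow.ly|t.co|tweetdeck|twitter"
)

# A stricter variant (an assumption, not from the thread): dots escaped and
# the short "t.co" alternative anchored so it cannot match inside other hosts.
ESCAPED = re.compile(
    r"facebook|hootsuite\.com|ow\.ly|^t\.co$|tweetdeck|twitter"
)


def is_social(source, pattern=PATTERN):
    """Return True if the traffic source matches the pattern anywhere."""
    return pattern.search(source) is not None


samples = ["t.co", "m.facebook.com", "google", "tacoma.example.org"]
loose = {s: is_social(s) for s in samples}
strict = {s: is_social(s, ESCAPED) for s in samples}
```

With the unescaped pattern, the hypothetical source `tacoma.example.org` counts as social because `t.co` also matches the letters "taco"; the escaped and anchored variant does not have that problem.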
Re: [CODE4LIB] Slicing/dicing/combining large amounts of data efficiently
I'd also consider using a document db (e.g. MongoDB) with the MARC-in-JSON format for this. You could run jsonpath queries or map/reduce to get your answers.

Mongo runs best in memory, but I think you'll be fine since you don't need immediate answers.

-Ross.

On Wednesday, February 27, 2013, Andy Kohler wrote:
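For readers who have not seen MARC-in-JSON, the sketch below shows the record shape a document store would hold and a plain-Python helper that pulls subfield values out of it. It does not use MongoDB at all; it only illustrates the structure that a jsonpath or map/reduce query would walk. The sample record and the helper name `subfield_values` are invented for the example.

```python
import json

# One invented record in MARC-in-JSON form: control fields map a tag to a
# string, data fields map a tag to indicators plus a list of subfields.
record_json = """
{
  "leader": "00000nam a2200000 a 4500",
  "fields": [
    {"001": "rec-1"},
    {"035": {"ind1": " ", "ind2": " ",
             "subfields": [{"a": "(OCoLC)100"}]}},
    {"590": {"ind1": " ", "ind2": " ",
             "subfields": [{"a": "Local note: gift of donor"}]}}
  ]
}
"""


def subfield_values(record, tag, code):
    """Return every value of subfield `code` in data fields tagged `tag`."""
    values = []
    for field in record["fields"]:
        for field_tag, content in field.items():
            # Control fields hold a plain string and have no subfields.
            if field_tag != tag or not isinstance(content, dict):
                continue
            for sub in content.get("subfields", []):
                values.extend(v for c, v in sub.items() if c == code)
    return values


record = json.loads(record_json)
oclc_numbers = subfield_values(record, "035", "a")
local_notes = subfield_values(record, "590", "a")
```

Keeping each record as one document means the original field order and repeats survive intact, at the cost of walking the nested lists for every question unless the store indexes them.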
Re: [CODE4LIB] Imaging Hosting Services
On Wed, Feb 27, 2013 at 03:39:46PM -0500, DYV wrote:

Hello All, We are considering an image host for a special collection. The collection would be private and only viewable via links added to and searched through our online catalog (InMagic). Has anyone used either of the following and had a positive experience: - ImageShack - Flickr Pro or Nonprofits(Flickr/Yahoo) - WebLife Photo (Earthlink) These are the ones recommended to me thus far. We're looking to upload around 6 gig of files. I'd appreciate any insight any of you might have.

I've been using Flickr personally for a while, but there's a new kid in town who I am liking quite a bit for upload and *hide* away, and I still enjoy the gallery-like features I've gotten used to with Flickr. They let you try it out for 30 days. (I am impulsive and paid after a week.)

https://www.everpix.com/landing.html

Cheers,
./fxk

Peace
Desiree Yael Vester
Caretaker, OPAC Coordinator
Lesbian Herstory Archives
http://lesbianherstoryarchives.org/

--
If one studies too zealously, one easily loses his pants. -- A. Einstein.
[CODE4LIB] Job: Metadata Librarian/Cataloger at University of Maine
Metadata Librarian/Cataloger: Raymond H. Fogler Library at the University of Maine is looking for individuals with a mastery of traditional cataloging standards and practices who are willing to explore and apply emerging schemes for resource discovery; works in a highly collaborative environment. Duties include: developing and implementing solutions using emerging schema to ensure user-centered access to all resources; tracking developments on metadata standards; original cataloging of a wide variety of formats including born digital materials, media and maps with contribution to national utilities; participation in training and support activities for cataloging staff. Required: Typically has the education associated with an ALA-accredited MLS and some professional experience or an equivalent combination of experience and education; knowledge of metadata and cataloging standards schema such as Dublin Core, AACR2, RDA, LCSH and a national bibliographic utility such as OCLC; previous experience cataloging materials in print and electronic formats ; experience with web page development; excellent oral and written communication skills; demonstrated successful experience in working independently and as part of a team. Preferred: Experience with Innovative Interfaces' Millennium information systems; experience with tools related to the loading and integration of MARC records; experience digitizing and providing access to special or archival collection. The University of Maine is the Land Grant University and Sea Grant College for the State of Maine. It is the flagship institution of the University of Maine System, offering bachelors, masters, and doctoral degrees. The University of Maine has approximately 11,300 students and 728 faculty. The Raymond H. Fogler Library has a collection of more than a million volumes and a staff of 24 professionals and 45 support staff. The library uses the INNOPAC integrated system and has developed Mariner, a digital library. 
It is a Tri-State Regional Depository and a full patent depository. This is a 12 month, full-time position with a projected starting salary range of $40,461-$45,000 and an excellent benefits package. More information regarding the job description may be found at http://jobs.umaine.edu/. Review of applications will begin immediately. Send letter of application, resume, and the names, addresses, telephone numbers, and e-mail addresses of three references to Karen Stewart, Office of the Dean of Libraries, 5729 Fogler Library, University of Maine, Orono, ME 04469-5729 or karen.stew...@umit.maine.edu The University of Maine is an Equal Opportunity/Affirmative Action Employer. The University website is: www.umaine.edu On January 1, 2011, UMaine became a tobacco-free campus. Information regarding UMaine's tobacco-free policy is online at http://umaine.edu/tobaccofree/. Appropriate Background Checks Required. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6514/
[CODE4LIB] Job: Electronic Records Archivist at Harry Ransom Center
This full-time, professional archivist position will lead the stewardship of born-digital archival materials as well as oversee the EAD markup and delivery of online finding aids for the Ransom Center. Purpose To preserve, arrange, describe, and deliver born-digital manuscript and archival materials and to oversee EAD markup and delivery of online finding aids. Essential Functions Lead the Ransom Center's ongoing development and implementation of policies and procedures for the stewardship of born-digital archival materials in the archival holdings, ensuring effective accessioning, description, preservation, and delivery of born-digital archival materials. Act as born-digital archives curator for the Ransom Center, participating in collection development, appraisal, management, reference and exhibitions, alone and with other curators, and support other staff in their work to acquire, manage, and make available born-digital archival materials. Manage EAD finding aid files. Update EAD procedures as needed and instruct staff. Supervise LA II or other staff in review of xml finding aids and other projects. Process traditional and hybrid manuscript collections. Engage in campus, regional, and national scholarly and professional organizations and activities. Required qualifications Master's Degree in Library or Information Science or equivalent advanced degree; demonstrated professional experience preserving, describing, and delivering born-digital archival material; demonstrated experience creating, reviewing, and delivering EAD finding aids. Equivalent combination of relevant education and experience may be substituted as appropriate. 
Preferred Qualifications: Two or more years of professional experience working with born-digital archival materials in a research library setting; experience processing non-digital and hybrid archival material; experience supervising student and paraprofessional staff; demonstrated knowledge of digital preservation and access technologies, standards, and best practices; experience with XML encoding and stylesheets; experience using databases in an archival setting; knowledge of archival metadata standards; experience creating archival finding aids and catalog records using AACR2 and DACS; demonstrated ability to work with attention to detail and accuracy and to work both independently and under supervision.
Working Conditions: May work around standard office conditions; may work around electrical and mechanical hazards; repetitive use of a keyboard at a workstation; use of manual dexterity; climbing of ladders; lifting and moving. Standard archival storage environment; occasional use of step stool or ladder; lifting and moving computer equipment and boxes up to 40 pounds.
Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6520/
[CODE4LIB] Job: Assistant Director - Library and Information Commons at Essex County College
Essex County College is currently seeking a qualified individual to serve as Assistant Director in our Library and Information Commons Department. The Assistant Director is responsible for overseeing the operations and related activities of the College Information Commons and for providing general administration and technical support for the College Libraries. Qualified candidates will possess a Master's degree in Library and Information Science (MLIS), Library Science (MLS) or related area. The candidate will have experience in information science and multimedia technology and their application in diversified, innovative learning-teaching situations, and will have working knowledge of one of the following programming languages: PERL, PHP, Python, Java, Spring, Groovy on Grails, JavaScript, XML, or Ajax. The ideal candidate will have a minimum of two years of experience working in a public and/or academic library.
RESIDENCY REQUIREMENTS: The selected candidate will be required to establish principal residency in the State of New Jersey within 365 days from the date of hire, unless otherwise exempt.
TO APPLY: Resume and a letter of interest, indicating the position and reference number (REF#), may be sent to the attention of Human Resources. Applications will be reviewed until the position is filled.
Human Resources Department
Ref# ADLIB/HEJ
Essex County College
303 University Avenue
Newark, NJ 07102
Fax: (973) 877-3409
Email: j...@essex.edu
Website: www.essex.edu
Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6525/
[CODE4LIB] Job: Program Director, Academic Preservation Trust at University of Virginia
The Academic Preservation Trust (APTrust) [www.aptrust.org] is an innovative consortium committed to creation and management of a sustainable environment for digital preservation and aggregate repository services for academic and research content. Community collaboration is central to APTrust's operating philosophy. The Program Director will work closely with the APTrust Advisory Group to lead its evolving, ambitious vision and to grow the existing project into an innovative and trusted preservation solution. The Director will be responsible for building APTrust into a self-supporting, full-service entity by developing partner relationships and leveraging participation of APTrust members. APTrust currently has 12 academic research partner libraries from across the country. The University of Virginia Library is leading the incubation of APTrust's services: a repository for long-term preservation of content; a replicating node for the Digital Preservation Network (DPN); and future services that may include disaster recovery, format migration, or hosted repositories. APTrust will leverage the resources of the existing preservation community to promote collaborative solutions and best practices in digital preservation. 
Primary Responsibilities
Provide strategic vision:
- Articulate a big-picture vision for APTrust and convey its value and impact to the scholarly community and beyond
- Work closely with the APTrust advisory group and the wider partner community to identify near-term and long-term strategic goals
- Measure and evaluate outcomes
- Develop and gain support for a business model that will sustain the project following the start-up period
Lead successful operations:
- Build a dynamic and effective core team, augmented by contributors from consulting agencies and/or member institutions
- Manage projects and staff to ensure timely implementation of products and services
- Plan and manage budgets, fundraising, and business operations
- Determine outreach needs to local institutions and ensure appropriate assistance for ingesting and managing content stored in APTrust
- Provide status and financial reports to advisory group and the broader membership
Coordinate outreach and communication:
- Actively promote APTrust, DPN, and the wider cause of digital preservation to the scholarly community and other key stakeholders
- Develop effective communication mechanisms for continued engagement with member institutions and enrolling new members from the greater community
- Seek out and engage in collaborations that will leverage resources and expertise for the advancement of digital preservation
Skills and Competencies
Required:
- A Master's degree with at least 7-10 years of progressively responsible experience in higher education and/or business
- Excellent project management skills and demonstrated success managing teams working in disparate locations
- Entrepreneurial skills, especially the ability to successfully promote innovative concepts and enroll stakeholders in new solutions
- Strong ability to think and act strategically; demonstrated success at bringing concepts to realization
Preferred:
- Experience with digital preservation issues and solutions and working with libraries and/or IT organizations
- Analytical skills in crafting successful funding and business models for innovative projects
- Able to communicate effectively in person and virtually using a variety of technologies
To Apply: Complete a Candidate Profile and attach a cover letter, CV, and contact information for three professional references through Jobs@UVA (Posting #0611575). The University of Virginia is an affirmative action/equal opportunity employer committed to diversity, equity, and inclusiveness.
Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6530/
[CODE4LIB] Job: APS 6, Systems Administrator - Databases at National Library of Australia
Systems Administrator: Database, Windows, Linux, Unix/Solaris Experience
The National Library of Australia is seeking a talented and motivated Systems Administrator to work in a small team environment, supporting and maintaining the National Library's Linux- and Solaris-based servers and infrastructure, websites and business applications. The successful applicant will work with in-house developers and external vendors to install, upgrade, configure and debug Integrated Library Management Systems. The Library is a leader in the innovative use of digital library technologies to support the acquisition, preservation and dissemination of born-digital and digitised collections. Successful applicants will have the opportunity to contribute to interesting and challenging projects that build on this legacy. The advertised positions are ongoing roles in the Business Systems Support team within the Information Technology Division. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/6533/