Re: [CODE4LIB] what good books did you read in 2014?
Harun Farocki’s Nachdruck/Imprint (2001) seems worth recommending at this time. He passed this year. In honor of the CIA report, then.

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

From: Matthew Sherman matt.r.sher...@gmail.com
Reply-To: Code for Libraries CODE4LIB@LISTSERV.ND.EDU
Date: Tuesday, December 9, 2014 at 2:06 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] what good books did you read in 2014?

Nothing professional comes to mind, but here is some fun stuff in no particular order:

Books:
Skin Game by Jim Butcher - Another in the consistently great Dresden Files series. For those unfamiliar, these are urban fantasy novels that are always just a fun read.
The Broken Eye by Brent Weeks - The third in the Lightbringer series from a newer but really good fantasy author.

Comics:
Avengers vol. 5 and New Avengers vol. 3 by Jonathan Hickman - The current run on Avengers and New Avengers, both written by Jonathan Hickman, who is good at playing the long game and paying off well, as proven by his run on Fantastic Four.
Batman vol. 2 by Scott Snyder - The current run on Batman by Scott Snyder, who has consistently been a great Batman author and is currently doing a very interesting Joker story.

Movies:
Guardians of the Galaxy - Great movie, as Andromeda mentioned. As a fan of the book it was based on, I was afraid this was going to be awful and was pleasantly surprised.

TV:
The Flash - The new Flash show has been one of the most fun TV shows I have seen in quite some time; they have a very fun dynamic and surprisingly good production values.

Games:
Dragon Age: Inquisition - Another great Bioware RPG, with real payoff if you have played the previous games. Even if you haven't, it is a lot of fun and a pretty good story. Admittedly I am only part way in, but when it took the reviewers 80 hours to finish the story it is not something you will finish within the first month of getting it.

On Tue, Dec 9, 2014 at 1:34 PM, Mark Pernotto mark.perno...@gmail.com wrote:

Fun question - thanks! In no particular order:

*What If?: Serious Scientific Answers to Absurd Hypothetical Questions* by Randall Munroe - *I really enjoy the physics, as well as the absurdity.*
*Two Scoops of Django 1.6* - *based on Andromeda's recommendation - thanks! Looks like I have another Django book to read now. Really appreciate it!*
*Invincible Compendium Volume 2* by Robert Kirkman - *someone had gifted me Compendium 1 last Christmas - I just had to continue. I feel accomplished after reading such a large book*
*Wonders of Life* by Brian Cox - *I know there's a lot of hype surrounding Neil deGrasse Tyson's Cosmos, but I prefer Cox's presentation. He also did a series Wonders of the Universe and Wonders of the Solar System years ago. If you hurry, you can get the 3-series Blu-ray set for $0.12 cheaper than just Wonders of Life*

On Tue, Dec 9, 2014 at 6:47 AM, Andromeda Yelton andromeda.yel...@gmail.com wrote:

Hey, code4lib! I bet you consume fascinating media. What good books did you read in 2014 that you think your colleagues would like, too? (And hey, we're all digital, so feel free to include movies and video games and so forth.)

Mine:

http://www.obeythetestinggoat.com/ (O'Reilly book, plus read free online) - a book on testing from a Django-centric, front end perspective. *Finally* I get how testing works. This book rewrote my brain.

_The Warmth of Other Suns_ - finally got around to reading this magnum opus history of the Great Migration, am halfway through, it's amazing. If you're looking for some historical context on how we got to Ferguson, Isabel Wilkerson has you covered.

_Her_ - Imma let you finish, Citizenfour and Big Hero 6 and LEGO movie and Guardians of the Galaxy - you were all good - but I walked out of the theater and literally couldn't speak after this one. Plus, funniest throwaway scene ever. Almost fell out of my chair.

_Tim's Vermeer_ - wait, no, watch that one too. Weird tinkering genius who can't paint obsesses over recreating a Vermeer with startling, physics-driven results. Also, Penn Jillette.

-- Andromeda Yelton
Board of Directors, Library Information Technology Association: http://www.lita.org
Advisor, Ada Initiative: http://adainitiative.org
http://andromedayelton.com
@ThatAndromeda http://twitter.com/ThatAndromeda

** The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately
Re: [CODE4LIB] Requesting a Little IE Assistance
Quick answer, sorry: this might require some CSS: http://msdn.microsoft.com/en-us/library/ie/ms531186(v=vs.85).aspx

Alternatively, Notepad++? It’s not a crazy question: .txt only wins as a file format if people realize it can be read.

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

From: Matthew Sherman matt.r.sher...@gmail.com
Reply-To: Code for Libraries CODE4LIB@LISTSERV.ND.EDU
Date: Monday, October 13, 2014 at 9:59 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Requesting a Little IE Assistance

For anyone who knows Internet Explorer, is there a way to tell it to use word wrap when it displays txt files? This is an odd question, but one of my supervisors exclusively uses IE and is going to try to force me to reupload hundreds of archived permissions e-mails, currently text files, to a repository in a different, less preservable file format if I cannot tell them how to turn on word wrap. Yes, it is as crazy as it sounds. Any assistance is welcome.

Matt Sherman

** The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. ** IronMail scanned this email for viruses, vandals and malicious content. ** **
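One hedged way to act on the CSS suggestion without altering the archived .txt files is to generate a small HTML viewing copy next to each one. The sketch below is only an illustration in Python: the function names and the page template are hypothetical, and it assumes the browser honors `white-space: pre-wrap` (which, as far as I know, IE 8 and later do) and that the files are UTF-8.

```python
import html
from pathlib import Path

# Hypothetical template. The CSS asks the browser to wrap long lines
# while preserving the original line breaks and spacing.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>{title}</title>
<style>pre {{ white-space: pre-wrap; word-wrap: break-word; }}</style>
</head>
<body>
<pre>{body}</pre>
</body>
</html>
"""


def wrap_text_as_html(text, title):
    """Return an HTML page that shows `text` with word wrap enabled."""
    return PAGE.format(title=html.escape(title), body=html.escape(text))


def make_viewing_copy(txt_path):
    """Write <name>.html next to a .txt file and leave the original untouched."""
    txt_path = Path(txt_path)
    # Assumption: UTF-8 input; undecodable bytes are replaced, not dropped.
    text = txt_path.read_text(encoding="utf-8", errors="replace")
    out_path = txt_path.with_suffix(".html")
    out_path.write_text(wrap_text_as_html(text, txt_path.name), encoding="utf-8")
    return out_path
```

The .txt file stays the preservation copy; the .html file is a disposable derivative that can be regenerated at any time.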
Re: [CODE4LIB] What is the real impact of SHA-256? - Updated
I’m not sure I understand the prior comment about compression.

I agree that hashing workflows are neither simple nor secure in themselves. I agree with the implication that they can explode in scope.

From what I can tell, the state of hashing verification tools reflects substantial confusion over their utility and purpose. In some ways it’s a quixotic attempt to re-invent LOCKSS or equivalent. In other ways it’s perfectly sensible.

I think that the move to evaluate SHA-256 reflects some clear concern over tampering (as does, e.g., the history of LOCKSS itself). This is not to say that MD5 collisions (much less, substitutions) are mathematically trivial, but rather that they are now commonly contemplated. Compare Bruce Schneier’s comments about abandoning SHA-1 entirely, or computation’s reliance on Cyclic Redundancy Checks.

In many ways it’s an InfoSec consideration dropped in the middle of archival or library workflow specification.

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

From: Charles Blair c...@uchicago.edu
Organization: The University of Chicago Library
Reply-To: c...@uchicago.edu
Date: Friday, October 3, 2014 at 10:26 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] What is the real impact of SHA-256? - Updated

Look at slide 15 here: http://www.slideshare.net/DuraSpace/sds-cwebinar-1

I think we're worried about the cumulative effect over time of undetected errors (at least, I am).

On Fri, Oct 03, 2014 at 05:37:14AM -0700, Kyle Banerjee wrote:

On Thu, Oct 2, 2014 at 3:47 PM, Simon Spero sesunc...@gmail.com wrote:

Checksums can be kept separate (tripwire style). For JHU archiving, the use of MD5 would give false positives for duplicate detection. There is no reason to use a bad cryptographic hash. Use a fast hash, or use a safe hash.

I have always been puzzled why so much energy is expended on bit integrity in the library and archival communities. Hashing does not accommodate modification of internal metadata or compression, which do not compromise integrity. And if people who can access the files can also access the hashes, there is no contribution to security. Also, wholesale hashing of repositories scales poorly. My guess is that the biggest threats are staff error or rogue processes (i.e. bad programming). Any malicious destruction/modification is likely to be an inside job.

In reality, using file size alone is probably sufficient for detecting changed files -- if dup detection is desired, then hashing the few that dup out can be performed. Though if dups are an actual issue, it reflects problems elsewhere. Thrashing disks and cooking the CPU for the purposes libraries use hashes for seems way overkill, especially given that basic interaction with repositories for depositors, maintainers, and users is still in a very primitive state.

kyle

-- Charles Blair, Director, Digital Library Development Center, University of Chicago Library
1 773 702 8459 | c...@uchicago.edu | http://www.lib.uchicago.edu/~chas/
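The size-first idea Kyle describes (compare file sizes, then hash only the files whose sizes collide) can be sketched in Python roughly as follows. This is a minimal illustration rather than anyone's production workflow: the function names are hypothetical, and SHA-256 is used simply because it is the hash under discussion in this thread.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large masters need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files by size first, then hash only the files whose sizes collide."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot be a duplicate
        for path in paths:
            by_hash[sha256_of(path)].append(path)

    # Keep only groups where at least two files share a digest.
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Only files that share a size are ever read in full, which is the disk and CPU saving being argued for. Note the trade-off Charles raises: a size check alone will not notice a file whose bytes changed without its length changing.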
Re: [CODE4LIB] Archival File Storage
This is a live topic. Suggestions:

http://e-records.chrisprom.com/recommendations/
http://www.metaarchive.org/GDDP

For our CONTENTdm to MetaArchive workflow we use BagIt, and we archive the masters, not the site.

http://libraryofcongress.github.io/bagit-python/

Al

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 8/13/14, 12:40 PM, Will Martin w...@will-martin.net wrote:

As with most libraries, we're accumulating an increasing number of digital holdings. So far, our approach to storing these files consists of a haphazard cocktail of:

- A CONTENTdm site whose contents haven't been updated in three years
- live network storage in the form of shared drives
- a Drobo
- CDs and DVDs
- hard drives stored in static-proof bags, and
- ancient floppy disks whose contents remain a mystery that would surely scour the last vestiges of sanity from our minds if we had a 5 1/4-inch drive to read them with.

In short, it's a mess that has evolved organically over a long period of time. I'm not entirely sure what to do about it, especially considering our budget for improving the situation is ... uh, zero.

At the very least, I'd like a better sense for what is considered a good approach to storing archival files. Can anyone recommend any relevant best practices or standards documents? Or just share what you use. I'm familiar with the OAIS model for digital archiving, and it seems well thought-out, but highly abstract. A more practical nuts-and-bolts guide would be helpful.

Thanks.

Will Martin
Web Services Librarian
University of North Dakota
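For readers on a zero budget, the core idea behind a BagIt payload manifest (one checksum line per file, stored alongside the masters and re-checked later) can be sketched with the Python standard library alone. This is not the bagit-python API and does not produce a valid bag; the function names and the manifest location are hypothetical, and the sketch only illustrates the fixity-checking step.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(data_dir, manifest_path):
    """Record one '<checksum>  <relative path>' line per file under data_dir."""
    data_dir = Path(data_dir)
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(data_dir).as_posix()
        lines.append(f"{file_sha256(path)}  {rel}")
    Path(manifest_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def verify_manifest(data_dir, manifest_path):
    """Return the relative paths that are missing or whose checksum changed."""
    data_dir = Path(data_dir)
    problems = []
    for line in Path(manifest_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel = line.split(None, 1)
        target = data_dir / rel
        if not target.is_file() or file_sha256(target) != expected:
            problems.append(rel)
    return problems
```

Running `write_manifest` once at ingest and `verify_manifest` on a schedule gives a basic audit trail. Keeping the manifest somewhere other than the storage it describes follows the "tripwire style" separation mentioned elsewhere on this list.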
Re: [CODE4LIB] Bandwidth control
Like most things, if you want to do this, you probably can do it yourself (http://web.opalsoft.net/qos/default.php); and then Cisco, who also happen to make really big switches, get additional points for abstracting away some low-level decisions.

Traffic-shaping is a lively commercial industry at this time, not least because it dovetails with deep-packet inspection in certain use cases like, how do I retain my hold on power in Egypt or Tunisia. I don’t mean to be a bummer though.

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 8/4/14, 4:07 PM, Carol Bean beanwo...@gmail.com wrote:

Thanks, Scott. I appreciate the details. I hadn't thought of investigating firmware hacks. I have heard Cisco routers are being used to manage bandwidth, and are, as expected, a pricey solution.

Carol

On Aug 4, 2014, at 7:34 PM, Scott Fisher wrote:

I don't know about libraries, but there are some technical solutions to problems like these. One approach to reducing bandwidth may be bandwidth throttling in the router settings for the router the library uses. This limits the download/upload rates for a client or clients and may limit high resolution video viewing because the connection then could be set to throttle at a speed too slow to view some or all high-resolution streaming versions of videos in real time. This may also make it so that one user isn't hogging and saturating the internet connection and slowing the network for all other users. I've seen this kind of throttling in hotels that supply a free low speed connection that is good enough for checking email and browsing the web, but not fast enough for streaming video (they then may allow it if you pay an extra fee). There may also be ways to set daily bandwidth quotas for each client in the router settings for some routers.

Many consumer routers do not have these settings, but more expensive professional-level routers or alternative firmwares for consumer routers might have the settings. For example, DD-WRT or Tomato are custom firmwares for some routers that may allow you to configure settings like this if someone has released something for your specific brand/model of router. For example, a Tomato firmware by shibby has settings like this: http://tomato.groov.pl/wp-content/gallery/screenshots/bwlimiter.png

I don't know if that helps or is what you're looking for.

On 8/4/14, 7:20 AM, Carol Bean beanwo...@gmail.com wrote:

A quick and dirty search of the list archives turned up this topic from 5 years ago. I am wondering what libraries (especially those with limited resources) are doing today to control or moderate bandwidth, e.g., where viewing video sites uses up excessive amounts of bandwidth?

Thanks for any help,

Carol

Carol Bean
beanwo...@gmail.com
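To make the per-client throttling Scott describes a little more concrete, here is a small Python sketch of a token bucket, one common rate-limiting technique. It is offered only as an illustration of the idea: nothing here is taken from DD-WRT, Tomato, or Cisco, the class and parameter names are hypothetical, and the caller passes timestamps explicitly so the behavior is easy to reason about.

```python
class TokenBucket:
    """Allow `rate` bytes per second on average, with bursts up to `capacity` bytes."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = float(rate)          # tokens (bytes) added per second
        self.capacity = float(capacity)  # largest burst that is allowed
        self.tokens = float(capacity)    # start with a full bucket
        self.last = float(now)           # time of the last refill

    def _refill(self, now):
        """Add tokens for the time elapsed since the last call, up to capacity."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def allow(self, size, now):
        """Return True and spend tokens if `size` bytes may be sent at time `now`."""
        self._refill(now)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False
```

A shaper built this way would keep one bucket per client: a short burst (loading a web page) passes immediately, while a sustained high-rate stream keeps draining the bucket and gets held back to roughly `rate` bytes per second.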
Re: [CODE4LIB] Python in Your Library
Language is a very personal issue, and this has been discussed before; maybe search the Code4Lib archives for a nice Python thread in 2013.

But we’ve been using python3-pandas for data analysis and it’s a nice library. https://vimeo.com/59324550

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 5/7/14, 9:13 AM, Julia caffr...@simmons.edu wrote:

Hi All,

This is my first time posting to Code4Lib. Now seems like a good time. I am wondering how you have applied Python in your library. What projects have been successful? What have you heard of other libraries doing? What advantages or disadvantages does it have compared to other scripting languages used in the library field? If you have any thoughts on any of those questions, I'd love to hear from you.

Thanks,
Julia
caffr...@simmons.edu
Simmons College Library
Re: [CODE4LIB] Python in Your Library
I believe it’s via https://listserv.nd.edu/cgi-bin/wa?REPORTz=41=CODE4LIBL=CODE4LIB or else please correct me, list.

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 5/7/14, 9:36 AM, Joseph Umhauer jumha...@niagara.edu wrote:

Hi, Al,

How do you access the Code4Lib archives?

j0e

Joseph Umhauer
Assistant Library Director for Technical Services
Niagara University Library
716-286-8015
jumha...@niagara.edu
Re: [CODE4LIB] Looking for two coders to help with discoverability of videos
+1 for what I know of Avalon Media service

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 11/12/13 8:21 AM, Edward Summers e...@pobox.com wrote:

Hi Kelley,

Thanks for posting this. When I began work on jobs.code4lib.org I was hoping it would encourage people to post short term contracts. The thought being that it may be easier for some institutions to find money for projects than full-time staff, and it could encourage more open source collaboration between organizations, similar to what the Hydra Project are doing.

So, I added your post to jobs.code4lib.org [1]. Ordinarily the person who publishes a job posting is the only one who can edit it. But if you would like to make any changes to it please let me know and I’ll make you the editor.

Incidentally I was curious about your decision to hire two programmers to do what appears to be a very similar task. Was your intent to have two implementations to compare to see which you liked better? Were the two developers supposed to work together or separately?

//Ed

[1] http://jobs.code4lib.org/job/10658/

On Nov 11, 2013, at 10:58 PM, Kelley McGrath kell...@uoregon.edu wrote:

I have a small amount of money to work with and am looking for two people to help with extracting data from MARC records as described below. This is part of a larger project to develop a FRBR-based data store and discovery interface for moving images. Our previous work includes a consideration of the feasibility of the project from a cataloging perspective (http://www.olacinc.org/drupal/?q=node/27), a prototype end-user interface (https://blazing-sunset-24.heroku.com/, https://blazing-sunset-24.heroku.com/page/about) and a web form to crowdsource the parsing of movie credits (http://olac-annotator.org/#/about).

Planned work period: six months beginning around the second week of December (I can be somewhat flexible on the dates if you want to wait and start after the New Year)

Payment: flat sum of $2500 upon completion of the work

Required skills and knowledge:
* Familiarity with the MARC 21 bibliographic format
* Familiarity with Natural Language Processing concepts (or willingness to learn)
* Experience with Java, Python, and/or Ruby programming languages

Description of work: Use language and text processing tools and provided strategies to write code to extract and normalize data in existing MARC bibliographic records for moving images. Refine code based on feedback from analysis of results obtained with a sample dataset.

Data to be extracted:

Tasks for Position 1:
* Titles (including the main title of the video, uniform titles, variant titles, series titles, television program titles and titles of contents)
* Authors and titles of related works on which an adaptation is based
* Duration
* Color
* Sound vs. silent

Tasks for Position 2:
* Format (DVD, VHS, film, online, etc.)
* Original language
* Country of production
* Aspect ratio
* Flag for whether a record represents multiple works or not

We have already done some work with dates, names and roles and have a framework to work in. I have the basic logic for the data extraction processes, but expect to need some iteration to refine these strategies.

To apply please send me an email at kelleym@uoregon explaining why you are interested in this project, what relevant experience you would bring and any other reasons why I should hire you. If you have a preference for position 1 or 2, let me know (it's not necessary to have a preference). The deadline for applications is Monday, December 2, 2013. Let me know if you have any questions.

Thank you for your consideration.

Kelley

PS In the near future, I will also be looking for someone to help with work clustering based on title, name, date and identifier data from MARC records. This will not involve any direct interaction with MARC.

Kelley McGrath
Metadata Management Librarian
University of Oregon Libraries
541-346-8232
kell...@uoregon.edu
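As a rough illustration of the "extract and normalize" work described in the posting, here is a Python sketch that pulls a running time out of free-text physical description strings and normalizes it to minutes. Everything in it is an assumption made for the example: the sample strings, the patterns, and the function name are hypothetical, and the project's actual extraction strategies are not described in this thread.

```python
import re

# Hypothetical patterns for strings such as "(ca. 90 min.)" or "(1 hr., 45 min.)".
HOURS = re.compile(r"(\d+)\s*(?:hr|hour)s?\b", re.IGNORECASE)
MINUTES = re.compile(r"(\d+)\s*min", re.IGNORECASE)


def duration_minutes(text):
    """Return the running time in minutes found in `text`, or None if absent."""
    hours = HOURS.search(text)
    minutes = MINUTES.search(text)
    if not hours and not minutes:
        return None
    total = 0
    if hours:
        total += int(hours.group(1)) * 60
    if minutes:
        total += int(minutes.group(1))
    return total
```

A real dataset would need the iteration the posting anticipates: cataloging practice varies, and a regular expression this simple will misread some records (for instance, multi-part sets that list several running times).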
Re: [CODE4LIB] mass convert jpeg to pdf
Nice.

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 11/12/13 9:59 AM, Andrew Hankinson andrew.hankin...@gmail.com wrote:

Just thought I might plug some software we're developing to solve the book image navigation misery that Kyle mentions.

http://ddmal.music.mcgill.ca/diva/

and a demo:

http://ddmal.music.mcgill.ca/newdiva/demo/single.html

We developed it because we were frustrated with the image gallery paradigm for book image viewing, and wanted something more like Google Books' viewer, but with access to the highest resolution possible. We also were frustrated with having to download large PDFs to just view a couple pages.

Diva uses IIP on the back-end to serve out image tiles, so you're only ever downloading the part of the image that's viewable -- the rest is auto-loaded as the user scrolls. We've used it to display a manuscript that's ~80GB (total), with each image around 200MB.

http://coltrane.music.mcgill.ca/salzinnes/experiments/diva-cci-tif/

It's also got a couple other neat features, like in-browser brightness/contrast/rotation adjustments via canvas. (Click the little gear icon in the top left of each page image.)

Cheers,
-Andrew

On 2013-11-08, at 4:22 PM, Kyle Banerjee kyle.baner...@gmail.com wrote:

It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka?

This should be pretty easy. But the issue with tiling is that the nav process is miserable for all but the shortest books. Most of the people who want to download are looking for jpegs rather than source tiffs, and one pdf instead of a bunch of tiffs (which is good since each one is typically over 100MB). Of course there are people who want the real deal, but that's actually a much less common use case.

As Karen observes, downloading and viewing serve different use cases so of course we will provide both. IIP Image Server looks intriguing. But most of our users who want the full res stuff really just want to download the source tiffs which will be made available.

kyle
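The point Andrew makes about tiling (only the viewable part of the image is ever downloaded) comes down to working out which tiles overlap the current viewport. A minimal Python sketch of that calculation follows. It is not Diva's code or the IIP protocol: the function name, the fixed square tile size of 256 pixels, and the pixel-coordinate viewport are all assumptions for illustration.

```python
def visible_tiles(view_x, view_y, view_w, view_h, image_w, image_h, tile_size=256):
    """Return (column, row) pairs for the tiles that overlap the viewport."""
    # Clamp the viewport to the image bounds.
    left = max(0, view_x)
    top = max(0, view_y)
    right = min(image_w, view_x + view_w)
    bottom = min(image_h, view_y + view_h)
    if left >= right or top >= bottom:
        return []  # the viewport lies entirely outside the image

    # The last covered pixel is right - 1 / bottom - 1, hence the "- 1".
    first_col, last_col = left // tile_size, (right - 1) // tile_size
    first_row, last_row = top // tile_size, (bottom - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

For a 200MB page image, a browser window a few hundred pixels across touches only a handful of tiles at a time, which is why a viewer built this way can open very large manuscripts without a large download.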
Re: [CODE4LIB] Python applications for libraries
Python is a wonderful language in many respects. We use it instead of Ruby in a number of projects, most notably in workflow for Digital Preservation.

I do know of a number of enterprise developers using it in a web stack -- with Flask, with Werkzeug, with Twisted, with stuff I'm not aware of, depends on scale and whom you ask -- or else Django. We do not do so at this time.

Ruby may be more broadly applicable in the present library context, or not. Unclear.

Python has a fairly strict diction and the present split existence between 2 and 3 can be annoying. But it's a useful language, increasingly used for hosting other languages, and, increasingly, fast despite all odds. Good for toying with functional approaches.

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 10/18/13 9:14 AM, Joseph Umhauer jumha...@niagara.edu wrote:

I'm considering taking an online course for programming using Python. But I'm not sure if it would be useful in my work at an academic library.

My question is: If you are using Python, what applications have you developed for your institution?

TIA
j0e

Joseph Umhauer
Assistant Library Director for Technical Services
Niagara University Library
716-286-8015
jumha...@niagara.edu
Re: [CODE4LIB] Python applications for libraries
There's nothing wrong with Perl. Also cf. this, perhaps: https://wiki.python.org/moin/PerlPhrasebook

See also http://www.python.org/getit/windows/ ; and http://www.lfd.uci.edu/~gohlke/pythonlibs/ is a kind provision.

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 10/18/13 10:00 AM, Kaile Zhu kz...@uco.edu wrote:

Python, Python, Python. Sigh. Theoretically, programming languages should be neutral, right? Any language could do the job if the OS allows.

I used to work in a small academic library. Learning programming languages was purely self-motivated and self-taught. By chance, the path I have trodden is Perl - PHP - ASP - ASP.NET. Starting with Perl made sense when I was in library school in 1994, as it was almost a de facto Web language. Then, PHP was almost a natural extension of Perl. Then, .NET fever hit the world in the early 2000s. What on earth was Python at that time? Since it is so popular in the library world, I wish I had known about it earlier so that I could have learned it instead of other languages. The same goes for Ruby. I am jealous. With a heavy workload every day, do I have time to learn a new language?

Kelly

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Heidi P Frank
Sent: October 18, 2013 8:32
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Python applications for libraries

Hi Joe,

as a cataloger, I've used Python for working with raw MARC records - using the PyMarc library - as well as MARCXML and EADXML records. It allows me to analyze and modify large files of MARC records in batch.

cheers,
heidi

Heidi Frank
Electronic Resources Special Formats Cataloger
New York University Libraries
Knowledge Access Resources Management Services
20 Cooper Square, 3rd Floor
New York, NY 10003
212-998-2499 (office)
212-995-4366 (fax)
h...@nyu.edu
Skype: hfrank71
Re: [CODE4LIB] pdf2txt
+1 https://www.documentcloud.org/opensource

-- Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057

On 10/15/13 4:23 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote:

Eric,

You might want to consider using http://www.documentcloud.org to host your users' documents. That would also take care of privacy/authentication concerns. I know of a project in the journalism domain (http://overview.ap.org/) which does that. As far as I remember they do provide an API and do some named entity recognition as well.

Regards,
Arash

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan
Sent: 11 October 2013 18:58
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] pdf2txt

On Oct 11, 2013, at 1:49 PM, Matthew Sherman matt.r.sher...@gmail.com wrote:

For a limited period of time I am making publicly available a Web-based program called PDF2TXT -- http://bit.ly/1bJRyh8

Very slick, good work. I can see where this tool can be very helpful. It does have some issues with some characters, but this is rather common with most systems.

Again, thank you for the support. Yes, there are some escaping issues to be resolved. Release early. Release often. I need help with the graphic design in general.

Here's an enhancement I thought of:

1. allow readers to authenticate
2. allow readers to upload documents
3. documents get saved in readers' cache
4. allow interface to list documents in the cache
5. provide text mining services against reader-selected documents
6. go to Step #1

It would also be cool if I could figure out how to finish the installation of Tesseract to enable OCRing. [1]

[1] OCRing - http://serials.infomotions.com/code4lib/archive/2013/201303/1554.html

-- Eric Morgan
Re: [CODE4LIB] Python and Ruby
Functional programming FTW! On 7/30/13 10:27 AM, Mark A. Matienzo mark.matie...@gmail.com wrote: i don't know why we're not talking about Haskell - ** The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. ** IronMail scanned this email for viruses, vandals and malicious content. ** **
Re: [CODE4LIB] Sign up to present at the Code4Lib virtual lightning talks -- June 14, 2013
Seconding autumn please. Midsummers are somehow at once busy and vague. -- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057 On 6/4/13 5:48 PM, Peter Murray peter.mur...@lyrasis.org wrote: Unless there is a sudden spurt of interest in presenting at the Code4Lib Virtual Lightning Talks at the end of next week, I'm going to cancel it and propose another time in the fall. In this experiment, it might be that virtual lightning talks every 10 weeks is too close together. Feedback, as always, is welcome. Peter On May 14, 2013, at 11:00 AM, Peter Murray peter.mur...@lyrasis.org wrote: In a little less than a month I'll be hosting a Code4Lib Virtual Lightning Talks session. These are six minute talks on topics ranging from library technology to technology culture to just about anything you think the Code4Lib community would be interested in hearing. Details about how the virtual lightning talks are run and the space to sign up can be found at: http://wiki.code4lib.org/index.php/Virtual_Lightning_Talks Peter -- Peter Murray Assistant Director, Technology Services Development LYRASIS peter.mur...@lyrasis.org +1 678-235-2955 800.999.8558 x2955
Re: [CODE4LIB] BeagleBone Black, anyone?
My 2c, I like them. Use them if you want to study embedded. Processorwise they're pitched between a smartphone and an Arduino. They have onboard DSP and will play 1080 HD video without an issue if you ask nicely. If you'll pardon the distinction, you can run either as Linux or as Android. Entry experience seems to me easier than the Raspberry Pi.
Potentially useful things to know:
* Not all 5V power adapters are created equal. BB Blacks power over USB but, if you're using a USB wall wart, your cell phone charger may not do, even if it says it will.
* HDMI on the BeagleBone Black is not a full-sized HDMI but rather a micro type D.
* BeagleBone Black is brand new and most Googled info is still about the original BeagleBone.
* BeagleBone Black has no audio output hardware of which I am immediately aware. I guess you have to rely on audio over USB or HDMI.
* Raspberry Pi by the way does not implement OpenGL ES, or at least had not last time I deployed anything; surely that's dated information by now.
Applications I've heard of:
* Front-end to a NAS, streaming media server, Archivematica, e.g.
* Lots of people use them for OpenCV, so think in those terms: I don't just need an Arduino w/ sensor, I want to run some analysis on my camera-in signal, on the board.
* http://beagleboard.org/project
-- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057
On 5/20/13 11:44 AM, Roy Tennant roytenn...@gmail.com wrote: Is anyone working with a BeagleBone Black? [1] Or some other Beagleboard? In perhaps a cart-before-the-horse kind of way, I'd love to do a project with one but I'm having a hard time thinking of a really good application. So I'd be interested to hear about the kinds of things folks are doing with these. Roy [1] http://beagleboard.org/
Re: [CODE4LIB] ADVICE: Applied Computing Program at Tulane
If those are really your interests, I'd look at a strictly HCI program (they're out there). Georgia Tech has a good HCI M.S., which I believe I would recommend, and a parallel Digital Media Program, which is also strong. Both do spin out good UX and IA people. Both programs are competitive but I believe they remain funded at Masters level. As a separate observation, if you're more deeply invested in the semantic stuff, it can't hurt to spend your extra coursework in machine learning or AI. -- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057 On 4/22/13 12:05 PM, Cary Gordon listu...@chillco.com wrote: If those are really your interests, I'd look at a strictly HCI program (they're out there)
Re: [CODE4LIB] ADVICE: Applied Computing Program at Tulane
I would argue that it's intimidating to learn programming entirely on one's own. An alternative to sitting down after work with IDLE and a book, is for example https://www.coursera.org/signature/course/interactivepython/970391 I'll emphasize that this is the first pay-for coursera course that I've seen. -- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057 I learn best by getting my hands dirty with a project. See if you like it first, and see if you can't follow along with a 'how to program' guide online - this helped me: http://learnpythonthehardway.org/. The HTML version is free, you'll see immediate results, and it might give you a good idea if you like this whole 'programming' thing. On Mon, Apr 22, 2013 at 9:05 AM, Cary Gordon listu...@chillco.com wrote: If you going to become a professional programmer/developer, I suggest that you take one of the language courses (just not ASP). In the library world, XML is very useful. While we work mostly in PHP, Python, Ruby and Scala are the most interesting, but none of them are on the list. In my experience, if you have a good handle on the fundamentals of programming, picking up new languages is easy. These are tough choices, as there is only one class — ASP is dead — that I wouldn't take. What are the other two concentration options? On Mon, Apr 22, 2013 at 8:41 AM, Sean Hannan shan...@jhu.edu wrote: Honestly, if you're interested in and looking to focus on Content Strategy and UX, the only course there that comes close is Human-Computer Interaction. If those are really your interests, I'd look at a strictly HCI program (they're out there) or something that leans more towards Knowledge Management or plain old Design. 
-Sean From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Phil Suda [ps...@neworleanspubliclibrary.org] Sent: Monday, April 22, 2013 11:31 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] ADVICE: Applied Computing Program at Tulane Good morning, I have been working in public libraries since 2006, as a cataloger, collection development librarian, serials librarian, and various other roles (thinking of business card with Fixer as job title). I am very interested in Structured Data, Semantic Web, Metadata, and more importantly Content Strategy and User Experience/Interface Design. I am considering entering the Applied Computing Program at Tulane University. I have listed the courses below. What advice do the Code4Libs have with regard to Programming Courses via a University (as well as the courses below)? I really want to get into Content Strategy and User Experience Design. What advice do you have for someone that is a librarian with a pretty extensive knowledge of metadata/structured data, is interested in programming/coding as a career, and just wants to improve his lot/career? Thank you for any and all advice on the matter. 
Thanks, Phil
Major Core Courses (Credits)
CPST 1200 Fundamentals of Information Systems and Information Technology
CPST 2200 Programming Fundamentals
CPST 2300 Database Fundamentals
CPST 3600 IT Hardware and Software Fundamentals
CPST 3700 Networking Fundamentals
CPST 3900 Fundamentals of Information Security and Assurance
In addition to the major core courses above, Applied Computing majors must select 6 additional courses from one of the 3 following concentration options:
Option 1: Integrated Application Development Concentration (Credits)
Select one course:
CPST 3220 O-O Programming with Java
CPST 3230 Programming in C++
CPST 3400 Website Development with XML/XHTML
CPST 3410 Website Development with JavaScript
CPST 3430 Website Development with ASP
CPST 3310 Relational Database Design and Development
CPST 3250 Human-Computer Interaction
CPST 3550 Systems Analysis and Design
CPST 4250 Integrated Application Development
One CPST Elective (2000 level or above)
-- Cary Gordon The Cherry Hill Company http://chillco.com
Re: [CODE4LIB] ElasticSearch
In context of logstash, more or less from source but not in production. That's mostly a +1 to the idea though. Interested to hear thoughts. -- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057 On 3/14/13 2:46 PM, Cary Gordon listu...@chillco.com wrote: Anyone using it? Thanks, Cary -- Cary Gordon The Cherry Hill Company http://chillco.com
Re: [CODE4LIB] XML Parsing and Python
Hello Mike, I realize minidom is a pure python library, but I wonder if elementtree isn't preferred here since you're already using lxml? I think the latter must be based on the former. Or for a bit of a snark, try, e.g. http://blog.ianbicking.org/2008/03/30/python-html-parser-performance/ .. Bicking: I don't recommend using minidom for anything. -- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057 On 3/7/13 10:49 AM, Michael Beccaria mbecca...@paulsmiths.edu wrote: I ended up doing a regular expression find and replace function to replace all illegal xml characters with a dash or something. I was more disappointed in the fact that on the xml creation end, minidom was able to create non-compliant xml files. I assumed that if minidom could make it, it would be compliant but that doesn't seem to be the case. Now I have to add a find and replace function on the creation side to avoid this issue in the future. Good learning experience I guess. Thanks for all your suggestions. Mike Beccaria Systems Librarian Head of Digital Initiative Paul Smith's College 518.327.6376 mbecca...@paulsmiths.edu Become a friend of Paul Smith's Library on Facebook today! -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Chris Beer Sent: Tuesday, March 05, 2013 1:48 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] XML Parsing and Python I'll note that 0x is a UTF-8 non-character, and these noncharacters should never be included in text interchange between implementations. [1] I assume the OCR engine maybe using 0x when it can't recognize a character? So, it's not wrong for a parser to complain (or, not complain) about 0x, and you can just scrub the string like Jon suggests. 
Chris [1] http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters#Noncharacters

On 5 Mar, 2013, at 9:16, Jon Stroop jstr...@princeton.edu wrote: Mike, I haven't used minidom extensively but my guess is that doc.toprettyxml(indent=" ", encoding="utf-8") isn't actually changing the encoding because it can't parse the string in your content variable. I'm surprised that you're not getting tossed a UnicodeError, but the docs for Node.toxml() [1] might shed some light: "To avoid UnicodeError exceptions in case of unrepresentable text data, the encoding argument should be specified as "utf-8"." So what happens if you're not explicit about the encoding, i.e. just doc.toprettyxml()? This would hopefully at least move your exception to a more appropriate place. In any case, one solution would be to scrub the string in your content variable to get rid of the invalid characters (hopefully they're insignificant). Maybe something like this:

    def unicode_filter(char):
        try:
            unicode(char, encoding='utf-8', errors='strict')
            return char
        except UnicodeDecodeError:
            return ''

    content = 'abc\xFF'
    content = ''.join(map(unicode_filter, content))
    print content

Not really my area of expertise, but maybe worth a shot -Jon 1. http://docs.python.org/2/library/xml.dom.minidom.html#xml.dom.minidom.Node.toxml -- Jon Stroop Digital Initiatives Programmer/Analyst Princeton University Library jstr...@princeton.edu

On 03/04/2013 03:00 PM, Michael Beccaria wrote: I'm working on a project that takes the ocr data found in a pdf and places it in a custom xml file. I use Python scripts to create the xml file. Something like this (trimmed down a bit):

    from xml.dom.minidom import Document
    doc = Document()
    Page = doc.createElement("Page")
    doc.appendChild(Page)
    f = StringIO(txt)
    lines = f.readlines()
    for line in lines:
        word = doc.createElement("String")
        ...
        word.setAttribute("CONTENT", content)
        Page.appendChild(word)
    return doc.toprettyxml(indent=" ", encoding="utf-8")

This creates a file, simply, that looks like this:

    <?xml version="1.0" encoding="utf-8"?>
    <Page HEIGHT="3296" WIDTH="2609">
     <String CONTENT="BuffaloLaunch" />
     <String CONTENT="Club" />
     <String CONTENT="Offices" />
     <String CONTENT="Installed" />
     ...
    </Page>

I am able to get this document to be created ok and saved to an xml file. The problem occurs when I try and have it read using the lxml library:

    from lxml import etree
    doc = etree.parse(filename)

I am running across errors like XMLSyntaxError: Char 0x out of allowed range, line 94, column 19. Which when I look at the file, is true. There is a 0X character in the content field. How is a file able to be created using minidom (which I assume would create a valid xml file) and then failing when parsing with lxml? What should I do to fix this on the encoding side so that errors don't show up on the parsing side? Thanks, Mike How is the Mike Beccaria Systems Librarian Head of Digital Initiative Paul Smith's College 518.327.6376 mbecca
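For anyone reading this thread later, the "scrub on the creation side" idea can be sketched roughly as below. This is a minimal sketch in Python 3 (the thread itself is Python 2), and the helper names scrub_xml_text and build_page are made up for illustration; they are not from Mike's or Jon's code. It assumes the goal is simply to drop every code point that falls outside the XML 1.0 Char production before the text reaches minidom, since minidom does not check characters when it serializes.

```python
import re
from xml.dom.minidom import Document, parseString

# Code points outside the XML 1.0 "Char" production: most C0 controls,
# UTF-16 surrogates, and the noncharacters U+FFFE and U+FFFF.
ILLEGAL_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def scrub_xml_text(text, replacement=""):
    """Return text with characters that XML 1.0 forbids replaced."""
    return ILLEGAL_XML_CHARS.sub(replacement, text)


def build_page(words):
    """Build a small Page/String document shaped like the one in the thread."""
    doc = Document()
    page = doc.createElement("Page")
    doc.appendChild(page)
    for content in words:
        word = doc.createElement("String")
        # Scrub on the way in, because minidom will not validate for us.
        word.setAttribute("CONTENT", scrub_xml_text(content))
        page.appendChild(word)
    # With an explicit encoding, toprettyxml returns bytes.
    return doc.toprettyxml(indent="  ", encoding="utf-8")


if __name__ == "__main__":
    xml_bytes = build_page(["Buffalo\uffffLaunch", "Club\x0c", "Offices"])
    parseString(xml_bytes)  # parses once the forbidden characters are gone
    print(xml_bytes.decode("utf-8"))
```

Passing replacement="-" would mirror the "replace with a dash" approach Mike describes in his follow-up. Whether dropping or replacing is right depends on how much the OCR noise matters downstream.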
Re: [CODE4LIB] Math or the other math?
+1 mostly to the thread Programming seems to me -- just me here -- stratified like any other profession, in particular by access or lack of access to computer science within software dev. There are other factors. But computer science seems now heavily invested in math. -- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057 On 2/27/13 9:17 AM, Michael Hopwood mich...@editeur.org wrote: You mean discrete mathematics? http://en.wikipedia.org/wiki/Discrete_mathematics I always kicked myself for not taking that course at high school (UK readers, I mean secondary school) but at least I picked up the basics during my physics MSci (a lot of physics these days is coding). Cheers, m -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ken Irwin Sent: 27 February 2013 13:53 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] back to minorities question, seeking guidance What both Kelly and David say is true here: David: programming needs math, not arithmetic. Kelly: computers are good at arithmetic on their own. To which I'll add: the related skill that I see as necessary here is quantitative reasoning - not the crunching of numbers but the correct assembly of the formulae, articulating the systematization of the problem. What I'm less certain of is what sort of training tend to lead to that sort of conceptual skill. Ken On Feb 27, 2013, at 8:44 AM, David Faler dfa...@tlcdelivers.com wrote: I think math is essential, but what they teach in schools these days isn't math. It's arithmetic. Some intro philosophy courses teach math. I'll stop before I start ranting. 
On Wed, Feb 27, 2013 at 12:04 AM, Kelly Lucas klu...@isovera.com wrote: On Sat, Feb 23, 2013 at 2:57 AM, Thomas Krichel kric...@openlib.org wrote: Wilhelmina Randtke writes Pretty much the whole entire entry level programming class for the average class covers using code to do things that you can do much more easily without code. Probably it was the wrong course. I think coding should start with building web pages. A calculator can't do that. Cheers, Thomas Krichel http://openlib.org/home/krichel http://authorprofile.org/pkr1 skype: thomaskrichel -- Kelly R. Lucas Senior Developer Isovera, Inc. klu...@isovera.com http://www.isovera.com http://drupal.org/user/271780 twitter: @bp1101
Re: [CODE4LIB] back to minorities question, seeking guidance
Christina George, hello! and welcome. WR, idly, I wonder whether this intro to programming but-not-for-programmers course might be taught by an underqualified or overworked adjunct or grad student slave, or if not, whether instead by a bored research professor. It doesn't sound like fun. Sympathy. Greetings to all 2292 recipients. -- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057 On 2/27/13 11:11 AM, George, Christina Rose georg...@umsystem.edu wrote: I think Wilhelmina has touched on an very important point that, for some, in order to learn--or want to learn--something, the material has to be relevant to them. Some folks can get through the boring, calculators can do this parts of because they anticipate the long-term benefit while others learn more effectively if the material helps them achieve a goal they already have or a goal that is within their area of expertise or interest. Christina George (Hi! I'm new to this listserv) -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Wilhelmina Randtke Sent: Wednesday, February 27, 2013 8:47 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] back to minorities question, seeking guidance Probably it was the wrong course. I think coding should start with building web pages. A calculator can't do that. HTML is called markup language, but does anyone here really think it's a programming language? Even though is gets more complicated over time, it pretty much doesn't have variables or do interactive things, and is for displaying things, not manipulating things. My point about math and programming is that the curriculum for the average intro programming class appears to have been developed circa 1972 and never tweaked. I'm in Programming for Engineers right now, which is the prerequisite for the classes that looked useful. 
So far we have written lots of small programs to add numbers, find modulos, make a simple loop. All this would have been exciting before calculators. But, yeah, we have calculators now. And, actually, we had calculators before we had widespread access to affordable computers. Writing a page-long program to add some numbers makes no sense. It's probably the least efficient way to solve the problem. Nothing about the coursework shows computers as useful at solving problems. Everything about the coursework shows computers as clunky, inefficient, difficult-to-use calculators. And... here is something we haven't done... We have not yet called a function from inside a function. So, the whole object-oriented thing has not yet appeared, and it's past midterm time. From having looked at a bunch of syllabi online for different intro level programming classes, I think my experiences are the norm. The intro classes cover things you can do more easily without coding. This type of curriculum is off-putting to at least some people. It also isn't necessary. I think it's possible to design a curriculum where students could have something to show that would be worthwhile now, as opposed to worthwhile in 1972 when adding many numbers at once was a big deal. -Wilhelmina Randtke On Sat, Feb 23, 2013 at 1:57 AM, Thomas Krichel kric...@openlib.org wrote: Wilhelmina Randtke writes Pretty much the whole entire entry level programming class for the average class covers using code to do things that you can do much more easily without code. Probably it was the wrong course. I think coding should start with building web pages. A calculator can't do that. Cheers, Thomas Krichel http://openlib.org/home/krichel http://authorprofile.org/pkr1 skype: thomaskrichel
Re: [CODE4LIB] Digital collection backups
We use LOCKSS as part of MetaArchive. LOCKSS as I understand it is typically spec'd for consumer hardware, and so, presumably as a result of SE Asia flooding, there have been some drive failures and cache downtimes and adjustments accordingly. However, that is the worst of it, first. LOCKSS is to some, perhaps even considerable, degree tamper-resistant, since it relies on mechanisms of collective polling among multiple copies to preserve integrity. This, as opposed to static checksums or some other solution. As such, it seems to me important to run a LOCKSS box with other LOCKSS boxes; MA cooperative specifies six or so distributed locations for each cache. The economic sustainability of such an enterprise is a valid question. David S H Rosenthal at Stanford seems to lead the charge for this research. e.g. http://blog.dshr.org/2012/08/amazons-announcement-of-glacier.html#more I've heard mention from other players that they watch MA carefully for such sustainability considerations, especially because MA uses LOCKSS for non-journal content. In some sense this may extend LOCKSS beyond its original design. MetaArchive has in my opinion been extremely responsible in designating succession scenarios and disaster recovery scenarios, going so far as to fund, develop and test services for migration out of the system, into an iRODS repository in the initial case. Al Matthews AUC Robert W. Woodruff Library On 1/11/13 9:10 AM, Joshua Welker jwel...@sbuniv.edu wrote: Good point. But since campus IT will be creating regular disaster-recovery backups, the odds that we'd ever need to retrieve more than a handful of files from Glacier at a time are pretty low.
Josh Welker -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Gary McGath Sent: Friday, January 11, 2013 8:03 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Digital collection backups Concerns have been raised about how expensive Glacier gets if you need to recover a lot of files in a short time period. http://www.wired.com/wiredenterprise/2012/08/glacier/ On 1/10/13 5:56 PM, Roy Tennant wrote: I'd also take a look at Amazon Glacier. Recently I parked about 50GB of data files in logical tar'd and gzip'd chunks and it's costing my employer less than 50 cents/month. Glacier, however, is best for park it and forget kinds of needs, as the real cost is in data flow. Storage is cheap, but must be considered offline or near line as you must first request to retrieve a file, wait for about a day, and then retrieve the file. And you're charged more for the download throughput than just about anything. I'm using a Unix client to handle all of the heavy lifting of uploading and downloading, as Glacier is meant to be used via an API rather than a web client.[1] If anyone is interested, I have local documentation on usage that I could probably genericize. And yes, I did round-trip a file to make sure it functioned as advertised. Roy [1] https://github.com/vsespb/mt-aws-glacier On Thu, Jan 10, 2013 at 2:29 PM, ddwigg...@historicnewengland.org wrote: We built our own solution for this by creating a plugin that works with our digital asset management system (ResourceSpace) to invidually back up files to Amazon S3. Because S3 is replicated to multiple data centers, this provides a fairly high level of redundancy. And because it's an object-based web service, we can access any given object individually by using a URL related to the original storage URL within our system. This also allows us to take advantage of S3 for images on our website. 
All of the images in our online collections database are being served straight from S3, which diverts the load from our public web server. When we launch zoomable images later this year, all of the tiles will also be generated locally in the DAM and then served to the public via the mirrored copy in S3. The current pricing is around $0.08/GB/month for 1-50 TB, which I think is fairly reasonable for what we're getting. They just dropped the price substantially a few months ago. DuraCloud http://www.duracloud.org/ supposedly offers a way to add another abstraction layer so you can build something like this that is portable between different cloud storage providers. But I haven't really looked into this as of yet. -- Gary McGath, Professional Software Developer http://www.garymcgath.com
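Roy's "tar'd and gzip'd chunks" plus "round-trip a file" routine could look something like the following. It is only a sketch of the local half of that workflow, assuming Python 3 and the standard library; the function names make_chunk and verify_round_trip are invented here, and nothing in it talks to Glacier or S3 (the upload and download steps are left to whatever client you use, such as the one Roy links).

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in blocks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def make_chunk(source_dir, archive_path):
    """Bundle source_dir into one gzip'd tar; return {relative name: sha256}."""
    source_dir = Path(source_dir)
    manifest = {
        p.relative_to(source_dir).as_posix(): sha256_of(p)
        for p in sorted(source_dir.rglob("*"))
        if p.is_file()
    }
    with tarfile.open(archive_path, "w:gz") as archive:
        for name in manifest:
            archive.add(source_dir / name, arcname=name)
    return manifest


def verify_round_trip(archive_path, restore_dir, manifest):
    """Unpack a retrieved archive; return names that are missing or altered."""
    restore_dir = Path(restore_dir)
    # Only unpack archives you created yourself; extractall trusts member paths.
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(restore_dir)
    return [
        name
        for name, expected in manifest.items()
        if not (restore_dir / name).is_file()
        or sha256_of(restore_dir / name) != expected
    ]
```

Keeping the manifest somewhere other than the archive itself is the point of the exercise: an empty list from verify_round_trip after a retrieval is the "functioned as advertised" check.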
Re: [CODE4LIB] Digital collection backups
http://metaarchive.org/costs in our case. Interested to hear other experiences. Al On 1/11/13 10:01 AM, Joshua Welker jwel...@sbuniv.edu wrote: Thanks, Al. I think we'd join a LOCKSS network rather than run multiple LOCKSS boxes ourselves. Does anyone have any experience with one of those, like the LOCKSS Global Alliance? Josh Welker
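To make the "collective polling among multiple copies" idea from this thread a little more concrete, here is a toy illustration. It is emphatically not the LOCKSS protocol, which is considerably more involved; it is a sketch, assuming each cache can hash its own copy and that a simple majority decides which version is good. The function name poll_replicas and the cache names are hypothetical.

```python
import hashlib
from collections import Counter


def poll_replicas(replicas):
    """Given {cache name: content bytes}, return (winning digest, dissenters).

    Each cache "votes" with the SHA-256 of its own copy. Caches whose vote
    differs from the majority are reported as candidates for repair.
    """
    digests = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in replicas.items()
    }
    tally = Counter(digests.values())
    winner, votes = tally.most_common(1)[0]
    if votes <= len(replicas) / 2:
        # Without a strict majority there is no basis for choosing a copy.
        raise ValueError("no majority among replicas")
    dissenters = sorted(name for name, d in digests.items() if d != winner)
    return winner, dissenters


if __name__ == "__main__":
    caches = {f"cache-{c}": b"preserved content" for c in "abcde"}
    caches["cache-f"] = b"preserved c0ntent"  # one damaged copy
    digest, damaged = poll_replicas(caches)
    print(digest, damaged)
```

The contrast with a static checksum is that no single stored digest has to be trusted; the agreement among copies is the reference, which is why running a box alongside several others matters.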
[CODE4LIB] Google Indoor Mapping
Hello list. I hope this finds you well, dry, and with some power. I've recently become aware of the existence of Google Indoor Mapping which, obviously enough, brings indoor locations to Google Maps (version 6.x and higher). The project also offers indoor walking directions. I assume this works via a combination of fine-grained GPS and some sort of integration with internal wireless. Since a number of you will have had experience with this service, I am soliciting in open forum a discussion of pros, cons, and concerns. Thank you. Al Matthews, Software Dev, Atlanta University Center - ** The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email in error please notify the system manager or the sender immediately and do not disclose the contents to anyone or make copies. ** IronMail scanned this email for viruses, vandals and malicious content. ** **
Re: [CODE4LIB] Oral history app and server
Hi all. Thanks Jason for the excellent links. Chrome seems to be out front with this last I looked. After somehow spending an hour reading all this, it seems like audio doesn't work yet, right? Except on Chromium canary on Mac. Which is something. Mozilla's also big into this http://mozillapopcorn.org/ https://wiki.mozilla.org/Audio_Data_API . The latter remains Firefox-specific and Mozilla marks it as deprecated. Still, it exists. Android has a speech API http://android-developers.blogspot.com/2010/03/speech-input-api-for-android.html, and implements Media Capture it seems. As a fine alternative, and more general, http://cmusphinx.sourceforge.net/wiki/gstreamer seems like a sane postprocessed example. Dear to me, that last. But doesn't one simplify all this by keeping recording off the cloud and building out the separate components? Record ; send ; speech-to-text ; share and improve . I do like this, Paul, the idea. Al Matthews, Software Dev, Atlanta University Center From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jason Ronallo [jrona...@gmail.com] Sent: Wednesday, October 03, 2012 2:00 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Oral history app and server Paul, You may want to look at WebRTC: http://www.webrtc.org/ Especially getUserMedia which allows for video capture within the browser from a user's webcam: http://www.html5rocks.com/en/tutorials/getusermedia/intro/ This is bleeding edge stuff and probably not ready for a real project, but it may be that something like this enables the kind of project you're wanting to do. Chrome seems to be out front with this last I looked. Jason On Tue, Oct 2, 2012 at 8:44 AM, Paul Orkiszewski orkiszews...@appstate.edu wrote: Hi 4libers, Does anyone know of something - a kiosk, an iPad app, a web application - that: - Initiates an oral history interview by getting demographic info and permission to use and stream for scholarly purposes. 
Re: [CODE4LIB] Oral history app and server
Yes. Or else it's a machine learning problem at far side, with speakers organized by, I dunno, geography. Regardless, the models will need training. Al Matthews, AUC Robert W. Woodruff Library 404.978.2057 o 404.769.2617 c - Reply message - From: Gary McGath develo...@mcgath.com To: CODE4LIB@LISTSERV.ND.EDU CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Oral history app and server Date: Wed, Oct 3, 2012 5:06 pm Continuing on this part: My friend says that using any existing speech recognition software won't work at all well for transcribing interviews with a variety of people. All such software needs to be trained to the speaker's voice. A possible alternative is for a designated person to train the software and re-speak it into the speech recognition software. On 10/3/12 6:22 AM, Gary McGath wrote: On 10/2/12 8:44 AM, Paul Orkiszewski wrote: - Processes the audio through speech recognition either in real time or post-interview, and populates the dbase record with rendered text (at whatever level of accuracy) You could do this piece with Dragon; see this post for some discussion: http://www.nuance.com/dragon/transcription-solutions/index.htm A friend of mine is an expert in this area and might be able to answer some questions. -- Gary McGath, Professional Software Developer http://www.garymcgath.com
Re: [CODE4LIB] SEC4LIB or Hack, Crack, and Frakk breakout sessions
On this issue, the following paper may be of interest. It contemplates an orderly trade in exploits: http://securityevaluators.com/files/papers/0daymarket.pdf . Thank you, Al Matthews, Software Dev, Atlanta University Center From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Peter Murray [peter.mur...@lyrasis.org] Sent: Friday, April 20, 2012 1:47 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] SEC4LIB or Hack, Crack, and Frakk breakout sessions I remember the related discussion from last month (http://serials.infomotions.com/code4lib/archive/2012/201203/thread.html#777) -- and kudos for bringing it up again -- and I find I'm still of mixed feelings about it. Security is an important aspect of software development, no argument, but I wonder if there is something separate or distinct for libraries about the topic. What I do wonder about, though, is if there is a role for a generic-to-libraries security incident response team that would responsibly take in reports of security problems, work with vendors and/or software developers, and publish outcomes. I could see a need for such a team that was respected in our field and had contacts with people from the vendor community and FOSS projects. Peter On Apr 20, 2012, at 12:35 PM, Erin Germ wrote: At IUG I talked to a few people about security of library services and applications. Becky had mentioned doing a breakout session to discuss security at the next IUG or conference. Would anyone be interested in helping plan a breakout session and discussing security of library services and applications? A recent presentation led me to believe it would also be of great value to have a set of good practices that are very accessible to those who do not have a security, or even IT, background. Or would anyone be interested in forming an informal SEC4LIB discussion group? This would be an informal group to discuss existing security features and shortcomings of library services and applications. 
Ideally this would include a blend of high and low level skills and knowledge. I am personally interested in documenting known and patched vulnerabilities of current and past library software and services. -- Peter Murray Assistant Director, Technology Services Development LYRASIS peter.mur...@lyrasis.org +1 678-235-2955 1438 West Peachtree Street NW Suite 200 Atlanta, GA 30309 Toll Free: 800.999.8558 Fax: 404.892.7879 www.lyrasis.org LYRASIS: Great Libraries. Strong Communities. Innovative Answers.
Re: [CODE4LIB] Interest in Toronto/GTA Meetup?
Yes, Al Matthews, Software Dev, Atlanta University Center From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Joselito Dela Cruz [jdelac...@hodges.edu] Sent: Friday, April 20, 2012 2:06 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Interest in Toronto/GTA Meetup? A Florida Meetup would be nice as well. Thanks, Jay Dela Cruz, MLIS Electronic Resources Librarian Hodges University | 2655 Northbrooke Drive, Naples, FL 34119-7932 (239) 598-6211 | (800) 466-8017 x 6211 | f. (239) 598-6250 jdelac...@hodges.edu | www.hodges.edu -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cynthia Ng Sent: Friday, April 20, 2012 1:50 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Interest in Toronto/GTA Meetup? Hi All, In light of seeing some of the other meetups going on, I thought cool, reminds me of the Web 2.0 meetups I used to have in Ottawa, I wondered why I hadn't heard of one in Toronto. I've been told there isn't one! However, before trying to organize one, I was wondering if there was interest in having a Toronto Meetup? Would be interested in what others think. -Cynthia
Re: [CODE4LIB] Archivists' Toolkit: Adding Digital Objects via MySQL
Hi. Is there a reason not to attempt this instead through the CLI? Al Matthews, Software Dev, Atlanta University Center From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Rosalyn Metz [rosalynm...@gmail.com] Sent: Wednesday, April 18, 2012 9:23 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Archivists' Toolkit: Adding Digital Objects via MySQL Hi Everyone, I posted this over on the Archivists' Toolkit listserv and got no response (yet), so I thought I might try here as well. I have a large quantity (around 300+) of digital objects that I need to add to Archivists' Toolkit. I think I've figured out what queries I need to run in order to do this in MySQL (rather than the interface) but I wanted to get opinions from the peanut gallery before trying it out on my test instance. It seems that there are actually two update queries that need to be used when creating a Digital Object. They are: insert into ArchDescriptionInstances (instanceType, resourceComponentId, resourceId, parentResourceId, instanceDescriminator, archDescriptionInstancesId) values ('Digital object', 336673, null, 543, 'digital', 22567003) and... insert into DigitalObjects (version, lastUpdated, created, lastUpdatedBy, createdBy, title, dateExpression, dateBegin, dateEnd, languageCode, restrictionsApply, eadDaoActuate, eadDaoShow, metsIdentifier, objectType, label, objectOrder, componentId, parentDigitalObjectId, archDescriptionInstancesId, repositoryId) values (0, '2012-04-17 12:05:15', '2012-04-17 12:05:15', 'username', 'username', 'title', '1938-1959', null, null, '', 0, 'onRequest', 'new', '678.1829', 'text', '', 0, '', null, 22567003, 1) There also appears to be some update queries as well, but I'm guessing that they are less important (please correct me if I'm wrong). Has anyone tried to do this in the past? If so do you have scripts that will create Digital Objects for you that you wouldn't mind sharing? 
Is there anything you think I should know before testing this out in my test instance of AT? Any caveats for me? Any help anyone can provide would be greatly appreciated. Thanks, Rosalyn
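One caveat that applies to any two-step insert like the one Rosalyn describes: both rows share an archDescriptionInstancesId, so they should succeed or fail together. Below is a minimal sketch of that idea, not Archivists' Toolkit's actual schema or tooling. It uses Python's built-in sqlite3 as a stand-in for MySQL, keeps only a handful of the columns named in the message, and adds a UNIQUE constraint on metsIdentifier purely so the demonstration has a way to fail. The helper names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in with a REDUCED, assumed schema (not AT's real one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ArchDescriptionInstances ("
        " archDescriptionInstancesId INTEGER PRIMARY KEY,"
        " instanceType TEXT, resourceComponentId INTEGER,"
        " parentResourceId INTEGER, instanceDescriminator TEXT)"
    )
    conn.execute(
        "CREATE TABLE DigitalObjects ("
        " digitalObjectId INTEGER PRIMARY KEY,"
        " title TEXT, dateExpression TEXT,"
        " metsIdentifier TEXT UNIQUE,"  # ASSUMPTION: uniqueness added for the demo
        " archDescriptionInstancesId INTEGER)"
    )
    return conn


def add_digital_object(conn, instance_id, component_id, parent_resource_id,
                       title, date_expression, mets_identifier):
    """Insert the instance row and the digital object row as one transaction."""
    # "with conn" commits if both inserts succeed and rolls back if either raises.
    with conn:
        conn.execute(
            "INSERT INTO ArchDescriptionInstances"
            " (archDescriptionInstancesId, instanceType, resourceComponentId,"
            "  parentResourceId, instanceDescriminator)"
            " VALUES (?, 'Digital object', ?, ?, 'digital')",
            (instance_id, component_id, parent_resource_id),
        )
        conn.execute(
            "INSERT INTO DigitalObjects"
            " (title, dateExpression, metsIdentifier, archDescriptionInstancesId)"
            " VALUES (?, ?, ?, ?)",
            (title, date_expression, mets_identifier, instance_id),
        )


if __name__ == "__main__":
    db = make_demo_db()
    # Values taken from the example queries in the message above.
    add_digital_object(db, 22567003, 336673, 543, "title", "1938-1959", "678.1829")
    print(db.execute("SELECT COUNT(*) FROM DigitalObjects").fetchone()[0])
```

Parameterized queries also make it straightforward to loop over a spreadsheet of 300+ objects without hand-editing SQL, though how the real application allocates ids and version fields would need to be checked against a test instance first.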
Re: [CODE4LIB] old stuff
That seems to me an excellent answer, especially since my question was too broadly set. Thank you. I think what still bothers me is that it requires a trip to ebay, or a vm or two, and some maybe not-quite-trivial forensics generally, to establish whether there is worthwhile data on a disk (or magnetic reel, whatever) for starters. Archives are already in perpetual backlog, and based on some past work I'd say only a leading subset of these have sufficiently technical staff. I'm surprised that hardware-sharing hasn't emerged as an initiative (assuming it already takes place as a service). Thank you, -- Al Matthews, Software Dev, Atlanta University Center -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of David Uspal Sent: Tuesday, March 27, 2012 5:53 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] old stuff Al, I'm not an archivist by trade, but I had some thoughts on the subject, (and the person who sits behind me is, so I bounced my ideas off her to make sure I'm not talking inanities). Anyway, here goes: I think when people look into archiving/storing digital media, they look at it as one question -- is it worthwhile to save/catalog/store this item? To me though, there are really two completely separate questions being asked here: 1.) Is the data on the disk unique or special in a way that makes the data itself (i.e., the ones and zeros) valuable? 2.) Is the physical object itself unique or special in any way (including it being a unique copy, marginalia, notable owner, etc) that makes the physical object valuable or makes the item an objet d'art? 2a.) 
As part of two, if the object itself is not unique or special, is it part of a larger collection or set that is unique or special (a complete collection of first print Sierra games, a disk used in a Cray that was used in some big scientific discovery, etc)? Answering yes to one of these will probably incur a completely different response than if yes was answered to the other. Some generic examples: 1.) I have a 5 1/4 with some of my old high school papers on them. In terms of data value, because it's the only copy of these items, the value of the data is high. Since the disks are generic floppies without significant markings, I'd value the worth of the physical object as low. Therefore, the best bet would be to transfer the data off using an old 5 1/4 drive and put the data into a more long-term archivable solution (cloud storage, solid-state drive, etc). You can see how this example can be used on university or corporate archival materials -- the physical object has much less worth than the data contained therein. 2.) I have a first edition copy of Zork I on 5 1/4 disk (may even have box/instructions/box fluff). Here, the data on the disk is of low value -- there are copies of Zork I all over the internet and I can essentially download a copy to my hard drive for free (or even play on my browser if I so choose). On the other hand, it's an original copy of Zork I with box/fluff, so the value lies not in the data but the physical object itself. In this example, I would store the disk as per best practices (good tips found here: http://dlis.dos.state.fl.us/archives/preservation/magnetic/index.cfm). 3.) I have a copy of a Final Fantasy cartridge for the original Nintendo. Again, you can get the data pretty readily from a large pool of resources, so the data itself is of little value. Final Fantasy carts are pretty common too, so the value of the object itself is pretty low. 
On the other hand, the cart is part of a complete collection of Nintendo cartridges and licensed merchandise, so the value in this object now lies in the fact that it exists within a collection, and has value due to that collection. (Plus, it's always better to play a game on the original machine than play it on your Android, loading screen times notwithstanding...) A similar example would be blank punchcards for an old Sinclair ZX81 -- the cards themselves don't have value, but added to the Sinclair as a complete package they suddenly do. Other items from your post: Hardware: eBay is your best friend. You can rebuild your Tandy 1000 from parts on eBay. You can buy a complete and whole Tandy 1000 on eBay. I buy used car parts all the time on eBay to keep my junkers running, same principle can be applied to most old machines (fun fact: you can still buy parts for a DMC DeLorean on eBay). The only area you'll get stuck is if it's media for a machine that is REALLY old (much like parts for a very very old car). Software/Emulation: for examples that fall under 1, the good news is a majority of this material will usually be readable/obtainable since emulators for most old machine types already exist, and are almost always free (I just fired up my C64 emulator the other day). The most frequent snag I hear
[CODE4LIB] old stuff
Hello. I have a local question that I will assume to be general: how do those of you involved in special collections and the like - especially in the event that those collections are born digital and perhaps not entirely recent - deal with issues of evaluation of digital assets? One difficult example might be: sharing or procuring a specific kind of technical resource (where an extreme case might be, a 3.5 or 5.25 disk - or suppose it's DOS-era magnetic media, for an alternate challenge) among institutions who aren't prepared to amass collections of such. To me this touches on hardware, software, emulation, expertise and budget issues all at once. Thoughts? Thanks, -- Al Matthews, Software Dev, Atlanta University Center
Re: [CODE4LIB] Job: Web Developer Ninja at Springshare
Oui oui! only slackers outside the arrondissements. -- Al Matthews, Software Dev, Atlanta University Center, Atlanta -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Chris Fitzpatrick Sent: Wednesday, March 21, 2012 12:01 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Job: Web Developer Ninja at Springshare I figured it was in Paris since that's where all the ninjas seem to be these days. On Wed, Mar 21, 2012 at 4:46 PM, Lisa H Kurt lk...@unr.edu wrote: Cary, It looks like this is a telecommuting job- location would be anywhere: * Working from home (yes, you heard it right, though slackers need not apply - see the point above about needing to be a self-starter and self-motivator) On 3/21/12 6:49 AM, Cary Gordon listu...@chillco.com wrote: It would be great if job listings could include location, particularly where the work is to be performed onsite. Thanks, Cary On Tue, Mar 20, 2012 at 2:02 PM, j...@code4lib.org wrote: Howdy, code4lib-ers! Springshare ([http://springshare.com](http://springshare.com)) is looking for web developers with mad skills and thirst for innovation. We create web tools that libraries love, and we need your help to carry out our mission of creating awesome web software and providing even awesome-r service to our libraries. This is what we'd need from you: * LAMP skills of the ninja caliber, including: * 3+ years PHP / MySQL experience * Unix / Apache skills * Experience in scaling web infrastructure * Front-end JS programming experience (e.g. jQuery or dojo) * Bonus: worked with Nginx, Mobile tech, or Solr? Experience with any of these is a plus. Worked with all three? Where have you been all our lives?? * You need to be a self-starter and self-motivating type. 
We work in a typical startup fashion so you'll be wearing many hats and doing a lot of things - at once - hence having great organizational and multitasking skills is essential. In a typical week, you'll: * Create front- and back-end interfaces for new or existing products, letting your creative juices run free * Work with our partners (other library-centric companies) to integrate their tools with Springshare and vice versa * Dream up new ideas that will rock the library (software) world * Every one of us (including our CEO himself) also helps with support and making sure our customers' needs are taken care of, so you'll be talking with our customers regularly, troubleshooting bug fixes and such We offer: * Great pay and benefits (health, dental, 401K, etc.) * Very flexible vacations/time off policy * Working from home (yes, you heard it right, though slackers need not apply - see the point above about needing to be a self-starter and self-motivator) * A very supportive, library-centric environment (half of our team is librarians). If this sounds like your dream gig, please send your resume to sa...@springshare.com and let us know what makes you awesome. Brought to you by code4lib jobs: http://jobs.code4lib.org/job/864/ -- Cary Gordon The Cherry Hill Company http://chillco.com
Re: [CODE4LIB] jobs.code4lib.org
Hello list. +1, albeit from afar. -- Al Matthews, Software Dev, Atlanta University Center -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Corey A Harper Sent: Wednesday, February 01, 2012 12:50 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] jobs.code4lib.org I should repost the reply I sent on the c4lcon list here: Hey Ed, Thanks for posting this summary here. It's really cool to see a description of how this is working. I think this is a pretty good example of how a library data mgt interface of the future might work: * Grab some free text describing a thing; * Try to clean it up, extract important concepts / themes topics * Reconcile against some sort of lod-lam friendly controlled vocabulary/ies * Offer cataloger types an interface to accept / reject / refine those mappings as well as the text of the metadata itself. I would go to a breakout session about this. Best, -Corey On Wed, Feb 1, 2012 at 9:46 AM, Cynthia Ng cynthia.s...@gmail.com wrote: Just a quick 2 cents. I only found out about the feed by reading this conversation. I think it would be great to make the RSS link a little more obvious from the front page. -Cynthia On Wed, Feb 1, 2012 at 8:22 AM, Michael J. Giarlo leftw...@alumni.rutgers.edu wrote: I smell a potential breakout session. -Mike P.S. No, really, jokers, that's what I smell. On Jan 31, 2012 11:30 PM, Ed Summers e...@pobox.com wrote: I guess it's rarely a good idea to respond to your own post, but I forgot to add that when a job is published on jobs.code4lib.org it will show up in the site's Atom feed [1]. The feed should be usable by your feed reader of choice, and could also be useful if you want to syndicate the jobs elsewhere. //Ed [1] http://jobs.code4lib.org/feed/ PS. It was kind of fun to finally use the tag link relation to mark up the job tags in the feed with Freebase URLs. For example: <entry> ... 
<link rel="tag" title="Unix" href="http://www.freebase.com/view/en/unix" type="text/html" /> <link rel="tag" title="Unix [JSON]" href="http://www.freebase.com/experimental/topic/standard/en/unix" type="application/json" /> <link rel="tag" title="Unix [RDF]" href="http://rdf.freebase.com/rdf/en.unix" type="application/rdf+xml" /> </entry> -- Corey A Harper Metadata Services Librarian New York University Libraries 20 Cooper Square, 3rd Floor New York, NY 10003-7112 212.998.2479 corey.har...@nyu.edu
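For anyone wanting to syndicate the jobs elsewhere, the tag links Ed describes can be pulled out of an Atom entry with the Python standard library. This is a minimal sketch, not the jobs site's own code: the sample entry is hand-written from the example in Ed's message (the title element and the namespace declaration are assumptions about what a full entry would carry), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace; feeds normally declare it on the root element.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hand-written sample modeled on the example in the message above.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example job posting</title>
  <link rel="tag" title="Unix" href="http://www.freebase.com/view/en/unix" type="text/html" />
  <link rel="tag" title="Unix [JSON]" href="http://www.freebase.com/experimental/topic/standard/en/unix" type="application/json" />
  <link rel="tag" title="Unix [RDF]" href="http://rdf.freebase.com/rdf/en.unix" type="application/rdf+xml" />
</entry>"""


def tag_links(entry_xml: str, media_type: str = "text/html"):
    """Return (title, href) pairs for rel="tag" links of the given media type."""
    entry = ET.fromstring(entry_xml)
    return [
        (link.get("title"), link.get("href"))
        for link in entry.findall("atom:link", ATOM_NS)
        if link.get("rel") == "tag" and link.get("type") == media_type
    ]


if __name__ == "__main__":
    for title, href in tag_links(SAMPLE_ENTRY):
        print(title, href)
```

Filtering on the type attribute is what lets one entry carry HTML, JSON, and RDF views of the same tag without the consumer having to guess which link to follow.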