[CODE4LIB] Open Repositories 2012 Registration Now Open for Programme/Workshops
[posted on behalf of the organising committee, with apologies for cross-postings:] Dear Colleagues, We are pleased to announce that registration for OR2012 workshops on Monday 9th and Tuesday 10th July is open and available at www.or2012.eventbrite.com. Please note that the workshops are free but you will need the EDN number from your receipt from the ePay registration system (http://or2012.ed.ac.uk/registration/) to register for the workshops. Accommodation is booked separately - http://or2012.ed.ac.uk/delegates/accommodation/. The early booking rate of £295 has been extended until 11 June 2012 with a full-price rate of £350 thereafter. There is also a day rate of £115. The OR2012 Programme Committee are also pleased to announce that Dr Cameron Neylon, recently appointed Director of Advocacy at the Public Library of Science (PLoS), will provide the opening keynote for OR2012. Cameron is well known in the scientific community, recognized for his professionalism, experience, vision and influence in scholarly publishing, communication, and research. His attendance at the Budapest Open Access Initiative meeting, advisory role for the Scholarly Communication in Africa Program, and leadership of the Open Society Foundation funded Beyond Impact project are recent examples of his focus on web technology to enhance research communication. After earning his PhD in Chemistry from Australian National University, Cameron worked as a Wellcome Trust International Fellow at the University of Bath; a lecturer in chemical biology at the University of Southampton; and most recently a senior scientist at the Science and Technology Facilities Council, UK. He has also served as an academic editor for PLoS ONE, one of many commitments he will transition prior to joining PLoS in early July 2012. The OR2012 programme, including User Group Session details, is available at http://or2012.ed.ac.uk/programme/.
Additional information about the conference is provided on the OR2012 website - http://or2012.ed.ac.uk/. We look forward to welcoming you to Edinburgh for OR2012. Best, Stuart Macdonald On behalf of OR2012 Organising Committee === Dr John B Howard, University Librarian and Adjunct Professor, UCD School of Computer Science and Informatics UCD James Joyce Library University College Dublin Belfield Dublin 4 Ireland An Dr John B Howard, Leabharlannaí Ollscoile, agus Ollamh Adjunct, Scoil na Ríomheolaíochta agus na Faisnéisíochta UCD Leabharlann James Joyce UCD, An Coláiste Ollscoile, Baile Átha Cliath, Belfield, Baile Átha Cliath 4, Éire t: +353 1 716 7067 f:+353 1 283 7667 john.b.how...@ucd.ie http://www.ucd.ie/research/people/library/drjohnhoward/
Re: [CODE4LIB] archiving a wiki
On Tue, May 22, 2012 at 11:04 PM, Carol Hassler carol.hass...@wicourts.gov wrote: My organization would like to archive/export our internal wiki in some kind of end-user friendly format. The concept is to copy the wiki contents annually to a format that can be used on any standard computer in case of an emergency (i.e. saved as an HTML web-style archive, saved as PDF files, saved as Word files).
Take a look at WikiTeam. Their activity is mainly related to MediaWiki, but maybe they could help with a solution for your wiki: http://archiveteam.org/index.php?title=WikiTeam http://code.google.com/p/wikiteam/
ciao -- raffaele
Re: [CODE4LIB] archiving a wiki
Many organizations are using Archive-It, the Internet Archive's service for harvesting and preserving specific websites. I think it can be used to produce public or private archives. http://www.archive-it.org/ Keith On Tue, May 22, 2012 at 5:04 PM, Carol Hassler carol.hass...@wicourts.gov wrote: My organization would like to archive/export our internal wiki in some kind of end-user friendly format. The concept is to copy the wiki contents annually to a format that can be used on any standard computer in case of an emergency (i.e. saved as an HTML web-style archive, saved as PDF files, saved as Word files).
Re: [CODE4LIB] archiving a wiki
I haven't tried it on a wiki, but the command-line Unix utility wget can be used to mirror a website. http://www.gnu.org/software/wget/manual/html_node/Advanced-Usage.html
I usually call it like this:
wget -m -p http://www.site.com/
common flags:
-m = mirroring on/off
-p = page_requisites on/off
-c = continue - when download is interrupted
-l5 = reclevel - Recursion level (depth) default = 5
On Tue, May 22, 2012 at 5:04 PM, Carol Hassler carol.hass...@wicourts.gov wrote: My organization would like to archive/export our internal wiki in some kind of end-user friendly format. The concept is to copy the wiki contents annually to a format that can be used on any standard computer in case of an emergency (i.e. saved as an HTML web-style archive, saved as PDF files, saved as Word files). Another way to put it is that we are looking for a way to export the contents of the wiki into a printer-friendly format - to a document that maintains some organization and formatting and can be used on any standard computer. Is anybody aware of a tool out there that would allow for this sort of automated, multi-page export? Our wiki is large and we would prefer not to do this type of backup one page at a time. We are using JSPwiki, but I'm open to any option you think might work. Could any of the web harvesting products be adapted to do the job? Has anyone else backed up a wiki to an alternate format? Thanks! Carol Hassler Webmaster / Cataloger Wisconsin State Law Library (608) 261-7558 http://wilawlibrary.gov/
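For an annual, scripted archive run, the wget call above could be assembled from a small helper. The following is only a sketch: the function name, the destination directory, and the example wiki URL are hypothetical, and the extra options (--convert-links, --no-parent, --directory-prefix) are additions that were not part of the suggestion in this thread. The helper builds the argument list and does not run wget.

```python
import shlex


def build_wget_mirror_command(start_url, dest_dir="wiki-archive"):
    """Return the argument list for a wget mirroring run (nothing is executed)."""
    return [
        "wget",
        "--mirror",            # long form of -m, as used in the thread
        "--page-requisites",   # long form of -p: also fetch images, CSS, etc.
        "--convert-links",     # assumption: rewrite links so pages work offline
        "--no-parent",         # assumption: stay below the starting directory
        "--directory-prefix", dest_dir,  # assumption: where the copy is saved
        start_url,
    ]


if __name__ == "__main__":
    # Hypothetical wiki address, for illustration only.
    command = build_wget_mirror_command("http://wiki.example.org/")
    print(shlex.join(command))
```

The list could be handed to subprocess.run from a scheduled job; whether a plain mirror captures every JSPWiki page cleanly is untested here.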
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting of Boston City Council ?... [see other message]
[CODE4LIB] code4lib New England - Call for Proposals
The planning process has begun for the New England regional code4lib conference in October, and we are soliciting proposals for: (a) Prepared talks (20 minutes) (b) Lightning talks (5 minutes) (c) Posters Dates: Friday, October 26 and Saturday, October 27 Location: Yale University, New Haven, CT Proposal deadline: July 15, 2012. This will be a great opportunity to meet your peers at local institutions and generate conversation on code4lib related topics in which you are interested! About the venue: http://wiki.code4lib.org/index.php/Information_about_meeting_rooms_and_available_equipment To submit a proposal, fill out the form code4lib New England - Call for Proposals at: https://docs.google.com/spreadsheet/viewform?formkey=dEQ5SEF4aXljTU5jZFN0UDRsSnJPb2c6MQ If you are interested in making multiple proposals, e.g. for both a prepared talk and a poster, please submit separate proposal forms. Proposal deadline: July 15, 2012. Go forth and propose topics! - the code4lib NE planning team
[CODE4LIB] A harvesting question
Hi dear list, Can anyone give me an example of harvesting PubMed publications from a specific institution into DSpace? Could you help me configure the harvesting settings under Collection-Harvesting-Content Source in DSpace:
Content source: This collection harvests its content from an external source.
OAI Provider:__?? (PubMed)
OAI Set id: Specific sets_?? (for a specific institution)
Metadata Format: Simple Dublin Core [or] DSpace Intermediate Metadata
Content being harvested: Harvest metadata and bitstreams (requires ORE support)
By the way, we've been downloading XML data directly from the PubMed website and transforming it to DCXML using a local VBScript. Then we export the DCXML file to Excel and transform the Excel file to SIP packages using BloomaMohan's program. We add several additional fields to the data set and do quite a lot of editing in the Excel file. I have been wondering whether the DSpace built-in harvesting would be a much better option. Thank you for any idea or help! Sophie
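The form fields in the question line up with the parameters of a standard OAI-PMH ListRecords request: the provider is the base URL, the set id is the setSpec, and Simple Dublin Core is requested as oai_dc. The sketch below only builds such a request URL so that a candidate endpoint can be checked by hand before it is entered in DSpace. The base URL and set name are placeholders, and nothing in this thread establishes that PubMed offers an endpoint or per-institution sets of this kind.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a given provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # "set" is the OAI-PMH parameter that selects a single set
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder provider and set, not a real PubMed address.
    print(build_list_records_url(
        "https://repository.example.org/oai/request",
        set_spec="institution_x",
    ))
```

Opening the resulting URL in a browser shows whether the provider answers with records for that set, which is a quick check before configuring the collection.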
Re: [CODE4LIB] archiving a wiki
And while this is veering off-topic, it's also worth noting that the development version of wget has support for WARC, the website archiving format that the Wayback Machine is based around.
On 12-05-23 8:27 AM, Tom Keays tomke...@gmail.com wrote: I haven't tried it on a wiki, but the command-line Unix utility wget can be used to mirror a website. http://www.gnu.org/software/wget/manual/html_node/Advanced-Usage.html
I usually call it like this:
wget -m -p http://www.site.com/
common flags:
-m = mirroring on/off
-p = page_requisites on/off
-c = continue - when download is interrupted
-l5 = reclevel - Recursion level (depth) default = 5
[CODE4LIB] Librarian Job opening at U.S. State Department
There will also be a table at ALA, in the Placement Center from 10:30-12:00 on Sunday, June 24, if you want to talk to someone from the State Department about it. http://careers.state.gov/specialist/vacancy-announcements/iro
*VACANCY ANNOUNCEMENT*
*United States Department of State*
How to Apply: http://careers.state.gov/specialist/vacancy-announcements/iro#howtoapply
An Equal Opportunity Employer
*Announcement Number:* IRO-2012-0001
*Position Title:* Foreign Service Information Resource Officer
*Open Period:* 05/23/2012 to 07/06/2012
*Series/Grade:* FP-03
*Salary:* $65,413 – $96,061
*Promotion Potential:* FP-01
*Duty Locations:* Vacancies Throughout the World
*For More Info:* HR/REE, 202-203-5173, irovacancyi...@state.gov
Who May Apply
All potential applicants are strongly urged to read this entire Vacancy Announcement to ensure that they meet all of the requirements for this position before applying. Applicants must be American Citizens and at least 20 years old to apply. They must be at least 21 years of age to be appointed. By law, all career candidates must be appointed to the Foreign Service prior to the month in which they reach age 60. Applicants are not eligible to reapply until one year after the application date of prior announcements, provided there is a new open Vacancy Announcement at that time.
Duration Appointment
Permanent
Marketing Statement
The Department of State is developing a rank-order list of eligible hires to fill a limited number of Foreign Service Information Resource Officer vacancies. The specific number to be hired will depend on the needs of the Foreign Service.
Grade and Starting Salary Range: FP-03, $65,413 - $96,061
Summary
The Department of State’s Bureau of International Information Programs (IIP) is the principal international strategic communications service for the foreign affairs community.
Talented IIP staff design, develop, and implement a variety of information initiatives created strictly for key international audiences, such as the media, government officials, opinion leaders, and academia in more than 160 countries around the world. IIP prides itself on using cutting-edge technology and strategic alliances to produce information products and services, including Websites, Webcasts and Web chats using various social media platforms, electronic journals, speaker programs, and print publications uniquely designed to support State Department initiatives, as well as those of other U.S. foreign policy organizations. Through its corps of 30 Foreign Service Information Resource Officers (IROs), in the Office of American Spaces, IIP provides professional direction and guidance to 182 Information Resource Centers (IRCs) located at U.S. embassies abroad, and for approximately 600 partnerships with local institutions around the world who host American Corners and Binational Centers. The headquarters office in Washington establishes overall program policy, and provides technical and administrative support, centralized acquisition of electronic information resources, and centralized training programs. Most IROs assigned overseas have regional positions. IROs work closely with the IRC programs at their home posts and make regular visits to IRCs within their areas of regional responsibility, as well as to our partner institutions. IRO regions may include five to ten countries. IROs are presently assigned to the following home posts: Abu Dhabi, Abuja, Accra, Almaty, Baghdad, Bangkok, Beijing, Belgrade, Brasilia, Buenos Aires, Cairo, Dakar, Jakarta, Kabul, Kigali, Mexico City, Moscow, Nairobi, New Delhi, Pretoria, Rome, Tokyo, Vienna, Warsaw, and Washington, D.C. All entry-level assignments are two years. Subsequent overseas tours are two to three years, depending on the country. All IROs should expect to do at least one DC assignment during their careers. 
While the preference of an applicant for a particular post or area of assignment is given every possible consideration, assignments are dictated by the needs of the Foreign Service.
Key Requirements
All applicants, in order to be considered for selection, must:
- Be a U.S. citizen.
- Be at least 20 years old to apply and at least 21 years of age to be appointed. By law (Foreign Service Act of 1980), all career candidates (except for preference-eligible veterans) must be appointed to the Foreign Service prior to the month in which they reach age 60.
- Be available for worldwide service.
- Be able to obtain a Top Secret Security Clearance.
- Be able to obtain an appropriate medical clearance for Foreign Service work.
- Be able to obtain a Suitability Clearance, based on a review of the candidate's record for conduct in accordance with suitability standards defined in Chapter 3 of the Foreign Affairs Manual.
For more details see http://careers.state.gov/specialist/selection-process or http://www.state.gov/m/a/dir/regs/fam.
Major Duties
Foreign Service IROs provide
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
Looks like baloney to me. I wouldn't touch it. Ralph -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of don warner saklad Sent: Wednesday, May 23, 2012 10:44 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting of Boston City Council ?... [see other message]
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
Is OCLC controlling sandwich meats now? Where will it end? The correct answer to the original question is - go to court. On May 23, 2012 1:27 PM, LeVan,Ralph le...@oclc.org wrote: Looks like baloney to me. I wouldn't touch it. Ralph -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of don warner saklad Sent: Wednesday, May 23, 2012 10:44 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting of Boston City Council ?... [see other message]
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
It seems like you'd be better served with a broader community for this sort of question. You might want to ask it over at superuser.com or somewhere like that. -Ross. On May 23, 2012, at 10:44 AM, don warner saklad wrote: How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting of Boston City Council ?... [see other message]
[CODE4LIB] MARC Magic for file
I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). I write because the inclusion of a file MARC21 specification rule in the magic.db stems from a Code4lib exchange that started in March 2011 [1] (it ends in April if you want to go crawling for the entire thread). Rgds, Kevin [1] https://listserv.nd.edu/cgi-bin/wa?A2=ind1103L=CODE4LIBT=0F=S=P=112728 -- Kevin Ford Network Development and MARC Standards Office Library of Congress Washington, DC
Re: [CODE4LIB] MARC Magic for file
Wow, this is pretty cool. Kevin, do you have examples of the output? Does it work for bulk files? I mean, I could just try this on my Ubuntu machine, but it's all the way downstairs... -Ross. On May 23, 2012, at 3:14 PM, Ford, Kevin wrote: I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). I write because the inclusion of a file MARC21 specification rule in the magic.db stems from a Code4lib exchange that started in March 2011 [1] (it ends in April if you want to go crawling for the entire thread). Rgds, Kevin [1] https://listserv.nd.edu/cgi-bin/wa?A2=ind1103L=CODE4LIBT=0F=S=P=112728 -- Kevin Ford Network Development and MARC Standards Office Library of Congress Washington, DC
Re: [CODE4LIB] MARC Magic for file
On Wed, May 23, 2012 at 03:28:56PM -0400, Ross Singer wrote: Wow, this is pretty cool. Kevin, do you have examples of the output? Does it work for bulk files? I mean, I could just try this on my Ubuntu machine, but it's all the way downstairs...
My OS lists it as `data`
$ cd
$ ls
dev id_rsa.pub laflin marc orthanc ssh updating
$ ftp http://drupal.org/files/issues/5_records_utf8.mrc_.txt
Trying 140.211.166.6...
Requesting http://drupal.org/files/issues/5_records_utf8.mrc_.txt
100% |**| 5965 00:00
5965 bytes received in 0.00 seconds (1.56 MB/s)
$ ls
5_records_utf8.mrc_.txt id_rsa.pub marc ssh
dev laflin orthanc updating
$ mkdir test
$ mv 5_records_utf8.mrc_.txt test/
$ cd test/
$ mv 5_records_utf8.mrc_.txt 5_records_utf8.mrc
$ ls
5_records_utf8.mrc
$ file 5_records_utf8.mrc
5_records_utf8.mrc: data
$ ls
5_records_utf8.mrc
$ ls -al
total 32
drwxr-xr-x 2 kayiwa kayiwa 512 May 23 14:34 .
drwxr-xr-x 10 kayiwa kayiwa 512 May 23 14:34 ..
-rw-r--r-- 1 kayiwa kayiwa 5965 May 23 14:33 5_records_utf8.mrc
$ uname -a
OpenBSD orthanc.lib.uic.edu 5.1 GENERIC.MP#256 i386
./fxk
-Ross. On May 23, 2012, at 3:14 PM, Ford, Kevin wrote: I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). I write because the inclusion of a file MARC21 specification rule in the magic.db stems from a Code4lib exchange that started in March 2011 [1] (it ends in April if you want to go crawling for the entire thread). Rgds, Kevin [1] https://listserv.nd.edu/cgi-bin/wa?A2=ind1103L=CODE4LIBT=0F=S=P=112728 -- Kevin Ford Network Development and MARC Standards Office Library of Congress Washington, DC
-- If builders built buildings the way programmers wrote programs, then the first woodpecker to come along would destroy civilization.
Re: [CODE4LIB] MARC Magic for file
Does it work for bulk files?
-- It passed on a file containing 215 MARC Bibs and on a file containing 2,574 MARC Auth records. Don't know if you consider these bulk, but there is more than 1 record in each file (caveat: file stops after evaluating the first line, so of the 2,574 Auth records, the last 2,573 could be invalid). It failed on a file containing all of LC Classification. I need to figure out why.
Kevin, do you have examples of the output?
-- I received MARC21 Bibliography and MARC21 Authority respectively. In theory, if Leader 20-23 are not 4500 then (non-conforming) should be appended to the identification. If requested, the mimetype - application/marc - should also be output.
Rgds, Kevin
-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ross Singer Sent: Wednesday, May 23, 2012 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] MARC Magic for file Wow, this is pretty cool. Kevin, do you have examples of the output? Does it work for bulk files? I mean, I could just try this on my Ubuntu machine, but it's all the way downstairs... -Ross.
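The leader test described above can be sketched in a few lines of Python. This is not the rule shipped in the magic database, whose exact contents are not shown in this thread. It is a hand-written illustration of the same idea, assuming the standard MARC21 leader layout: a 24-byte leader, a five-digit record length in positions 00-04, the type of record in position 06, and the entry map in positions 20-23. The function name and the returned labels are made up for the example, and record types other than bibliographic, authority, and classification are simply reported as data.

```python
# Leader/06 codes used by MARC21 bibliographic records
BIBLIOGRAPHIC_TYPES = set("acdefgijkmoprt")


def describe_marc_leader(data: bytes) -> str:
    """Guess the MARC21 record type from the first 24 bytes of a file."""
    if len(data) < 24:
        return "data"
    leader = data[:24]
    if not leader[0:5].isdigit():       # positions 00-04: record length
        return "data"
    record_type = chr(leader[6])        # position 06: type of record
    if record_type in BIBLIOGRAPHIC_TYPES:
        label = "MARC21 Bibliographic"
    elif record_type == "z":
        label = "MARC21 Authority"
    elif record_type == "w":
        label = "MARC21 Classification"
    else:
        return "data"                   # other types are ignored in this sketch
    if leader[20:24] != b"4500":        # positions 20-23: entry map
        label += " (non-conforming)"
    return label


if __name__ == "__main__":
    print(describe_marc_leader(b"00714cam a2200205 a 4500"))  # MARC21 Bibliographic
```

Like file itself, this looks only at the first record's leader, so it says nothing about the rest of the file.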
Re: [CODE4LIB] MARC Magic for file
I have recently become unpleasantly acquainted with the world of MARC that is not MARC21, but is ISO 2709. What'll it do on ISO 2709? I might be able to dig up an example. I wonder if it'll claim it's MARC21 (not), or if it's MARC21 non-conforming (no, it's not quite that either. It's ISO 2709 MARC that's not MARC21). If it just doesn't know anything about it and says 'data', that's just fine, if it knows about MARC21 but not non-MARC21 ISO 2709.
On 5/23/2012 3:48 PM, Ford, Kevin wrote: Does it work for bulk files? -- It passed on a file containing 215 MARC Bibs and on a file containing 2,574 MARC Auth records. Don't know if you consider these bulk, but there is more than 1 record in each file (caveat: file stops after evaluating the first line, so of the 2,574 Auth records, the last 2,573 could be invalid). It failed on a file containing all of LC Classification. I need to figure out why. Kevin, do you have examples of the output? -- I received MARC21 Bibliography and MARC21 Authority respectively. In theory, if Leader 20-23 are not 4500 then (non-conforming) should be appended to the identification. If requested, the mimetype - application/marc - should also be output. Rgds, Kevin
Re: [CODE4LIB] MARC Magic for file
On 24/05/12 07:14, Ford, Kevin wrote: I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). I write because the inclusion of a file MARC21 specification rule in the magic.db stems from a Code4lib exchange that started in March 2011 [1] (it ends in April if you want to go crawling for the entire thread).
A couple of warnings about the unix file command:
(a) it only looks at the start of the file. This is great because it works fast on big files. This is dreadful because it can't warn you that everything after the first 10k of a 2GB file is corrupt or that a 1k MARC file is pre-pended to a 400GB astronomy data file.
(b) it is not uncommon for a file to match multiple file types. This can cause problems when using file to check whether inputs to a program are actually the type the program is expecting.
(c) some platforms have been notoriously slow to add new definitions; Ubuntu is not such a platform.
cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
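Warning (a) suggests a complementary check for cases where the whole file matters: walk every record rather than only the start. The sketch below assumes the ISO 2709 framing that MARC21 uses, where the first five bytes of each record give its total length in bytes and every record ends with the record terminator byte 0x1D. The function name is hypothetical, and it checks framing only, not the directory or field contents.

```python
RECORD_TERMINATOR = 0x1D  # ISO 2709 end-of-record byte


def count_marc_records(data: bytes) -> int:
    """Walk every record in a MARC byte string; raise ValueError on bad framing."""
    offset = 0
    count = 0
    while offset < len(data):
        length_field = data[offset:offset + 5]
        if len(length_field) < 5 or not length_field.isdigit():
            raise ValueError(f"record {count + 1}: bad length field at byte {offset}")
        length = int(length_field)
        if length < 25:
            # too short to hold a 24-byte leader plus the terminator
            raise ValueError(f"record {count + 1}: implausible length {length}")
        end = offset + length
        if end > len(data) or data[end - 1] != RECORD_TERMINATOR:
            raise ValueError(f"record {count + 1}: truncated or missing terminator")
        offset = end
        count += 1
    return count
```

For very large files the same loop could read from an open file object record by record instead of holding everything in memory; the in-memory version is used here only to keep the example short.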
Re: [CODE4LIB] MARC Magic for file
Don't know what to say. Crawling through the source for file at [1], the pattern matching code was in place as of Sept 2011. It could be present earlier than Sept 2011, but I stopped hunting for it. The earliest it would have made its way into the magic db would have been April 2011. Perhaps OpenBSD is using some custom branch of file, hasn't updated the db, etc. Yours, Kevin
On 05/23/2012 03:36 PM, Francis Kayiwa wrote: My OS lists it as `data`
$ file 5_records_utf8.mrc
5_records_utf8.mrc: data
$ uname -a
OpenBSD orthanc.lib.uic.edu 5.1 GENERIC.MP#256 i386
Re: [CODE4LIB] MARC Magic for file
It failed on a file containing all of LC Classification. I need to figure out why.
-- To reply to myself: Having looked at the file db pattern source [1], I see that the file maintainer introduced a typo into the matching pattern for correctly identifying Classification records. That's why it's failing for Class records. Over and out, Kevin
[1] ftp://ftp.astron.com/pub/file/
Re: [CODE4LIB] MARC Magic for file
On May 23, 2012, at 4:22 PM, Kevin Ford wrote: Don't know what to say. Crawling through the source for file at [1], the pattern matching code as in place as of Sept 2011. It could be present earlier than Sept 2011, but I stopped hunting for it. The earliest it would have made its way into the magic db would have been April 2011. Perhaps OpenBSD is using some custom branch of file, haven't updated the db, etc. As Stuart pointed out, some implementations are slow to update the db. OSX, for example, also just says data (hence my question on the output). -Ross. Yours, Kevin On 05/23/2012 03:36 PM, Francis Kayiwa wrote: On Wed, May 23, 2012 at 03:28:56PM -0400, Ross Singer wrote: Wow, this is pretty cool. Kevin, do you have examples of the output? Does it work for bulk files? I mean, I could just try this on my Ubuntu machine, but it's all the way downstairs... My OS lists it as `data` $ cd $ ls devid_rsa.pub laflin marc orthancssh updating $ ftp http://drupal.org/files/issues/5_records_utf8.mrc_.txt Trying 140.211.166.6... Requesting http://drupal.org/files/issues/5_records_utf8.mrc_.txt 100% |**| 5965 00:00 5965 bytes received in 0.00 seconds (1.56 MB/s) $ ls 5_records_utf8.mrc_.txt id_rsa.pub marc ssh dev laflin orthanc updating $ mkdir test $ mv 5_records_utf8.mrc_.txt test/ $ cd test/ $ mv 5_records_utf8.mrc_.txt 5_records_utf8.mrc $ ls 5_records_utf8.mrc $ file 5_records_utf8.mrc 5_records_utf8.mrc: data $ ls 5_records_utf8.mrc $ ls -al total 32 drwxr-xr-x 2 kayiwa kayiwa 512 May 23 14:34 . drwxr-xr-x 10 kayiwa kayiwa 512 May 23 14:34 .. -rw-r--r-- 1 kayiwa kayiwa 5965 May 23 14:33 5_records_utf8.mrc $ uname -a OpenBSD orthanc.lib.uic.edu 5.1 GENERIC.MP#256 i386 ./fxk -Ross. On May 23, 2012, at 3:14 PM, Ford, Kevin wrote: I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. 
It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). I write because the inclusion of a file MARC21 specification rule in the magic.db stems from a Code4lib exchange that started in March 2011 [1] (it ends in April if you want to go crawling for the entire thread). Rgds, Kevin [1] https://listserv.nd.edu/cgi-bin/wa?A2=ind1103L=CODE4LIBT=0F=S=P=112728 -- Kevin Ford Network Development and MARC Standards Office Library of Congress Washington, DC
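For readers wondering what a rule like this can key on: a MARC21 record begins with a 24-byte leader whose layout is fixed by the standard, so a handful of byte positions is enough for a rough guess. The Python below is a minimal sketch of that idea only. It is not the rule that ships in the magic db, and the function name `sniff_marc21` and the strings it returns are made up for illustration.

```python
# Hypothetical sketch: guess whether some bytes start with a MARC21 record
# by inspecting the 24-byte leader. This is NOT the rule used by file(1).

BIBLIOGRAPHIC_TYPES = set("acdefgijkmoprt")  # Leader/06 codes for bibliographic records
AUTHORITY_TYPE = "z"                         # Leader/06 code for authority records


def sniff_marc21(data: bytes) -> str:
    """Return a short label, loosely in the spirit of file(1) output."""
    if len(data) < 24:
        return "data"
    leader = data[:24]
    # Leader/00-04: record length, five ASCII digits.
    if not leader[0:5].isdigit():
        return "data"
    # Leader/10-11: indicator count and subfield code count, both "2" in MARC21.
    if leader[10:12] != b"22":
        return "data"
    # Leader/12-16: base address of data, five ASCII digits.
    if not leader[12:17].isdigit():
        return "data"
    # Leader/20-23: entry map, "4500" in MARC21.
    if leader[20:24] != b"4500":
        return "data"
    record_type = chr(leader[6])
    if record_type in BIBLIOGRAPHIC_TYPES:
        return "MARC21 Bibliographic"
    if record_type == AUTHORITY_TYPE:
        return "MARC21 Authority"
    return "MARC21 (other record type)"
```

Passing it the first 24 bytes of a file (for example `open(path, "rb").read(24)`) is enough. Anything that fails the checks falls back to "data", the same label an out-of-date magic db gives a MARC file.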
Re: [CODE4LIB] MARC Magic for file
On Wed, May 23, 2012 at 04:34:47PM -0400, Ross Singer wrote:

On May 23, 2012, at 4:22 PM, Kevin Ford wrote:

Don't know what to say. Crawling through the source for file at [1], the pattern matching code was in place as of Sept 2011. It could be present earlier than Sept 2011, but I stopped hunting for it. The earliest it would have made its way into the magic db would have been April 2011. Perhaps OpenBSD is using some custom branch of file, haven't updated the db, etc. As Stuart pointed out, some implementations are slow to update the db. OSX, for example, also just says data (hence my question on the output).

adding FreeBSD's magicfile from this commit in a user's $HOME http://lists.freebsd.org/pipermail/svn-src-vendor/2011-October/000851.html

For my next trick I will try to remember that I need to do that.

./fxk

-- If builders built buildings the way programmers wrote programs, then the first woodpecker to come along would destroy civilization.
Re: [CODE4LIB] MARC Magic for file
The file magic format changed between versions; I think the OSX version was not compatible with more up to date versions (in the original thread, this caused me some confusion).

Simon

On Wed, May 23, 2012 at 4:34 PM, Ross Singer rossfsin...@gmail.com wrote:

On May 23, 2012, at 4:22 PM, Kevin Ford wrote:

Don't know what to say. Crawling through the source for file at [1], the pattern matching code was in place as of Sept 2011. It could be present earlier than Sept 2011, but I stopped hunting for it. The earliest it would have made its way into the magic db would have been April 2011. Perhaps OpenBSD is using some custom branch of file, haven't updated the db, etc. As Stuart pointed out, some implementations are slow to update the db. OSX, for example, also just says data (hence my question on the output).

-Ross.

Yours, Kevin

On 05/23/2012 03:36 PM, Francis Kayiwa wrote:

On Wed, May 23, 2012 at 03:28:56PM -0400, Ross Singer wrote:

Wow, this is pretty cool. Kevin, do you have examples of the output? Does it work for bulk files? I mean, I could just try this on my Ubuntu machine, but it's all the way downstairs...

My OS lists it as `data`

$ cd
$ ls
dev id_rsa.pub laflin marc orthanc ssh updating
$ ftp http://drupal.org/files/issues/5_records_utf8.mrc_.txt
Trying 140.211.166.6...
Requesting http://drupal.org/files/issues/5_records_utf8.mrc_.txt
100% |**| 5965 00:00
5965 bytes received in 0.00 seconds (1.56 MB/s)
$ ls
5_records_utf8.mrc_.txt id_rsa.pub marc ssh
dev laflin orthanc updating
$ mkdir test
$ mv 5_records_utf8.mrc_.txt test/
$ cd test/
$ mv 5_records_utf8.mrc_.txt 5_records_utf8.mrc
$ ls
5_records_utf8.mrc
$ file 5_records_utf8.mrc
5_records_utf8.mrc: data
$ ls
5_records_utf8.mrc
$ ls -al
total 32
drwxr-xr-x 2 kayiwa kayiwa 512 May 23 14:34 .
drwxr-xr-x 10 kayiwa kayiwa 512 May 23 14:34 ..
-rw-r--r-- 1 kayiwa kayiwa 5965 May 23 14:33 5_records_utf8.mrc
$ uname -a
OpenBSD orthanc.lib.uic.edu 5.1 GENERIC.MP#256 i386

./fxk

-Ross.
On May 23, 2012, at 3:14 PM, Ford, Kevin wrote: I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). I write because the inclusion of a file MARC21 specification rule in the magic.db stems from a Code4lib exchange that started in March 2011 [1] (it ends in April if you want to go crawling for the entire thread). Rgds, Kevin [1] https://listserv.nd.edu/cgi-bin/wa?A2=ind1103L=CODE4LIBT=0F=S=P=112728 -- Kevin Ford Network Development and MARC Standards Office Library of Congress Washington, DC
[CODE4LIB] Job: Numeric and Geospatial Data Services Librarian at Cornell University
The Cornell Institute for Social and Economic Research (CISER) seeks an innovative, collaborative, and service-oriented numeric and geospatial data services librarian. CISER has strengths in social and economic data, computational infrastructure, secure data storage and remote access, and a robust data-sharing environment including a data archive that dates to 1981. The successful applicant will spearhead efforts to further develop CISER's data-rich environment, notably via enhancements to CISER's Data Archive and by participating in data-intensive collaborations with researchers and other librarians on campus.

Cornell social scientists are on the cutting edge of interdisciplinary research questions using complex data resources. Growing recognition of the value of interdisciplinary and data-driven research affords CISER and its Cornell partners opportunities to support numeric and geospatial research and a lifecycle approach to data management. The individual in this position will play a key role in anticipating, developing, and implementing services to support these activities.

The CISER Numeric and Geospatial Data Librarian position will encompass many skills, capacities, and knowledge areas:

**Responsibilities**

Manage and Enhance CISER's Data Archive:

* Develop and manage the collection of social and economic datasets in CISER's Data Archive, including acquisition of new datasets.
* Lead CISER working group to design and implement ongoing structural improvements and innovations in the CISER Data Archive.
* Contribute to efforts to promote and enable the adoption of metadata standards in the social sciences.

Provide and Collaborate on Research Data Management across Cornell:

* Serve on the consultants group of Cornell's Research Data Management Services Group (RDMSG) to assist social science researchers in the writing and implementation of grant-required data management plans.
* Serve on selected RDMSG implementation teams, per the RDMSG Management Council.
Establish and Grow CISER's Data and GIS Outreach:

* Provide training and support in the use of numeric, geospatial, and GIS data
* Provide mapping strategies for large and/or complex sets of data
* Locate and acquire digital numeric, geospatial and GIS data sets and associated software
* Support data management efforts for numeric and geospatial data
* Expand data curation services to include numeric, geospatial and GIS data sets
* Liaise and form productive partnerships with other numeric and geospatial experts on campus, including Cornell University Library (CUL), the Cornell University Geospatial Information Repository (CUGIR), and the Program on Applied Demography (PAD).

Maintain and Expand Outreach and Professional Engagement:

* Promote the resources and services of the CISER Data Archive through announcements, presentations, webinars, etc., to the Cornell community.
* Participate in the writing of proposals to funding agencies and in the execution of accepted proposals for innovations in social science data management.
* Establish and maintain professional relationships between CISER and other organizations and individuals engaged in the pursuit of similar or complementary goals.
* Conduct, present, and publish applied research in areas related to discovery, organization, and usability of social science data resources across their lifecycle.

Coordinate with Other CISER Data Services Professionals in the provision of all aspects of data-related services to the Cornell community and external users.

**Qualifications**

The successful candidate will be a dynamic individual with a strong proclivity to explore new methods for enhancing the services CISER provides to researchers. She or he will not fear experimentation, innovation, change, success, or occasional failure in developing research data management services and will understand how to build successful teams and how to coach staff as they build new skills.
She or he will thrive in a field that will continue to experience constant change and will be comfortable leaving behind methods that no longer support the goals and mission of a social science research support unit.

Required Qualifications

* Master's degree in Library and Information Science or relevant social science discipline.
* Minimum of two years' experience working with large social science datasets and familiarity with major data resources such as ICPSR and the U.S. Census Bureau.
* Knowledge and experience with social science data and statistical software such as SAS, Stata, SPSS, or R.
* Knowledge or experience of GIS data and software, such as ArcGIS.
* Demonstrated evidence of excellent communications and interpersonal skills.

Preferred Qualifications

* Experience with metadata practices and standards such as DDI, OAI-PMH, MODS, METS, PREMIS, FGDC or MARC.
* Knowledge of qualitative data analysis software such as
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
On Wed, May 23, 2012 at 11:37 AM, Simon Spero sesunc...@gmail.com wrote: Is OCLC controlling sandwich meats now? Where will it end? Since we already control the Bacon Stamp of Approval, baloney seems like the next logical step. Perhaps that should be the Baloney Stamp of Disapproval? Can I get a 1+? Roy
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
I think this is the output from a court stenography machine, which produces a 'shorthand' form of language. See http://www.stenograph.com/ to see if there are tools to translate it out to natural language. Bob Sandusky On 5/23/2012 12:27 PM, LeVan,Ralph wrote: Looks like baloney to me. I wouldn't touch it. Ralph -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of don warner saklad Sent: Wednesday, May 23, 2012 10:44 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting of Boston City Council ?... [see other message]
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
Hi Roy, Since we already control the Bacon Stamp of Approval, baloney seems like the next logical step. We should be thinking ahead to future use cases. I say go for a broader Cured Meats Stamp of Approval. Or perhaps Charcuterie to lend it some class. To do otherwise could lead to a proliferation of stamps. -- Michael -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Roy Tennant Sent: Wednesday, May 23, 2012 4:16 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message] On Wed, May 23, 2012 at 11:37 AM, Simon Spero sesunc...@gmail.com wrote: Is OCLC controlling sandwich meats now? Where will it end? Since we already control the Bacon Stamp of Approval, baloney seems like the next logical step. Perhaps that should be the Baloney Stamp of Disapproval? Can I get a 1+? Roy
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
Watch out... MEATadata jokes up ahead... -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Doran, Michael D Sent: 23 May 2012 22:41 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message] Hi Roy, Since we already control the Bacon Stamp of Approval, baloney seems like the next logical step. We should be thinking ahead to future use cases. I say go for a broader Cured Meats Stamp of Approval. Or perhaps Charcuterie to lend it some class. To do otherwise could lead to a proliferation of stamps.
Re: [CODE4LIB] MARC Magic for file
On Wed, May 23, 2012 at 12:14 PM, Ford, Kevin k...@loc.gov wrote: I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). I'm not sure whether to laugh or cry that it's a sign of progress that a 40 year old utility designed to identify file types is now just beginning to be able to recognize a format that's been around for almost 50 years... kyle -- -- Kyle Banerjee Digital Services Program Manager Orbis Cascade Alliance baner...@uoregon.edu baner...@orbiscascade.org / 503.999.9787
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
On Wed, May 23, 2012 at 02:15:53PM -0700, Roy Tennant wrote: On Wed, May 23, 2012 at 11:37 AM, Simon Spero sesunc...@gmail.com wrote: Is OCLC controlling sandwich meats now? Where will it end? Since we already control the Bacon Stamp of Approval, baloney seems like the next logical step. Perhaps that should be the Baloney Stamp of Disapproval? Can I get a 1+? I am not pleased with one of my favorite meats being used to show disapproval so hard to give you my 1+ How about a useless FB `like`? ./fxk Roy -- It is always the best policy to tell the truth, unless, of course, you are an exceptionally good liar. -- Jerome K. Jerome
Re: [CODE4LIB] How do you get plain language, plain English out of the .sgstn stenograph stenonote record of the public meeting?... [see other message]
Salvete! Since we already control the Bacon Stamp of Approval, baloney seems like the next logical step. We should be thinking ahead to future use cases. I say go for a broader Cured Meats Stamp of Approval. Or perhaps Charcuterie to lend it some class. To do otherwise could lead to a proliferation of stamps. Clearly this calls for an Index Meaticus. *ducks* Brooke