Re: [CODE4LIB] marc in json

2011-12-02 Thread Ed Summers
Thanks for all the helpful guidance. I'll work on getting the JSON
implementation updated before releasing it.

I don't know if it's of interest but the Twitter firehose (as delivered
by Gnip) is line oriented JSON. Each line is a tweet and all its
metadata. This format is handy for doing things like counting the
number of records with a 'wc -l' instead of having to parse the
JSON...which can be expensive when there can be 10M an hour.
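
A minimal sketch of that idea, assuming one JSON object per line with no raw newlines inside a record (JSON strings escape them as \n, so a serialized record can stay on one line). The sample records and function names are made up for illustration; this is not the actual Gnip payload.

```python
import io
import json


def count_records(stream):
    """Count records by counting non-empty lines, roughly like `wc -l`; nothing is parsed."""
    return sum(1 for line in stream if line.strip())


def iter_records(stream):
    """Parse lazily, one line at a time, only when the content is actually needed."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


# Hypothetical sample: three records, one per line. The second record
# contains an escaped newline, which does not break the line layout.
sample = (
    '{"id": 1, "text": "marc in json"}\n'
    '{"id": 2, "text": "line one\\nline two"}\n'
    '{"id": 3, "text": "code4lib"}\n'
)

print(count_records(io.StringIO(sample)))
print([record["id"] for record in iter_records(io.StringIO(sample))])
```

Counting never touches the JSON parser, and a reader that does need the content can still process the file one record at a time instead of loading it whole.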

//Ed


[CODE4LIB] Fwd: CFP: ESWC12 (with Digital Libraries and Cultural Heritage track) - Abstract Deadline Dec 5

2011-12-02 Thread Jodi Schneider
Of possible interest...

Begin forwarded message:

> Resent-From: public-...@w3.org
> From: Antoine Isaac 
> Date: 2 December 2011 09:01:59 GMT
> To: Antoine Isaac 
> Subject: CFP: ESWC12 (with Digital Libraries and Cultural Heritage track) - 
> Abstract Deadline Dec 5
> 
> Apologies for cross-posting
> ---
> 
> CFP ESWC12 - 9th Extended Semantic Web Conference
> May 27 - 31, 2012
> 
> Info at: http://2012.eswc-conferences.org/
> 
> The mission of ESWC is to bring together researchers and practitioners
> dealing with different aspects of semantic technologies. Building on its
> past success, ESWC is seeking to broaden its focus to span other
> relevant research areas in which semantics in a Web context plays an
> important role. The goal of the Semantic Web is to create a Web of
> knowledge and services in which the semantics of content is made
> explicit and content is linked to both other content and services. This
> network of knowledge-based functionality will weave together a large
> network of human knowledge, and make this knowledge machine-processable
> to support intelligent behaviour by machines. It will support novel
> applications that combine content from heterogeneous sites in
> unforeseen ways and support enhanced matching between users' needs and
> content.
> 
> Creating such an interlinked Web of knowledge which spans unstructured,
> RDF as well as multimedia content and services requires the
> collaboration of many disciplines, including but not limited to:
> Artificial Intelligence, Natural Language Processing, Database and
> Information Systems, Information Retrieval, Machine Learning,
> Multimedia, Distributed Systems, Social Networks, Web Engineering, and
> Web Science.
> 
> In addition to the research and in-use tracks, we have furthermore
> introduced two special tracks this year, putting particular emphasis on
> inter-disciplinary research topics and areas that show the potential of
> exciting synergies for the future.
> 
> *Important Dates*
> 
> Abstract submission December 5th, 2011
> Full-paper submission December 12th, 2011
> Notification of acceptance/rejection February 22nd, 2012
> Camera-ready papers March 9th, 2012
> 
> *Additional Information*
> 
> ESWC2012 welcomes the submission of original research and application
> papers dealing with all aspects of representing and using semantics on
> the Web. We encourage theoretical, methodological, empirical, and
> applications papers. The proceedings of this conference will be
> published in Springer's Lecture Notes in Computer Science series. Paper
> submission and reviewing for ESWC2012 will be electronic via the
> conference submissions site. Each paper must be assigned to one of the
> tracks below:
> 
> 
> Program Chairs
> - Philipp Cimiano
> - Axel Polleres
> 
> General Chair
> - Elena Simperl
> 
> 
> Research tracks
> - Ontologies (Chairs: Chiara Ghidini, Dimitris Plexousakis)
> - Reasoning (Chairs: Giovambattista Ianni, Markus Krötzsch)
> - Semantic Data Management (Chairs: Claudio Gutierrez, Andreas Harth)
> - Linked Open Data (Chairs: Sören Auer, Juan Sequeda)
> - Social Web and Web Science (Chairs: Fabien Gandon, Matthew Rowe)
> - Software, Services, Processes and Cloud Computing (Chairs: Matthias
> Klusch, Carlos Pedrinaci)
> - Natural Language Processing and Information Retrieval (Chairs: Paul
> Buitelaar, Johanna Völker)
> - Mobile and Sensor Web (Chairs: Alasdair J G Gray, Kerry Taylor)
> - Machine Learning (Chairs: Claudia d'Amato, Volker Tresp)
> 
> Special tracks 2012
> - EGovernment: Using Semantics for Promoting Interoperability in the
> Public Sector (Chairs: Asunción Gómez-Pérez, Vassilios Peristeras)
> - Digital Libraries and Cultural Heritage (Chairs: Antoine Isaac, Vivien
> Petras)
> 
> In-use & Industrial track
> - In-use & Industrial track (Chairs: Philippe Cudré-Mauroux, Yves Raimond)
> 
> Check the website for detailed descriptions of each of these tracks.
> 
> Papers should not exceed fifteen (15) pages in length and must be
> formatted according to the information for LNCS authors. Papers must be
> submitted in PDF (Adobe's Portable Document Format) format and will not
> be accepted in any other format. Papers that exceed 15 pages or do not
> follow the LNCS guidelines risk being rejected automatically without a
> review. Authors of accepted papers will be required to provide semantic
> annotations for the abstract of their submission - details of this
> process will be provided on the conference Web page at the time of
> acceptance. At least one author of each accepted paper must register for
> the conference. More information about Springer's Lecture Notes in
> Computer Science (LNCS) is available on the Springer LNCS Web site
> (http://www.springer.com/computer/lncs/lncs+authors).
> 
> Submission will be through the Easychair system:
> https://www.easychair.org/account/signin.cgi?conf=eswc2012
> 


Re: [CODE4LIB] Pandering for votes for code4lib sessions

2011-12-02 Thread Daniel Lovins
+1

On Thu, Dec 1, 2011 at 6:50 PM, Fleming, Declan  wrote:
> +1
>
> -Original Message-
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ross 
> Singer
> Sent: Thursday, December 01, 2011 5:47 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Pandering for votes for code4lib sessions
>
> As unwilling commissioner of elections, I'm shocked, SHOCKED, I say, to hear 
> of improprieties with the voting process.
>
> That said, I'm not shocked (and we've seen it before).
>
> I am absolutely opposed to:
>
> 1) Setting weights on voting.  0 is just as valid a vote as 3.
> 2) Publicly shaming the offenders in Code4Lib.  If you run across impropriety 
> in a forum, make a friendly, yet firm, reminder that ballot stuffing is 
> unethical, undemocratic and tears at the fabric that is Code4Lib.  Sometimes 
> it just takes a simple reminder for people to realize what they're doing is 
> wrong (it certainly works for me).
> 3) Selection committees.  We are, as Dre points out, anarcho-democratic at 
> our core.  Anarcho-bureaucratic just sounds silly.
>
> This current situation is largely our doing.  We even publicly said that 
> "getting your proposal voted in is the backdoor into the conference".  The 
> first allotment of spaces sold out in an hour.  This is, literally, the only 
> way that a person that was not able to register and is buried on the wait 
> list is going to get in.  And we've basically told them that.
>
> One thing I would be open to is to put a disclaimer splash page before any 
> ballot (only to be seen the first time a person votes) briefly explaining how 
> the ballot works and to mention that ballot stuffing is "unethical, 
> undemocratic and tears at the fabric that is Code4Lib" or some such.  I would 
> welcome contributions to the wording.
>
> What would people think about that?
>
> -Ross.
>
> On Thu, Dec 1, 2011 at 8:32 AM, Richard, Joel M  wrote:
>> I disagree with this suggestion. Personally I vote for only those I find 
>> interesting and useful to me, but I don't put a response for every talk 
>> listed. I only respond on those I'm interested in. Everyone else gets 0 points. 
>> I would expect that others do this, too. Katherine's suggestion also puts a 
>> burden on those who are legitimately participating while doing nothing to 
>> prevent those who are misbehaving.
>>
>> I like Edwards's suggestions, which are easy to implement and don't really 
>> impact the process that much.
>>
>> Personally, I believe that the proper response to this is to:
>>
>> 1. Publicly shame those who are participating in this. :)
>> 2. Delete their votes, or at least those you can identify.
>> 3. Disqualify the person who is receiving illegitimate votes. See #1.
>> 4. Eliminate voting altogether and have a committee of 10-15 people from the 
>> community select from the proposed talks. Isn't this what other conferences 
>> do?
>>
>> In the end, the conference organizers can invite whoever they want to speak. 
>> The voting ends up being a courtesy to the rest of us.
>>
>> --Joel
>>
>> Joel Richard
>> Lead Web Developer, Web Services Department Smithsonian Institution
>> Libraries | http://www.sil.si.edu/
>> (202) 633-1706 | richar...@si.edu
>>
>> On Dec 1, 2011, at 8:06 AM, Lynch,Katherine wrote:
>>
>>> I was actually going to suggest just this, Kåre!  Another way to
>>> handle it, or perhaps an additional way, would be to give a user's votes
>>> a certain amount of weight proportionate to the number of sessions they 
>>> voted on.
>>> So if they evaluated all of them and voted, 100% of their vote gets
>>> counted.  If they evaluated half, 50%, and so on?  Not sure if this
>>> is worth the effort, but I know it's worked for various camps that
>>> I've been to which fall prey to the same problem.
>>>
>>> Sincerely,
>>> Katherine
>>>
>>> On 12/1/11 6:55 AM, "Kåre Fiedler Christiansen"
>>> 
>>> wrote:
>>>
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On
> Behalf Of Michael B. Klein

 

> In any case, I'm interested to see how effective this current "call
> for support" is.

 Me too!

 Could someone with access to the voting data perhaps anonymously
 pull out how many voters have given points to only a single talk or two?

 If the problem is indeed real, perhaps simply stating on the page
 that you are expected to evaluate _all_ proposals, and not just vote
 up a single talk, would help the issue? It might turn away some of
 the "wrong voters". Requiring voters to give out at least, say, 10 points,
 could perhaps be a way to enforce some participation?
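
A rough sketch of the two ideas floated in this thread: counting voters who gave points to only a talk or two, and Katherine's proportional weighting. The ballot rows, the 0-3 point range, and all function names are assumptions for illustration; the real voting data and schema are not known here.

```python
from collections import defaultdict

# Hypothetical ballot rows: (voter, talk, points).
ballots = [
    ("alice", "talk-a", 3), ("alice", "talk-b", 1), ("alice", "talk-c", 2),
    ("bob", "talk-a", 3),
    ("carol", "talk-b", 3), ("carol", "talk-c", 0),
]
TOTAL_TALKS = 3


def talks_scored_per_voter(rows):
    """Map each voter to the set of talks they gave at least one point."""
    scored = defaultdict(set)
    for voter, talk, points in rows:
        if points > 0:
            scored[voter].add(talk)
    return scored


def narrow_voters(rows, threshold=2):
    """Count voters who gave points to `threshold` talks or fewer (a count only, no names)."""
    return sum(
        1 for talks in talks_scored_per_voter(rows).values()
        if len(talks) <= threshold
    )


def weighted_totals(rows, total_talks):
    """Scale each voter's points by the share of talks they gave points to."""
    scored = talks_scored_per_voter(rows)
    totals = defaultdict(float)
    for voter, talk, points in rows:
        totals[talk] += points * len(scored[voter]) / total_talks
    return dict(totals)


print(narrow_voters(ballots))
print(weighted_totals(ballots, TOTAL_TALKS))
```

Here "voted on" is read as "gave at least one point to"; treating an explicit 0 as a vote would change both numbers, which is one reason the weighting is a judgment call rather than a neutral fix.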

 Best,
 Kåre



-- 
Daniel Lovins
Head of Knowledge Access, Design & Development
Knowledge Access & Resource Management Services
New York University, Division of Libraries
20 Cooper Square, 3rd floor
New York, NY 10003-7112
daniel.lov...@nyu.edu
212-998-2489


Re: [CODE4LIB] Pandering for votes for code4lib sessions

2011-12-02 Thread Ranti Junus
On Dec 1, 2011 8:48 AM, "Ross Singer"  wrote:
>
> [...]
>
> One thing I would be open to is to put a disclaimer splash page before
> any ballot (only to be seen the first time a person votes) briefly
> explaining how the ballot works and to mention that ballot stuffing is
> "unethical, undemocratic and tears at the fabric that is Code4Lib" or
> some such.  I would welcome contributions to the wording.
>[...]

+1 for the disclaimer splash screen.

Also:
Create a script that detects pandering. When they click the submit button,
rickroll them.  ;-)

ranti.


Re: [CODE4LIB] Pandering for votes for code4lib sessions

2011-12-02 Thread Becky Yoose
I am offended and disappointed by the Rickrolling suggestion. We are a
group of professionals and should act as such. Resorting to low brow
internet memes only demeans the group, its members, and a profession as a
whole.

The submit button in the script should go to a page where the submitter has
to correct a malformed XML file before submitting their ballot. For a
greater challenge, we could have the voter manually translate raw MARC into
MARCXML.

Friday come and me wan' go home,
Becky

On Fri, Dec 2, 2011 at 8:38 AM, Ranti Junus  wrote:

> On Dec 1, 2011 8:48 AM, "Ross Singer"  wrote:
> >
> > [...]
> >
> > One thing I would be open to is to put a disclaimer splash page before
> > any ballot (only to be seen the first time a person votes) briefly
> > explaining how the ballot works and to mention that ballot stuffing is
> > "unethical, undemocratic and tears at the fabric that is Code4Lib" or
> > some such.  I would welcome contributions to the wording.
> >[...]
>
> +1 for the disclaimer splash screen.
>
> Also:
> Create a script that detects pandering. When they click the submit button,
> rickroll them.  ;-)
>
> ranti.
>


Re: [CODE4LIB] Pandering for votes for code4lib sessions

2011-12-02 Thread Ranti Junus
Great, here comes the Troublesome Cataloger


In need for some energizer drink,
ranti.

On Fri, Dec 2, 2011 at 9:56 AM, Becky Yoose  wrote:
> I am offended and disappointed by the Rickrolling suggestion. We are a
> group of professionals and should act as such. Resorting to low brow
> internet memes only demeans the group, its members, and a profession as a
> whole.
>
> The submit button in the script should go to a page where the submitter has
> to correct a malformed XML file before submitting their ballot. For a
> greater challenge, we could have the voter manually translate raw MARC into
> MARCXML.
>
> Friday come and me wan' go home,
> Becky
>
> On Fri, Dec 2, 2011 at 8:38 AM, Ranti Junus  wrote:
>
>> On Dec 1, 2011 8:48 AM, "Ross Singer"  wrote:
>> >
>> > [...]
>> >
>> > One thing I would be open to is to put a disclaimer splash page before
>> > any ballot (only to be seen the first time a person votes) briefly
>> > explaining how the ballot works and to mention that ballot stuffing is
>> > "unethical, undemocratic and tears at the fabric that is Code4Lib" or
>> > some such.  I would welcome contributions to the wording.
>> >[...]
>>
>> +1 for the disclaimer splash screen.
>>
>> Also:
>> Create a script that detects pandering. When they click the submit button,
>> rickroll them.  ;-)
>>
>> ranti.
>>



-- 
Bulk mail.  Postage paid.


Re: [CODE4LIB] Models of MARC in RDF

2011-12-02 Thread Esme Cowles
Owen-

Another strategy for capturing MARC data in RDF is to convert it to MODS (we do 
this using the LoC MARC to MODS stylesheet: 
http://www.loc.gov/standards/marcxml/xslt/MARC21slim2MODS.xsl).  From there, 
it's pretty easy to incorporate into RDF.  There are some issues to be aware 
of, such as how to map the MODS XML names to predicates and how to handle 
elements that can appear in multiple places in the hierarchy.
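
To make the two issues concrete, here is a minimal sketch (not our production code) that walks a tiny hand-written MODS fragment and builds each predicate from the full element path, so a title under relatedItem does not collide with the top-level title. The predicate base URI, the subject URI, and the function names are hypothetical; attributes such as type are ignored here.

```python
import xml.etree.ElementTree as ET

# Hypothetical predicate namespace, made up for this sketch.
PRED_BASE = "http://example.org/mods-predicates/"

# Tiny MODS fragment: <title> appears at the top level and inside <relatedItem>.
MODS_XML = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Main title</title></titleInfo>
  <relatedItem type="series">
    <titleInfo><title>Series title</title></titleInfo>
  </relatedItem>
</mods>
"""


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return element.tag.split("}", 1)[-1]


def mods_to_triples(xml_text, subject):
    """Yield (subject, predicate, literal) for every leaf element that has text."""
    root = ET.fromstring(xml_text)

    def walk(element, path):
        children = list(element)
        text = (element.text or "").strip()
        if not children and text:
            # The predicate encodes the whole path, not just the element name.
            yield (subject, PRED_BASE + "/".join(path), text)
        for child in children:
            yield from walk(child, path + [local_name(child)])

    for child in root:
        yield from walk(child, [local_name(child)])


for triple in mods_to_triples(MODS_XML, "http://example.org/object/1"):
    print(triple)
```

Path-based predicates are only one option; the alternative is to keep one predicate per element name and model the nesting with intermediate nodes.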

-Esme
--
Esme Cowles 

"Necessity is the plea for every infringement of human freedom. It is the
 argument of tyrants; it is the creed of slaves." -- William Pitt, 1783

On 11/28/2011, at 8:25 AM, Owen Stephens wrote:

> It would be great to start collecting transforms together - just a quick 
> brain dump of some I'm aware of
> 
> MARC21 transformations
> Cambridge University Library - http://data.lib.cam.ac.uk - transformation 
> made available (in code) from same site
> Open University - http://data.open.ac.uk - specific transform for materials 
> related to teaching, code available at 
> http://code.google.com/p/luceroproject/source/browse/trunk%20luceroproject/OULinkedData/src/uk/ac/open/kmi/lucero/rdfextractor/RDFExtractor.java
>  (MARC transform is in libraryRDFExtraction method)
> COPAC - small set of records from the COPAC Union catalogue - data and 
> transform not yet published
> Podes Projekt - LinkedAuthors - documentation at 
> http://bibpode.no/linkedauthors/doc/Pode-LinkedAuthors-Documentation.pdf - 2 
> stage transformation firstly from MARC to FRBRized version of data, then from 
> FRBRized data to RDF. These linked from documentation
> Podes Project - LinkedNonFiction - documentation at 
> http://bibpode.no/linkednonfiction/doc/Pode-LinkedNonFiction-Documentation.pdf
>  - MARC data transformed using xslt 
> https://github.com/pode/LinkedNonFiction/blob/master/marcslim2n3.xsl
> 
> British Library British National Bibliography - 
> http://www.bl.uk/bibliographic/datafree.html - data model documented, but no 
> code available
> Libris.se - some notes in various presentations/blogposts (e.g. 
> http://dc2008.de/wp-content/uploads/2008/09/malmsten.pdf) but can't find 
> explicit transformation
> Hungarian National library - 
> http://thedatahub.org/dataset/hungarian-national-library-catalog and 
> http://nektar.oszk.hu/wiki/Semantic_web#Implementation - some information on 
> ontologies used but no code or explicit transformation (not 100% sure this is 
> from MARC)
> Talis - implemented in several live catalogues including 
> http://catalogue.library.manchester.ac.uk/  - no documentation or code afaik 
> although some notes in 
> 
> MAB transformation
> HBZ - some of the transformation documented at 
> https://wiki1.hbz-nrw.de/display/SEM/Converting+the+Open+Data+from+the+hbz+to+BIBO,
>  don't think any code published?
> 
> Would be really helpful if more projects published their transformations (or 
> someone told me where to look!)
> 
> Owen
> 
> Owen Stephens
> Owen Stephens Consulting
> Web: http://www.ostephens.com
> Email: o...@ostephens.com
> Telephone: 0121 288 6936
> 
> On 26 Nov 2011, at 15:58, Karen Coyle wrote:
> 
>> A few of the code4lib talk proposals mention projects that have or will 
>> transform MARC records into RDF. If any of you have documentation and/or 
>> examples of this, I would be very interested to see them, even if they are 
>> "under construction."
>> 
>> Thanks,
>> kc
>> 
>> -- 
>> Karen Coyle
>> kco...@kcoyle.net http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet


Re: [CODE4LIB] Pandering for votes for code4lib sessions

2011-12-02 Thread Beanworks
I like the rickroll idea, myself. Why would we inflict cataloging hell on 
anyone besides yo_bj?

Oh, wait...

:P

Wafted through cyberspace from my iPad

On Dec 2, 2011, at 10:39 AM, Ranti Junus  wrote:

> Great, here comes the Troublesome Cataloger
> 
> 
> In need for some energizer drink,
> ranti.
> 
> On Fri, Dec 2, 2011 at 9:56 AM, Becky Yoose  wrote:
>> I am offended and disappointed by the Rickrolling suggestion. We are a
>> group of professionals and should act as such. Resorting to low brow
>> internet memes only demeans the group, its members, and a profession as a
>> whole.
>> 
>> The submit button in the script should go to a page where the submitter has
>> to correct a malformed XML file before submitting their ballot. For a
>> greater challenge, we could have the voter manually translate raw MARC into
>> MARCXML.
>> 
>> Friday come and me wan' go home,
>> Becky
>> 
>> On Fri, Dec 2, 2011 at 8:38 AM, Ranti Junus  wrote:
>> 
>>> On Dec 1, 2011 8:48 AM, "Ross Singer"  wrote:
 
 [...]
 
 One thing I would be open to is to put a disclaimer splash page before
 any ballot (only to be seen the first time a person votes) briefly
 explaining how the ballot works and to mention that ballot stuffing is
 "unethical, undemocratic and tears at the fabric that is Code4Lib" or
 some such.  I would welcome contributions to the wording.
 [...]
>>> 
>>> +1 for the disclaimer splash screen.
>>> 
>>> Also:
>>> Create a script that detects pandering. When they click the submit button,
>>> rickroll them.  ;-)
>>> 
>>> ranti.
>>> 
> 
> 
> 
> -- 
> Bulk mail.  Postage paid.


Re: [CODE4LIB] Pandering for votes for code4lib sessions

2011-12-02 Thread Frumkin, Jeremy
Speaking of cataloging hell, it appears that some non-library entities
have over-jumped on the bandwagon and are proposing, get this, a metadata
scheme consisting of nearly 250 elements to describe information resources
found in virtual environments such as minecraft, second life (does that
still exist?), and skyrim. Maybe they need some librarians and techies to
give them advice on such - http://ow.ly/7MGo4

-- jaf


Jeremy Frumkin
Assistant Dean / Chief Technology Strategist
University of Arizona Libraries

+1 520.626.7296
frumk...@u.library.arizona.edu

Hanlon's Razor: "Never attribute to malice that which is adequately
explained by stupidity"


On 12/2/11 10:31 AM, "Beanworks"  wrote:

>I like the rickroll idea, myself. Why would we inflict cataloging hell on
>anyone besides yo_bj?
>
>Oh, wait...
>
>:P
>
>Wafted through cyberspace from my iPad
>
>On Dec 2, 2011, at 10:39 AM, Ranti Junus  wrote:
>
>> Great, here comes the Troublesome Cataloger
>> 
>> 
>> In need for some energizer drink,
>> ranti.
>> 
>> On Fri, Dec 2, 2011 at 9:56 AM, Becky Yoose  wrote:
>>> I am offended and disappointed by the Rickrolling suggestion. We are a
>>> group of professionals and should act as such. Resorting to low brow
>>> internet memes only demeans the group, its members, and a profession as a
>>> whole.
>>> 
>>> The submit button in the script should go to a page where the submitter has
>>> to correct a malformed XML file before submitting their ballot. For a
>>> greater challenge, we could have the voter manually translate raw MARC into
>>> MARCXML.
>>> 
>>> Friday come and me wan' go home,
>>> Becky
>>> 
>>> On Fri, Dec 2, 2011 at 8:38 AM, Ranti Junus  wrote:
>>> 
 On Dec 1, 2011 8:48 AM, "Ross Singer"  wrote:
> 
> [...]
> 
> One thing I would be open to is to put a disclaimer splash page before
> any ballot (only to be seen the first time a person votes) briefly
> explaining how the ballot works and to mention that ballot stuffing is
> "unethical, undemocratic and tears at the fabric that is Code4Lib" or
> some such.  I would welcome contributions to the wording.
> [...]
 
 +1 for the disclaimer splash screen.
 
 Also:
 Create a script that detects pandering. When they click the submit button,
 rickroll them.  ;-)
 
 ranti.
 
>> 
>> 
>> 
>> -- 
>> Bulk mail.  Postage paid.


Re: [CODE4LIB] Models of MARC in RDF

2011-12-02 Thread Owen Stephens
Hi Esme - thanks for this. Do you have any documentation on which predicates 
you've used and MODS->RDF transformation?

Owen

On 2 Dec 2011, at 16:07, Esme Cowles  wrote:

> Owen-
> 
> Another strategy for capturing MARC data in RDF is to convert it to MODS (we 
> do this using the LoC MARC to MODS stylesheet: 
> http://www.loc.gov/standards/marcxml/xslt/MARC21slim2MODS.xsl).  From there, 
> it's pretty easy to incorporate into RDF.  There are some issues to be aware 
> of, such as how to map the MODS XML names to predicates and how to handle 
> elements that can appear in multiple places in the hierarchy.
> 
> -Esme
> --
> Esme Cowles 
> 
> "Necessity is the plea for every infringement of human freedom. It is the
> argument of tyrants; it is the creed of slaves." -- William Pitt, 1783
> 
> On 11/28/2011, at 8:25 AM, Owen Stephens wrote:
> 
>> It would be great to start collecting transforms together - just a quick 
>> brain dump of some I'm aware of
>> 
>> MARC21 transformations
>> Cambridge University Library - http://data.lib.cam.ac.uk - transformation 
>> made available (in code) from same site
>> Open University - http://data.open.ac.uk - specific transform for materials 
>> related to teaching, code available at 
>> http://code.google.com/p/luceroproject/source/browse/trunk%20luceroproject/OULinkedData/src/uk/ac/open/kmi/lucero/rdfextractor/RDFExtractor.java
>>  (MARC transform is in libraryRDFExtraction method)
>> COPAC - small set of records from the COPAC Union catalogue - data and 
>> transform not yet published
>> Podes Projekt - LinkedAuthors - documentation at 
>> http://bibpode.no/linkedauthors/doc/Pode-LinkedAuthors-Documentation.pdf - 2 
>> stage transformation firstly from MARC to FRBRized version of data, then 
>> from FRBRized data to RDF. These linked from documentation
>> Podes Project - LinkedNonFiction - documentation at 
>> http://bibpode.no/linkednonfiction/doc/Pode-LinkedNonFiction-Documentation.pdf
>>  - MARC data transformed using xslt 
>> https://github.com/pode/LinkedNonFiction/blob/master/marcslim2n3.xsl
>> 
>> British Library British National Bibliography - 
>> http://www.bl.uk/bibliographic/datafree.html - data model documented, but no 
>> code available
>> Libris.se - some notes in various presentations/blogposts (e.g. 
>> http://dc2008.de/wp-content/uploads/2008/09/malmsten.pdf) but can't find 
>> explicit transformation
>> Hungarian National library - 
>> http://thedatahub.org/dataset/hungarian-national-library-catalog and 
>> http://nektar.oszk.hu/wiki/Semantic_web#Implementation - some information on 
>> ontologies used but no code or explicit transformation (not 100% sure this 
>> is from MARC)
>> Talis - implemented in several live catalogues including 
>> http://catalogue.library.manchester.ac.uk/  - no documentation or code afaik 
>> although some notes in 
>> 
>> MAB transformation
>> HBZ - some of the transformation documented at 
>> https://wiki1.hbz-nrw.de/display/SEM/Converting+the+Open+Data+from+the+hbz+to+BIBO,
>>  don't think any code published?
>> 
>> Would be really helpful if more projects published their transformations (or 
>> someone told me where to look!)
>> 
>> Owen
>> 
>> Owen Stephens
>> Owen Stephens Consulting
>> Web: http://www.ostephens.com
>> Email: o...@ostephens.com
>> Telephone: 0121 288 6936
>> 
>> On 26 Nov 2011, at 15:58, Karen Coyle wrote:
>> 
>>> A few of the code4lib talk proposals mention projects that have or will 
>>> transform MARC records into RDF. If any of you have documentation and/or 
>>> examples of this, I would be very interested to see them, even if they are 
>>> "under construction."
>>> 
>>> Thanks,
>>> kc
>>> 
>>> -- 
>>> Karen Coyle
>>> kco...@kcoyle.net http://kcoyle.net
>>> ph: 1-510-540-7596
>>> m: 1-510-435-8234
>>> skype: kcoylenet


Re: [CODE4LIB] Models of MARC in RDF

2011-12-02 Thread Owen Stephens
Oh - and perhaps just/more importantly - how do you create URIs for your data 
and how do you reconcile against other sources?

Owen

On 2 Dec 2011, at 16:07, Esme Cowles  wrote:

> Owen-
> 
> Another strategy for capturing MARC data in RDF is to convert it to MODS (we 
> do this using the LoC MARC to MODS stylesheet: 
> http://www.loc.gov/standards/marcxml/xslt/MARC21slim2MODS.xsl).  From there, 
> it's pretty easy to incorporate into RDF.  There are some issues to be aware 
> of, such as how to map the MODS XML names to predicates and how to handle 
> elements that can appear in multiple places in the hierarchy.
> 
> -Esme
> --
> Esme Cowles 
> 
> "Necessity is the plea for every infringement of human freedom. It is the
> argument of tyrants; it is the creed of slaves." -- William Pitt, 1783
> 
> On 11/28/2011, at 8:25 AM, Owen Stephens wrote:
> 
>> It would be great to start collecting transforms together - just a quick 
>> brain dump of some I'm aware of
>> 
>> MARC21 transformations
>> Cambridge University Library - http://data.lib.cam.ac.uk - transformation 
>> made available (in code) from same site
>> Open University - http://data.open.ac.uk - specific transform for materials 
>> related to teaching, code available at 
>> http://code.google.com/p/luceroproject/source/browse/trunk%20luceroproject/OULinkedData/src/uk/ac/open/kmi/lucero/rdfextractor/RDFExtractor.java
>>  (MARC transform is in libraryRDFExtraction method)
>> COPAC - small set of records from the COPAC Union catalogue - data and 
>> transform not yet published
>> Podes Projekt - LinkedAuthors - documentation at 
>> http://bibpode.no/linkedauthors/doc/Pode-LinkedAuthors-Documentation.pdf - 2 
>> stage transformation firstly from MARC to FRBRized version of data, then 
>> from FRBRized data to RDF. These linked from documentation
>> Podes Project - LinkedNonFiction - documentation at 
>> http://bibpode.no/linkednonfiction/doc/Pode-LinkedNonFiction-Documentation.pdf
>>  - MARC data transformed using xslt 
>> https://github.com/pode/LinkedNonFiction/blob/master/marcslim2n3.xsl
>> 
>> British Library British National Bibliography - 
>> http://www.bl.uk/bibliographic/datafree.html - data model documented, but no 
>> code available
>> Libris.se - some notes in various presentations/blogposts (e.g. 
>> http://dc2008.de/wp-content/uploads/2008/09/malmsten.pdf) but can't find 
>> explicit transformation
>> Hungarian National library - 
>> http://thedatahub.org/dataset/hungarian-national-library-catalog and 
>> http://nektar.oszk.hu/wiki/Semantic_web#Implementation - some information on 
>> ontologies used but no code or explicit transformation (not 100% sure this 
>> is from MARC)
>> Talis - implemented in several live catalogues including 
>> http://catalogue.library.manchester.ac.uk/  - no documentation or code afaik 
>> although some notes in 
>> 
>> MAB transformation
>> HBZ - some of the transformation documented at 
>> https://wiki1.hbz-nrw.de/display/SEM/Converting+the+Open+Data+from+the+hbz+to+BIBO,
>>  don't think any code published?
>> 
>> Would be really helpful if more projects published their transformations (or 
>> someone told me where to look!)
>> 
>> Owen
>> 
>> Owen Stephens
>> Owen Stephens Consulting
>> Web: http://www.ostephens.com
>> Email: o...@ostephens.com
>> Telephone: 0121 288 6936
>> 
>> On 26 Nov 2011, at 15:58, Karen Coyle wrote:
>> 
>>> A few of the code4lib talk proposals mention projects that have or will 
>>> transform MARC records into RDF. If any of you have documentation and/or 
>>> examples of this, I would be very interested to see them, even if they are 
>>> "under construction."
>>> 
>>> Thanks,
>>> kc
>>> 
>>> -- 
>>> Karen Coyle
>>> kco...@kcoyle.net http://kcoyle.net
>>> ph: 1-510-540-7596
>>> m: 1-510-435-8234
>>> skype: kcoylenet


Re: [CODE4LIB] Pandering for votes for code4lib sessions

2011-12-02 Thread Colford, Scot
Doh! Motherscratcher!

\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/

Scot Colford
Web Services Manager
Boston Public Library

scolf...@bpl.org
Phone 617.859.2399
Mobile 617.592.8669
Fax 617.536.7558

On 12/2/11 12:39 PM, "Frumkin, Jeremy" 
wrote:

>Speaking of cataloging hell, it appears that some non-library entities
>have over-jumped on the bandwagon and are proposing, get this, a metadata
>scheme consisting of nearly 250 elements to describe information resources
>found in virtual environments such as minecraft, second life (does that
>still exist?), and skyrim. Maybe they need some librarians and techies to
>give them advice on such - http://ow.ly/7MGo4
>
>-- jaf
>
>
>Jeremy Frumkin
>Assistant Dean / Chief Technology Strategist
>University of Arizona Libraries
>
>+1 520.626.7296
>frumk...@u.library.arizona.edu
>
>Hanlon's Razor: "Never attribute to malice that which is adequately
>explained by stupidity"
>
>On 12/2/11 10:31 AM, "Beanworks"  wrote:
>
>>I like the rickroll idea, myself. Why would we inflict cataloging hell on
>>anyone besides yo_bj?
>>
>>Oh, wait...
>>
>>:P
>>
>>Wafted through cyberspace from my iPad
>>
>>On Dec 2, 2011, at 10:39 AM, Ranti Junus  wrote:
>>
>>> Great, here comes the Troublesome Cataloger
>>> 
>>> 
>>> In need for some energizer drink,
>>> ranti.
>>> 
>>> On Fri, Dec 2, 2011 at 9:56 AM, Becky Yoose  wrote:
 I am offended and disappointed by the Rickrolling suggestion. We are a
 group of professionals and should act as such. Resorting to low brow
 internet memes only demeans the group, its members, and a profession as a
 whole.
 
 The submit button in the script should go to a page where the submitter has
 to correct a malformed XML file before submitting their ballot. For a
 greater challenge, we could have the voter manually translate raw MARC into
 MARCXML.
 
 Friday come and me wan' go home,
 Becky
 
 On Fri, Dec 2, 2011 at 8:38 AM, Ranti Junus  wrote:
 
> On Dec 1, 2011 8:48 AM, "Ross Singer"  wrote:
>> 
>> [...]
>> 
>> One thing I would be open to is to put a disclaimer splash page before
>> any ballot (only to be seen the first time a person votes) briefly
>> explaining how the ballot works and to mention that ballot stuffing is
>> "unethical, undemocratic and tears at the fabric that is Code4Lib" or
>> some such.  I would welcome contributions to the wording.
>> [...]
> 
> +1 for the disclaimer splash screen.
> 
> Also:
> Create a script that detects pandering. When they click the submit button,
> rickroll them.  ;-)
> 
> ranti.
> 
>>> 
>>> 
>>> 
>>> -- 
>>> Bulk mail.  Postage paid.


Re: [CODE4LIB] Models of MARC in RDF

2011-12-02 Thread Esme Cowles
Owen-

We assign ARKs[1] to our objects (and predicates for that matter).  The issue 
of reconciling against other sources hasn't come up as much, since we have 
mostly focused on our unique objects.  But we have worked on that issue some.  
For example, several years ago, I worked on the UCAI project, where we mapped 
several slide collections to a common schema[2] and did quite a bit of work 
trying to build work records for the collections that didn't have them, and 
match work records across collections.  That project didn't produce a 
copy-cataloging service like we'd hoped, though the Getty is now working on a 
registry[3] of works of art, which would make the task of matching records a 
lot simpler.

1. https://wiki.ucop.edu/display/Curation/ARK
2. http://www.loc.gov/standards/vracore/
3. http://www.getty.edu/research/tools/vocabularies/cona/index.html

-Esme
--
Esme Cowles 

"In the old days, an operating system was designed to optimize the
 utilization of the computer's resources. In the future, its main goal
 will be to optimize the user's time." -- Jakob Nielsen
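A rough illustration of the two questions discussed in this exchange (minting URIs and matching records across collections), offered as a sketch only. The resolver address, the NAAN value, the record fields, and the normalization rule are all assumptions made up for the example; none of them come from the UCAI project or from any production system mentioned here.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical values for illustration only; not taken from the thread.
RESOLVER = "http://example.org/"
NAAN = "99999"  # placeholder Name Assigning Authority Number


def ark_uri(local_id: str) -> str:
    """Build an ARK-style URI of the form <resolver>ark:/<NAAN>/<name>."""
    return f"{RESOLVER}ark:/{NAAN}/{local_id}"


def match_key(title: str, creator: str) -> str:
    """Normalize creator and title into a crude comparison key."""
    text = f"{creator} {title}"
    # Strip diacritics, lowercase, and collapse punctuation and whitespace.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def group_works(records):
    """Group record URIs from several collections by their match key."""
    groups = defaultdict(list)
    for rec in records:
        key = match_key(rec["title"], rec["creator"])
        groups[key].append(ark_uri(rec["id"]))
    return dict(groups)


# Made-up records standing in for entries from different slide collections.
records = [
    {"id": "bb0001", "title": "The Starry Night", "creator": "Gogh, Vincent van"},
    {"id": "bb0002", "title": "the starry night.", "creator": "Gogh, Vincent Van"},
    {"id": "bb0003", "title": "Guernica", "creator": "Picasso, Pablo"},
]
works = group_works(records)
```

Exact-key matching like this only catches trivial differences in case, punctuation, and diacritics. Real reconciliation usually needs fuzzier comparison and human review, which is part of why a shared registry of works would simplify the job.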

On 12/2/2011, at 1:37 PM, Owen Stephens wrote:

> Oh - and perhaps just/more importantly - how do you create URIs for your data 
> and how do you reconcile against other sources?
> 
> Owen
> 
> On 2 Dec 2011, at 16:07, Esme Cowles  wrote:
> 
>> Owen-
>> 
>> Another strategy for capturing MARC data in RDF is to convert it to MODS (we 
>> do this using the LoC MARC to MODS stylesheet: 
>> http://www.loc.gov/standards/marcxml/xslt/MARC21slim2MODS.xsl).  From there, 
>> it's pretty easy to incorporate into RDF.  There are some issues to be aware 
>> of, such as how to map the MODS XML names to predicates and how to handle 
>> elements that can appear in multiple places in the hierarchy.
>> 
>> -Esme
>> --
>> Esme Cowles 
>> 
>> "Necessity is the plea for every infringement of human freedom. It is the
>> argument of tyrants; it is the creed of slaves." -- William Pitt, 1783
>> 
>> On 11/28/2011, at 8:25 AM, Owen Stephens wrote:
>> 
>>> It would be great to start collecting transforms together - just a quick 
>>> brain dump of some I'm aware of
>>> 
>>> MARC21 transformations
>>> Cambridge University Library - http://data.lib.cam.ac.uk - transformation 
>>> made available (in code) from same site
>>> Open University - http://data.open.ac.uk - specific transform for materials 
>>> related to teaching, code available at 
>>> http://code.google.com/p/luceroproject/source/browse/trunk%20luceroproject/OULinkedData/src/uk/ac/open/kmi/lucero/rdfextractor/RDFExtractor.java
>>>  (MARC transform is in libraryRDFExtraction method)
>>> COPAC - small set of records from the COPAC Union catalogue - data and 
>>> transform not yet published
>>> Podes Projekt - LinkedAuthors - documentation at 
>>> http://bibpode.no/linkedauthors/doc/Pode-LinkedAuthors-Documentation.pdf - 
>>> 2 stage transformation firstly from MARC to FRBRized version of data, then 
>>> from FRBRized data to RDF. These linked from documentation
>>> Podes Project - LinkedNonFiction - documentation at 
>>> http://bibpode.no/linkednonfiction/doc/Pode-LinkedNonFiction-Documentation.pdf
>>>  - MARC data transformed using xslt 
>>> https://github.com/pode/LinkedNonFiction/blob/master/marcslim2n3.xsl
>>> 
>>> British Library British National Bibliography - 
>>> http://www.bl.uk/bibliographic/datafree.html - data model documented, but 
>>> no code available
>>> Libris.se - some notes in various presentations/blogposts (e.g. 
>>> http://dc2008.de/wp-content/uploads/2008/09/malmsten.pdf) but can't find 
>>> explicit transformation
>>> Hungarian National library - 
>>> http://thedatahub.org/dataset/hungarian-national-library-catalog and 
>>> http://nektar.oszk.hu/wiki/Semantic_web#Implementation - some information 
>>> on ontologies used but no code or explicit transformation (not 100% sure 
>>> this is from MARC)
>>> Talis - implemented in several live catalogues including 
>>> http://catalogue.library.manchester.ac.uk/  - no documentation or code 
>>> afaik although some notes in 
>>> 
>>> MAB transformation
>>> HBZ - some of the transformation documented at 
>>> https://wiki1.hbz-nrw.de/display/SEM/Converting+the+Open+Data+from+the+hbz+to+BIBO,
>>>  don't think any code published?
>>> 
>>> Would be really helpful if more projects published their transformations 
>>> (or someone told me where to look!)
>>> 
>>> Owen
>>> 
>>> Owen Stephens
>>> Owen Stephens Consulting
>>> Web: http://www.ostephens.com
>>> Email: o...@ostephens.com
>>> Telephone: 0121 288 6936
>>> 
>>> On 26 Nov 2011, at 15:58, Karen Coyle wrote:
>>> 
>>>> A few of the code4lib talk proposals mention projects that have or will 
>>>> transform MARC records into RDF. If any of you have documentation and/or 
>>>> examples of this, I would be very interested to see them, even if they are 
>>>> "under construction."
>>>> 
>>>> Thanks,
>>>> kc
>>>> 
>>>> -- 
>>>> Karen Coyle
>>>> kco...@kcoyle.net http://kcoyle.net
>>>> ph: 1-510-540-7596
>>>> m: 1
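
The MODS route described earlier in this thread raises two issues: how to map MODS XML names to predicates, and how to handle elements that appear in several places in the hierarchy. One possible approach, sketched below under stated assumptions, is to build each predicate from the element's full path so that a top-level title and a title nested under relatedItem stay distinct. The predicate namespace, the subject URI, and the sample record are hypothetical, and this is not the mapping any project in the thread actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical predicate namespace, made up for this sketch.
PRED_BASE = "http://example.org/mods#"

# A tiny MODS fragment in which <title> appears at two levels.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Moby Dick</title></titleInfo>
  <relatedItem type="series">
    <titleInfo><title>Classic Novels</title></titleInfo>
  </relatedItem>
</mods>"""


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.split("}", 1)[-1]


def walk(elem, path, subject, triples):
    """Emit one triple per text-bearing leaf, with a path-based predicate."""
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        if text:
            triples.append((subject, PRED_BASE + "_".join(path), text))
        return
    for child in children:
        walk(child, path + [local_name(child.tag)], subject, triples)


def mods_to_triples(xml_text: str, subject: str):
    """Flatten a MODS record into (subject, predicate, literal) tuples."""
    root = ET.fromstring(xml_text)
    triples = []
    walk(root, [], subject, triples)
    return triples


triples = mods_to_triples(SAMPLE, "http://example.org/object/1")
for triple in triples:
    print(triple)
```

Path-based predicates avoid collisions but multiply the number of predicates and ignore attributes such as type="series". A real mapping would more likely use blank nodes or nested resources for relatedItem, so treat this only as a way to see the problem.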

[CODE4LIB] Position open: Digitization Technologist

2011-12-02 Thread Jody DeRidder
** Please excuse cross-postings **

Digital Services at the University of Alabama in Tuscaloosa, Alabama is
seeking excellent candidates for new positions.

The Digitization Technologist will analyze technical problems and devise
solutions, as well as seek out methods for new and better functionality.
Research into improved methods of delivery and preservation support is
expected.

The Digitization Technologist position requires an individual who is
self-motivated, curious, and eager to learn and explore. The successful
applicant will have a good base of understanding of a variety of
technologies related to digitization and digital libraries, and will be
capable of quickly processing and integrating new technical information
and developments. This position requires strong analytical problem-solving
capabilities and technical expertise. Command-line scripting capabilities
are expected.

This Digitization Technologist will be involved in developing software
support for digitization, preservation and delivery work flows.
Additionally, this position will be responsible for capture, quality
control, optimization, gathering of administrative, technical, structural,
and descriptive metadata, and tracking, archiving and storage of resulting
digital objects. This position will maintain appropriate hardware and
software and perform other duties as needed. Supervision of up to two (2)
students may be assigned.

This is an exempt (salaried) full time position with benefits.
Minimum pay:  $26,062.40 per year.

Required Minimum Qualifications:

* Associate's degree or higher and at least two (2) years of training or
experience in digitization software, hardware, and applications; OR a
Bachelor's degree or higher and at least six (6) months of training or
experience in digitization software, hardware, and applications.

* Competence in basic computer skills which include use of Windows and/or
Macintosh desktops, Office Productivity Suite, demonstrated use of
Photoshop and Bridge or similar software programs and other digital
conversion software. Fluency in Microsoft Excel.

* Evidence of self-motivation and eagerness to learn and explore.
* Evidence of understanding of a variety of technologies related to
digitization and digital libraries.
* Evidence of capability to quickly process and integrate new technical
information and developments.
* Evidence of strong analytical problem-solving capabilities.
* Demonstrated technical expertise, including command-line scripting such as
MS Basic, Perl and Python.
* Demonstrated research capabilities.
* Demonstrated knowledge in the creation and management of digital
information.
* Excellent organizational skills: the ability to juggle multiple
competing priorities and achieve goals, and to manage projects from design
to completion.
* Interest in archival formats and archival and delivery standards.
* Sensitivity to the handling issues of fragile library materials.
* Adaptability.
* Strong oral and written communication skills.
* Ability to work independently and effectively in a collaborative setting.


Preferred Qualifications:
* Demonstrated expertise in the creation and management of digital
information in a library, archive, or museum setting.
* Expertise in color calibration, color management systems, audio
digitization and optimization.
* Knowledge of diacritics and encoding issues, particularly between MS
Office product exports and UTF-8.
* Knowledge of metadata standards such as EAD and MODS.
* Background in photography or digital imaging technologies
OR background in analog/digital audio capture and editing.
* Working knowledge of XML, HTML, and XSLT.
* Experience in designing and interacting with databases (Access, SQL).

For more information, or to apply for this position, please go to
http://staffjobs.ua.edu and search for "Digitization Technologist" in the
Working Title field.



Jody DeRidder
Digital Services
University of Alabama Libraries
Tuscaloosa, Alabama 35487
(205) 348-0511
j...@jodyderidder.com
jlderid...@ua.edu


[CODE4LIB] Position open: Digital Repository Coordinator

2011-12-02 Thread Jody DeRidder
** Please excuse cross-postings **

Digital Services at the University of Alabama in Tuscaloosa, Alabama is
seeking excellent candidates for new positions.

The Digital Repository Coordinator will be focused on the management and
support of our growing repository of content, as well as development of
support for new and better functionality. This person will research
software developments in the field and potentially modify those we adopt
to meet our own needs. He or she would be deeply involved in
infrastructure development and support for long-term access. The Digital
Repository Coordinator will analyze technical and work flow problems and
devise and implement solutions.

This Digital Repository Coordinator will be involved in developing and
managing software support for digitization, preservation and delivery work
flows. Additionally, this position will be responsible for capture,
quality control, optimization, gathering of administrative, technical,
structural, and descriptive metadata, and tracking, archiving and storage
of resulting digital objects. This position will maintain appropriate
hardware and software and perform other duties as needed. Supervision of
up to two (2) students may be assigned.

This is an exempt (salaried) full time position with benefits.
Minimum pay:  $26,062.40 per year.

Required Minimum Qualifications:

* Associate's degree or higher and at least two (2) years of training or
experience in digitization software, hardware, and applications; OR a
Bachelor's degree or higher and at least six (6) months of training or
experience in digitization software, hardware, and applications.

* Competence in basic computer skills which include use of Windows and/or
Macintosh desktops, Office Productivity Suite, demonstrated use of
Photoshop and Bridge or similar software programs and other digital
conversion software. Fluency in Microsoft Excel.

* Evidence of self-motivation and eagerness to learn and explore.
* Evidence of understanding of a variety of technologies related to
digitization and digital libraries.
* Evidence of capability to quickly process and integrate new technical
information and developments.
* Evidence of strong analytical problem-solving capabilities.
* Demonstrated technical expertise, including command-line scripting such as
MS Basic, Perl and Python.
* Capability of reading and successfully modifying code in languages such
as JavaScript, JSON, PHP, Java, XSLT.
* Working knowledge of XML and HTML.
* Experience in designing and interacting with SQL databases.
* Demonstrated research capabilities.
* Demonstrated knowledge in the creation and management of digital
information.
* Excellent organizational skills: the ability to juggle multiple
competing priorities and achieve goals, and to manage projects from design
to completion.
* Interest in archival formats and archival and delivery standards.
* Sensitivity to the handling issues of fragile library materials.
* Adaptability.
* Strong oral and written communication skills.
* Ability to work independently and effectively in a collaborative setting.


Preferred Qualifications:
* Demonstrated expertise in the creation and management of digital
information in a library, archive, or museum setting.
* Expertise in color calibration, color management systems, audio
digitization and optimization.
* Knowledge of diacritics and encoding issues, particularly between MS
Office product exports and UTF-8.
* Knowledge of metadata standards such as EAD and MODS.
* Experience in computer programming, both server-side and web delivery.
* Academic degree in Computer Science.
* Background in photography or digital imaging technologies
OR background in analog/digital audio capture and editing.
* Working knowledge of XML, HTML, and XSLT.
* Experience in designing and interacting with databases (Access, SQL).

For more information, or to apply for this position, please go to
http://staffjobs.ua.edu and search for "Digital Repository Coordinator" in
the Working Title field.



Jody DeRidder
Digital Services
University of Alabama Libraries
Tuscaloosa, Alabama 35487
(205) 348-0511
j...@jodyderidder.com
jlderid...@ua.edu