[CODE4LIB] Seth Godin on The future of the library

2011-05-16 Thread Mike Taylor
Seth Godin is not a library professional -- he's a marketing guru with
a string of best-selling books and a blog that manages to be both
insightful AND brief on an astonishingly consistent basis.
(http://sethgodin.typepad.com/ -- highly recommended).  So he's
outside the library world, looking in, and has a track record of
seeing far and clear.

Which means he's probably worth paying attention to when he writes
about The Future Of The Library, as he does in the newest post on his
blog:

http://sethgodin.typepad.com/seths_blog/2011/05/the-future-of-the-library.html

To summarise: "The library is a house for the librarian ...  [Kids]
need a librarian more than ever (to figure out creative ways to find
and use data). They need a library not at all ...  We need librarians
more than we ever did. What we don't need are mere clerks who guard
dead paper."

-- Mike.


[CODE4LIB] linked data endpoints

2011-05-16 Thread Eric Lease Morgan
What are some of the ways to best insert Linked Data endpoints into an XML file?

I have been playing lately with named-entity recognition/extraction technology. 
[1] Feed a text file, such as a novel, into the recognition program. Get back a 
rudimentary XML file where things like names, places, and organizations are 
marked with simple tags. I can then extract all the place names from a text, 
tabulate them, display a word-cloud, allow the reader to select items, guess 
latitude and longitude of the place, and finally plot them on a map. [2] This 
process works pretty well, but Google Maps only allows me to plot a limited 
number of items at a time. Consequently, I am thinking about preprocessing my 
data by looping through the XML file and adding latitude and longitude 
attributes to the place name elements.
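
A minimal sketch of that preprocessing step, assuming the NER output wraps each place in a `place` element and that coordinates come from some lookup table. The element name, the attribute names `lat` and `long`, the sample text, and the small gazetteer (with approximate coordinates) are all hypothetical, chosen only for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical NER output; the element name "place" is an assumption.
NER_XML = (
    "<text><place>Concord</place> is far from <place>Athens</place> "
    "and <place>Atlantis</place>.</text>"
)

# Hypothetical gazetteer with approximate coordinates. In practice the
# values would come from whatever geocoding source is being used.
GAZETTEER = {
    "Concord": (42.4604, -71.3489),
    "Athens": (37.9838, 23.7275),
}


def add_coordinates(xml_text, gazetteer, tag="place"):
    """Return the XML with lat/long attributes added to known places."""
    root = ET.fromstring(xml_text)
    for element in root.iter(tag):
        name = (element.text or "").strip()
        coords = gazetteer.get(name)
        if coords is None:
            # Unknown place: leave the element untouched.
            continue
        lat, lon = coords
        element.set("lat", f"{lat:.4f}")
        element.set("long", f"{lon:.4f}")
    return ET.tostring(root, encoding="unicode")


enriched = add_coordinates(NER_XML, GAZETTEER)
print(enriched)
```

With the coordinates stored in the file, the map page would only need to read attributes rather than geocode every name on each request.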

I then got to thinking about names and organizations. It would be nice to 
supplement these entities with canonical Linked Data endpoints. My application 
could then read the endpoints, extract the links associated with them, and 
display some sort of graphic illustrating relationships. Finally, I could allow 
the reader to select a relationship for further investigation.

Given a name -- say, Plato or Thoreau -- how would one go about identifying 
good endpoints? What sort of query would I send to what sort of "database"? 
What might I get back? Assuming my goal is to enrich the text, what sort of 
link(s) should I insert into my XML?

[1] NER - http://bit.ly/e0SnA6
[2] geo-location for WebKit mobile - http://bit.ly/msIu16

-- 
Eric Morgan
University of Notre Dame


[CODE4LIB] EADitor: XForms for EAD beta .1105 released

2011-05-16 Thread Ethan Gruber
Apologies to those who may also be on the EAD list who would have already
received this email.  EADitor is one of several active XForms projects
detailed in "XForms for Libraries: An Introduction", an article in the 11th
issue of the Code4Lib Journal (http://journal.code4lib.org/articles/3916).

I'm pleased to announce a new, much-overdue EADitor beta, .1105.

EADitor is an XForms framework for the creation and editing of Encoded
Archival Description (EAD) finding aids using
Orbeon, an enterprise-level XForms Java
application, which runs in Apache Tomcat.  Although the web form is
certainly the most important aspect of the application since it can be
integrated with existing content management and dissemination systems,
EADitor also includes an easily customizable public interface for searching,
sorting, and browsing collections of finding aids. This enables institutions
to use a single application for content creation and publication.

FEATURES
* Create and edit EAD finding aids adhering to the EAD 2002 schema (elements
are represented at almost every level in the finding aid, with the notable
exception of mixed content at the paragraph level).
* Import EAD 2002 schema or DTD-compliant finding aids into EADitor
* An administrative user interface for publishing/unpublishing finding aids
* Simple component reordering interface
* Controlled vocabulary integration with auto-suggest, including LCSH terms
and local vocabularies in subject, persname, famname, corpname, geogname,
and genreform.  Languages also draw on a controlled vocabulary.
* Set default templates for the EAD core and components
* A form for setting agency codes
* Public interface for searching, browsing, and viewing finding aids (based
on Solr).
* Atom feed for published finding aids

EADitor is still a work in progress, but will advance more consistently now
that it is officially supported by the American Numismatic Society.
Ultimately, I would like to integrate other controlled vocabulary services.
One of the most important issues to address moving forward is better
documentation.

MORE INFORMATION

EADitor project site (Google Code): http://code.google.com/p/eaditor/
Installation instructions (specific to Ubuntu but broadly applicable to all
Unix-based systems):
http://code.google.com/p/eaditor/wiki/UbuntuInstallation
Google Group: http://groups.google.com/group/eaditor
EADitor blog (will be the primary medium for providing information and
progress on the project): http://eaditor.blogspot.com/


Re: [CODE4LIB] linked data endpoints

2011-05-16 Thread Arash.Joorabchi
Hi Eric,

If you think Wikipedia articles could be used as good endpoints for your
purposes, then have a look at this open-source tool:
http://wikipedia-miner.sourceforge.net/



Re: [CODE4LIB] linked data endpoints

2011-05-16 Thread Devon
Eric,

Jean Godby and I have been looking into this very problem. First, I
want to draw your attention to the difference between NER and the
subsequent problem of Identity Resolution. For example, in a given
text, an NER tool would identify "Kennedy" as a name, but that name
could refer to several different people. If you're able to get more
information (dates, titles, etc.) from the text for a given reference,
you can do a better job of resolving the correct identity. Second,
Jean and I planned to use WorldCat Identities [1] as our endpoint and
as a part of our identity resolution mechanism. With extra data, like
a birth and/or death year, you can really zero in on an identity.
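
A minimal sketch of how extra data narrows a name down to one identity. The candidate list and the `example:` identifiers are hypothetical placeholders, not real WorldCat Identities records or any actual lookup API; only the filtering idea is illustrated:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    identifier: str  # placeholder identifier, not a real endpoint URI
    name: str
    birth: Optional[int] = None
    death: Optional[int] = None


# Hypothetical candidates that an NER hit on "Kennedy" might map to.
CANDIDATES = [
    Candidate("example:kennedy-john-f", "Kennedy, John F.", 1917, 1963),
    Candidate("example:kennedy-robert-f", "Kennedy, Robert F.", 1925, 1968),
    Candidate("example:kennedy-edward-m", "Kennedy, Edward M.", 1932, 2009),
]


def resolve(surname, candidates, birth=None, death=None):
    """Filter candidates by surname, then by any dates found in the text."""
    matches = [
        c for c in candidates if c.name.lower().startswith(surname.lower())
    ]
    if birth is not None:
        matches = [c for c in matches if c.birth == birth]
    if death is not None:
        matches = [c for c in matches if c.death == death]
    return matches


ambiguous = resolve("Kennedy", CANDIDATES)
resolved = resolve("Kennedy", CANDIDATES, death=1968)
print(len(ambiguous), [c.identifier for c in resolved])
```

The name alone leaves every candidate in play; a single death year is enough to leave one.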

[1] http://www.worldcat.org/identities

/dev

-- 
Devon Smith
Consulting Software Engineer
OCLC Office of Research
http://www.oclc.org/research/people/smith.htm

-- 
Sent from my GMail account.


Re: [CODE4LIB] linked data endpoints

2011-05-16 Thread Ross Singer
Hi Eric,

I'm not sure that there can be answers to your question without some
more information first (and, possibly, more mining on your part).

First off, these "names" -- can they be fairly confidently identified
as identities?  If so, VIAF (http://viaf.org/) would be your first
step.  While VIAF has some links to DBpedia/Wikipedia (at least it
used to), it doesn't seem to have them for either of your examples
(Plato, Thoreau).  It does, however, provide owl:sameAs links to the
National Library of Sweden's and the German National Library's
resources, which have much better coverage of DBpedia/Wikipedia.
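
Following owl:sameAs links outward from a starting record can be sketched as a simple graph walk. The triples below are hypothetical placeholders under example.org (not real VIAF, library, or DBpedia URIs), and the sketch treats sameAs as symmetric and transitive:

```python
from collections import deque

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical triples; every URI here is a placeholder for illustration.
TRIPLES = [
    ("http://example.org/viaf/plato", OWL_SAME_AS,
     "http://example.org/dnb/plato"),
    ("http://example.org/dnb/plato", OWL_SAME_AS,
     "http://example.org/dbpedia/Plato"),
    ("http://example.org/viaf/thoreau", OWL_SAME_AS,
     "http://example.org/libris/thoreau"),
]


def same_as_closure(start, triples):
    """Return every URI reachable from start via owl:sameAs links."""
    neighbours = {}
    for subject, predicate, obj in triples:
        if predicate != OWL_SAME_AS:
            continue
        # sameAs is symmetric, so record the link in both directions.
        neighbours.setdefault(subject, set()).add(obj)
        neighbours.setdefault(obj, set()).add(subject)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return sorted(seen)


plato_links = same_as_closure("http://example.org/viaf/plato", TRIPLES)
print(plato_links)
```

In this toy data the starting record has no direct DBpedia link, but one turns up two hops away through the intermediate library record.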

You might also want to look at OpenCalais for identifying resources in
your text: http://www.opencalais.com/.  Sindice (http://sindice.com/)
is another option, but for it to be useful you're going to need to
filter its results considerably.

Anyway, you're going to have to know what you have and what sort of
thing you're hoping to link to before you can do much.

-Ross.



[CODE4LIB] code4lib NYC SIG meeting Tuesday

2011-05-16 Thread Yitzchak Schaffer
Apologies for cross-posting. Just a last-minute reminder of the code4lib 
NYC METRO SIG meeting tomorrow morning, Tues May 17, 10am-12 noon at METRO, 
57 East 11th Street in New York. Come hang out, share your latest 
project, etc.


code4lib is:
http://code4lib.org/about

Free! Register at:
http://www.metro.org/en/cev/86

--
Yitzchak Schaffer
Systems Manager
Touro College Libraries
212.742.8770 ext. 2432
http://www.tourolib.org/


Re: [CODE4LIB] ajaxy CRUD / weeding helper

2011-05-16 Thread Doran, Michael D
Hi Ken,

If Wittenberg University were a Voyager ILS library, I would point you towards 
the (free, open-source) ShelfLister client [1].  Although not "ajaxy", one of 
the specific use cases is collection development/weeding projects and it meets 
many of your requirements:

 - current shelf-list
 - useful bibdata like title, pubinfo [...] total-checkouts
 - iPad/laptop-friendly
 - interface that will allow us to mark individual fields

Plus it has some nice extra features: The shelf list view allows the user to 
toggle between call numbers and titles.  The item view includes links to the 
OPAC, WorldCat, and Google Books.  The back-end is the actual underlying ILS 
database, so all the data (including item status) is real-time.

This doesn't help you (unless you want to try and port it to your ILS), but 
Voyager libraries might want to take a look.  Presentations given at the ELUNA 
conference last week provide an introduction to the current version of the 
client and show how to utilize the marked items file to bulk update the catalog 
(if desired) [2].

-- Michael

[1] http://rocky.uta.edu/doran/shelflister/

[2] http://rocky.uta.edu/doran/presentations/ShelfLister20intro.pptx
http://rocky.uta.edu/doran/presentations/ShelfLister20markeditemsEluna.pptx

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# do...@uta.edu
# http://rocky.uta.edu/doran/

> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ken
> Irwin
> Sent: Thursday, May 12, 2011 8:07 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: [CODE4LIB] ajaxy CRUD / weeding helper
> 
> Hi all,
> 
> I'm about to embark upon a summer weeding project, and would like to do so
> with the help of a little web tool - perhaps one that you've already invented
> or for which a generic AJAX-based CRUD interface already exists. (Mostly I
> think I'm just looking for a low-power AJAX-based CRUD thing.) I'm going to
> describe what I want it to do, and perhaps you can tell me if you think
> someone has already done the heavy lifting on creating something like this.
> 
> Back end: a database containing the current shelf-list along with some useful
> bibdata like title, pubinfo, last-checkout-date, total-checkouts, date added
> to collection, plus some fields for the information we'll be inputting on the
> front end.
> 
> Front end: an iPad/laptop-friendly touch-or-click interface that will allow
> us to mark individual fields. Some of those would be multiple choice fields
> like "condition of the book". A free text note field. A few Booleans (e.g.
> "Someone says this book is a classic and we may never discard it.", "Listed
> in Best Books for Acad Lib", "I propose we weed this book")
> 
> The idea for this interface would be to allow the (de)selector to make notes
> on each title as s/he goes down the shelf. The selector would be able to
> easily see bib data and would be able to change the data as the process goes
> on. I'd prefer to do this on an AJAX model so the database is updated in real
> time rather than relying on more overt form submission.
> 
> I described this as a CRUD (Create, Read, Update, Delete) thing, but I
> guess it's really just "U" - updating.
> 
> Do you have a nice easy tool for doing AJAX-y db updating from a UI that
> would allow for the various types of input (pulldowns, Boolean, text fields)?
> Preferably, I'd like to be able to do some visual renderings of some of the
> data to match our in-house "sticker" system - yellow dots for "we might weed
> this", green for "we gotta keep this".
> 
> Any ideas? I would love to not re-invent this wheel.
> 
> Thanks!
> Ken


Re: [CODE4LIB] linked data endpoints

2011-05-16 Thread Jon Gorman
Just to clarify, are you picturing some sort of feedback loop?  I'm
just trying to get a better picture of the process (sounds like an
interesting project).

In other words, do you have something like:

1) take in a full-text document (like, say, a novel?)
2) Run it through NER, pull out locations, places, things.
3) Have a user who's read the novel (or perhaps display those words in
context?) go through each of the locations and pick a lat & long using
Google Maps as an interface (i.e., say that this "Dublin" is Dublin, OH,
not Dublin, Ireland).
4) Do something similar with names, only using some sort of resource
like DBpedia to display possible individuals?
5) Mark up the original file in an XML doc w/ identifiers around those
occurrences?

Is that what you're picturing?
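
The last step above can be sketched roughly as follows, assuming the earlier steps have already produced a mapping from each recognized name to a chosen identifier. The `name` element, the `ref` attribute, the `text` wrapper, and the example.org identifiers are all hypothetical:

```python
import re
from xml.sax.saxutils import escape, quoteattr


def mark_up(text, identifiers, tag="name"):
    """Wrap each known name in an element carrying its identifier."""
    if not identifiers:
        return "<text>" + escape(text) + "</text>"
    # Try longer names first so a longer match wins over a shorter one.
    names = sorted(identifiers, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b"
    )
    pieces = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches so the result stays valid XML.
        pieces.append(escape(text[last:match.start()]))
        name = match.group(1)
        ref = quoteattr(identifiers[name])
        pieces.append(f"<{tag} ref={ref}>{escape(name)}</{tag}>")
        last = match.end()
    pieces.append(escape(text[last:]))
    return "<text>" + "".join(pieces) + "</text>"


# Hypothetical identifiers chosen by a reader in the earlier steps.
IDENTIFIERS = {
    "Plato": "http://example.org/id/plato",
    "Thoreau": "http://example.org/id/thoreau",
}

marked = mark_up("Thoreau read Plato & others.", IDENTIFIERS)
print(marked)
```

A plain string match like this cannot tell two people with the same name apart, which is why the human (or identity-resolution) step comes before it.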

Jon G.

Who doesn't really know enough about linked data to contribute, but is
interested nonetheless.


[CODE4LIB] Amy Donahue is out of the office.

2011-05-16 Thread Amy Donahue
I will be out of the office starting 05/14/2011 and will not return until
05/20/2011.

I will respond to your message when I return.  If you need immediate
assistance, please e-mail auroralibrar...@aurora.org or call the Aurora St.
Luke Medical Center Library at (414) 649-7356.


[CODE4LIB] Batchload update records transfer from Millennium to OCLC

2011-05-16 Thread Elisa Graydon
Does anyone have any experience transferring records from Millennium to OCLC 
for batch load updating? I need to transfer records and every time I try, the 
transfer fails. I have called OCLC and Millennium multiple times and neither 
has been able to help me. It is not an issue with a firewall. I really can't 
figure this out so any input would be greatly appreciated!!!

Regards,

Elisa


Re: [CODE4LIB] Batchload update records transfer from Millennium to OCLC

2011-05-16 Thread Kyle Banerjee
On Mon, May 16, 2011 at 8:05 AM, Elisa Graydon  wrote:

> Does anyone have any experience transferring records from Millennium to
> OCLC for batch load updating? I need to transfer records and every time I
> try, the transfer fails. I have called OCLC and Millennium multiple times
> and neither has been able to help me. It is not an issue with a firewall. I
> really can't figure this out so any input would be greatly appreciated!!!
>

The IUG list might be the ticket for this problem as there will be more
people who perform this specific operation.

Are you trying to transfer directly from Mil to OCLC or are you downloading
to your computer first? Also, what is the procedure you are using, and where
does the failure take place (i.e. connecting to OCLC, logging in,
transferring data, processing afterwards)?

kyle


Re: [CODE4LIB] EADitor: XForms for EAD beta .1105 released

2011-05-16 Thread Chris Fitzpatrick
On a related note, I also just pushed some major code updates to the Orbeon 
XForms application we use at Stanford for MODS and TEI editing. 

It's at https://github.com/cfitz/orbeon-forms . 
It uses the Orbeon Form Runner forms environment, which you can read about 
here: http://www.orbeon.com/forms/orbeon-form-runner . 

If anyone has any questions/comments/etc., feel free to ping me...

best, chris. 




Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-16 Thread Luciano Ramalho
Mike, thanks for the link to Seth's excellent post.

I do take issue with this paragraph, though:

"""
And then we need to consider the rise of the Kindle. An ebook costs
about $1.60 in 1962 dollars. A thousand ebooks can fit on one device,
easily. Easy to store, easy to sort, easy to hand to your neighbor.
Five years from now, readers will be as expensive as Gillette razors,
and ebooks will cost less than the blades.
"""

I own a Kindle and like it very much, but that sounds like Amazon.com
PR. My points:

1) Why quote the ebook price in 1962 dollars? The reality in 2011 is
that Kindle books in general are too expensive, particularly when
comparing their cost with the paper counterparts (think about variable
costs in paperbacks, logistics etc; it is pretty obvious the cost
reductions are not being fully reflected in consumer prices). Given
the current situation, I see no evidence that ebooks will cost less
than razor blades, ever.

2) "easy to hand to your neighbor", sure, if you don't mind being
without your entire collection, or if you have several spare Kindles
(but you are limited to sharing books among just a few Kindles). The
whole point of DRM is to hinder sharing anything with your neighbor.

3) I totally support librarians pushing for ebook lending solutions,
and not only for the sake of the future relevance of libraries, but
because I want to have better options for sharing my ebooks with my
friends (actually, anyone who does not live in the US cannot lend
Kindle ebooks at this time; meanwhile Amazon.com is very happy selling
them to us via "free" 3G).

Otherwise, a great post.

Cheers,

Luciano





-- 
Luciano Ramalho
programador repentista || stand-up programmer
Twitter: @luciano