[CODE4LIB] Hours on Library Websites?

2016-07-07 Thread Matt Sherman
Hi all,

We are working on a website migration/redesign into WordPress and I am
trying to figure out an automated solution for posting and keeping up
to date the hours on the home page.  I am wondering, how do other
institutions manage this?  Are there any good tools I should be
looking into?  Any insights or suggestions are appreciated.

Matt Sherman


Re: [CODE4LIB] C4L17 - Potential Venue Shift to LA and Call for Proposals

2016-06-15 Thread Matt Sherman
Given what I remember just from the work it took for the program
committee to do our little section I cannot imagine a local planning
committee pulling it off in less time than Brian has outlined, and it
is probably tricky to do it even in that time-frame.  Thanks Brian and
the Chattanooga folks for providing a good outline to move forward.

On Wed, Jun 15, 2016 at 4:22 PM, Edward M. Corrado  wrote:
> I support the timeline proposed by Brian.
>
> Edward
>
> On Wed, Jun 15, 2016 at 2:51 PM, Sarah H Shealy  wrote:
>
>> +1
>>
>>
>> I think the timeline provided by Brian is reasonable.
>>
>>
>> But it's TN, not NC.
>>
>>
>> Sarah
>>
>> 
>> From: Code for Libraries  on behalf of Jonathan
>> Rochkind 
>> Sent: Wednesday, June 15, 2016 3:38:27 PM
>> To: CODE4LIB@LISTSERV.ND.EDU
>> Subject: Re: [CODE4LIB] C4L17 - Potential Venue Shift to LA and Call for
>> Proposals
>>
>> I wouldn't have even done a vote at all -- I think when we vote on
>> conference hosts, we are choosing people to steward the conference and make
>> sure it happens, as good as it can be using their judgement for what that
>> looks like and how to make it happen.  The fact that the NC folks are
>> attempting to make sure the torch can get passed instead of just throwing
>> up their hands and saying "it's back at you, community, we're no longer
>> involved" shows that stewardship was well-placed. I think it would have
>> been totally appropriate for them to simply pass the torch.
>>
>> But if votes are going to happen, they need to happen as quickly as
>> possible if you want the conf to actually come off, at least in the
>> spring.  How is "7 days after a credible proposal that includes financial
>> backing" not an "arbitrary deadline"?  Are you willing to wait forever for
>> such a "credible proposal" to show up? Who decides if it's "credible"?
>> Once a proposal shows up, anyone else that was trying to work on a proposal
>> now has exactly 7 days to get one in, but they had no idea what their
>> deadline was until the first proposal showed up, which hopefully they
>> noticed on the email list so they know what their deadline is now?  Or only
>> the first proposal to get in gets a yes/no vote, and anyone else doesn't
>> get included in the vote, first to get the proposal to email wins?
>>
>> There are a bunch of different ways it could be done, but calendar dates
>> are important for an orderly process, and speedy calendar dates are
>> important for the conf to actually happen, and I think nitpicking and
>> arguing over the process the NC folks have chosen is pointless, they were
>> entrusted to steward the thing, the process they've come up with is
>> reasonable, just go with it.
>>
>> On Wed, Jun 15, 2016 at 3:20 PM, Cary Gordon  wrote:
>>
>> > I think that we should avoid arbitrary limits such as a July 1st
>> deadline.
>> > We should open up any credible proposal that includes financial backing
>> to
>> > discussion and a vote closing seven days after the proposal is posted to
>> > this list.
>> >
>> > Cary
>> >
>> > > On Jun 15, 2016, at 12:05 PM, Brian Rogers 
>> wrote:
>> > >
>> > > Greetings once more from the Chattanooga Local Planning Committee -
>> > >
>> > > We come with another update regarding the annual Code4Lib conference.
>> > After the announcement of our survey, two other groups immediately
>> reached
>> > out about the possibility of hosting the conference. Of those two, the
>> one
>> > that is the most confident about being able to secure a fiscal host and
>> > still pull off everything within the existing timeframe, is the LA-based
>> > C4L-SoCal. We spoke with three of their members earlier in the week -
>> Gary
>> > Thompson, Christina Salazar, and Joshua Gomez. After discussion, we
>> > collectively envision a collaboration between the two groups, given the
>> > effort, energy and commitment the Chattanooga group has already invested.
>> > The LA group would handle more of the venue and local arrangements, with
>> > the Chattanooga group helping spearhead other planning elements.
>> > >
>> > > Thus, the idea is to host the annual conference in the greater LA area.
>> > >
>> > > However, even though Chattanooga's proposal was the only one put forth
>> > for next year, since this suggestion does reflect a significant change,
>> and
>> > because LA is still working on securing a fiscal host, we are proposing
>> to
>> > the community the following:
>> > >
>> > > - Since a handful of individuals came forth w/alternative cities
>> > subsequent to my last update, any group who now wishes to put forth a
>> > proposal, do so by July 1st.
>> > > - Given the specter of timecrunch, we ask anyone, including LA, who
>> > would put forth another city, to only do so with written confirmation of
>> a
>> > fiscal host by that same deadline.
>> > > - If more than one city has 

Re: [CODE4LIB] Formalizing Code4Lib? [diy]

2016-06-08 Thread Matt Sherman
Eric,

Thanks for tossing these ideas out there.  A number of these ideas had
not occurred to me, even though I've been wanting to see more local
events.  What you and Kyle are saying resonates far more than I
would have initially thought.  I think in general one of the great
things with Code4Lib has been its focus on hashing out projects
and ideas, helping one another learn new things, consider new ideas
and approaches, and build relationships that way, all of which more
local meetups would help with.  Part of me hates to see the national
conference go away as I love getting a chance to meet and interact
with so many folks from all over, but I think you have a great point
on needing to put some greater focus back into regional events and the
collaborative aspects that built this community in the first place.

Matt Sherman

On Wed, Jun 8, 2016 at 2:50 PM, Eric Hellman <e...@hellman.net> wrote:
> Since we're brainstorming...
>
> In addition to regional meetings, how about having some smaller, national or 
> even international thematic Code4Lib meetings. For example, I see an aching 
> need for a "Code4Lib:Privacy".
>
>
> Eric Hellman
> President, Free Ebook Foundation
> Founder, Unglue.it https://unglue.it/
> https://go-to-hellman.blogspot.com/
> twitter: @gluejar
>
>> On Jun 8, 2016, at 6:40 AM, Eric Lease Morgan <emor...@nd.edu> wrote:
>>
>> On Jun 8, 2016, at 1:55 AM, Kyle Banerjee <kyle.baner...@gmail.com> wrote:
>>
>>> My recollection is that in the bad 'ol days, c4l was much more about 
>>> sharing ideas to solve practical problems… Nowadays, the conference (which 
>>> has become like other library conferences) has become an end in itself…
>>
>>
>> In the spirit of open source software and open access publishing, I suggest 
>> we earnestly try to practice DIY — do it yourself -- before other types of 
>> formalization be put into place.
>>
>> I was struck by Kyle’s statement, “the conference has become an end in 
>> itself”, and the more I think about it, the more I think this has become 
>> true. The problem to solve is not identifying a fiduciary for the annual 
>> conference. The problems to solve surround communication and sharing. A 
>> (large) annual conference is not the answer to these problems, but rather it 
>> is one possible answer.
>>
>> Unless somebody steps up to the plate, then I suggest we forego the annual 
>> meeting and try a more DIY approach for a limited period of time, say two or 
>> three years. More specifically, I suggest more time & earnest effort be 
>> spent on local or regional meetings. Hosting a local/regional meeting is not 
>> difficult and relatively inexpensive. Here’s how:
>>
>>  1) Identify one or two regional leaders - These are people who will 
>> initialize and coordinate events. They find & recruit other people to 
>> participate. Sure, they require “spare cycles", but they do not have to keep 
>> this responsibility past a single event.
>>
>>  2) Create/maintain a Web presence - This is a Web page and/or a mailing 
>> list. These tools will be communication conduits. Keep the Web page 
>> up-to-date on the status of the event. Refer to it in almost every email 
>> message. Use it to record what will happen as well as what did happen. The 
>> mailing list can start out as someone’s address book, but it can grow to a
>> mail alias on a Linux machine or even a Google Group. The Web page can live 
>> in the Code4Lib wiki.
>>
>>  3) Communicate - Kind of like voting in Chicago, “Talk early. Talk often.” 
>> This is essential, and can hardly be done too much. People delete email. 
>> People don’t plan ahead. People think they are not available, then at the 
>> last minute they are. The reverse happens too. Send communications about 
>> your event often, very often. Use email to build a local/regional community. 
>> Share with them your intention as early as Step #1. Keep people informed.
>>
>>  4) Identify a venue — Find a place to have the event. Colleges, 
>> universities, and municipal libraries are good choices. Ideally they should 
>> be associated with the output of Step #1. The meeting space has to 
>> accommodate fifty people (more or less), but bigger is not necessarily 
>> better. The space can be an auditorium, a meeting room, many meeting rooms, 
>> or any combination. The space requires excellent network connectivity. A 
>> meeting space sans strong wi-fi is detrimental.
>>
>>  5) Identify a time - The meeting itself needs to be at least one afternoon 
>> long. A day is good. More than two full days becomes a

Re: [CODE4LIB] Update Regarding C4L17 in Chattanooga

2016-06-07 Thread Matt Sherman
Just listening in, part of the discussion on Slack and IRC made it
sound like the financing was the bigger issue.

On Tue, Jun 7, 2016 at 1:14 PM, Matt Connolly  wrote:
>
> On Jun 7, 2016, at 11:26 AM, Brian Rogers 
> > wrote:
>
> We’ve determined that given this community’s commitment to providing a safe 
> and accommodating environment for all attendees, it is morally and fiscally 
> irresponsible to continue the effort of hosting the annual conference in 
> Chattanooga. This decision was not an easy one, and there were hours of 
> discussion as to the pros and cons of proceeding, informed by your responses 
> to the survey, as well as our individual opinions.
>
> The survey results clearly show that the vast majority of respondents were 
> not interested in boycotting Code4Lib Chattanooga. What number would have 
> inclined you to proceed, if a 75% affirmative vote wasn’t positive enough?
>
> — Matt
>
>
> -
> Matt Connolly
> Applications developer, CUL-IT
> 218 Olin Library
> Cornell University
> (607) 255-0653


Re: [CODE4LIB] Anything Interesting Going on in Archival Metadata?

2016-05-24 Thread Matt Sherman
Thanks for the info. Good to see that there are cool things going on in
archival metadata as well.
On May 24, 2016 2:04 PM, "Charles Blair"  wrote:

> I've been applying the Europeana Data Model with some success to
> digital archives. Some work has already been done in this area:
>
> Casarosa, Vittore; Meghini, Carlo; Gardasevic,
> Stanislava. (2013). “Improving Online Access to Archival
> Data”. Digital Libraries & Archives, pp. 153-162.
>
> Hennicke, Steffen; Olensky Marlies; de Boer, Victor; Isaac Antoine;
> Wielemaker, Jan. (2011). “Conversion of EAD into EDM Linked Data”. In:
> Proceedings of the 1st International Workshop on Semantic Digital
> Archives.
>
> See also:
>
> Gardasevic, Stanislava. (2011). “Opening Archives to the General
> Public, a data modelling approach”. Master thesis. International
> Master in Digital Library Learning.
>
> --
> Charles Blair, Director, Digital Library Development Center, University of
> Chicago Library
> 1 773 702 8459 | c...@uchicago.edu | http://www.lib.uchicago.edu/~chas/
>


[CODE4LIB] Anything Interesting Going on in Archival Metadata?

2016-05-24 Thread Matt Sherman
Hi all,

I was recently talking with some folks about some archives related
things and realized that while I've heard a lot recently about
different projects, advancements, and issues within library specific
metadata, and its associated concerns, I have not heard as much
recently about metadata in the archives realm.  Is there much going on
there?  Is linked data even useful in a setting with extremely unique
materials?  Is this a stupid question?  I don't know, but I am curious
to hear if there are any interesting things people are doing in
archival metadata or any challenges folks are working to overcome.

Matt Sherman


Re: [CODE4LIB] Good Database Software for a Digital Project?

2016-04-16 Thread Matt Sherman
Thanks for all the advice folks, this gives me a lot to look into.  You all
have certainly made me table MySQL, so now to look into PostgreSQL, Solr,
XTF, and some of these other technologies to see what would be the best
fit.  It is always so helpful pinging this group as you all have so many
helpful suggestions.

On Sat, Apr 16, 2016 at 6:09 AM, Jean-Claude Dauphin <jc.daup...@gmail.com>
wrote:

> Hi Matt,
>
> You may wish to give a try to J-ISIS
>
> https://kenai.com/projects/j-isis/downloads
>
> With J-ISIS, you can create a searchable database with a couple of clicks.
> It uses Berkeley Database as persistence manager and Lucene for indexing
> and searching.
> The user can concentrate on the domain and on what they want to achieve. No
> need to be an expert in relational databases and SQL. Furthermore, you get
> suggestions of indexed terms when making a particular query.
>
> Web-JISIS is a web application prototype that allows you to browse and search
> J-ISIS databases.
>
> I can help if you need.
>
> Best wishes,
>
> Jean-Claude
>
>
> On Sat, Apr 16, 2016 at 2:07 AM, Matt Sherman <matt.r.sher...@gmail.com>
> wrote:
>
> > Well, we've got one volume done, with about 1,250 bibliographies, but
> there
> > are 3 other volumes to convert. So at the end of the day probably about
> > 5,000 entries.  Though the hope is to make it interactive via the web and
> > to let scholars in the field continue to add to the database
> > once it is online.
> >
> > On Fri, Apr 15, 2016 at 7:38 PM, Kyle Banerjee <kyle.baner...@gmail.com>
> > wrote:
> >
> > > On Fri, Apr 15, 2016 at 11:53 AM, Roy Tennant <roytenn...@gmail.com>
> > > wrote:
> > >
> > > > In my experience, for a number of use cases, including possibly this
> > one,
> > > > a database is overkill. Often, flat files in a directory system
> indexed
> > > by
> > > > something like Solr is plenty and you avoid the inevitable headaches
> of
> > > > being a database administrator. Backup, for example, is a snap and
> > easily
> > > > automated.
> > > >
> > >
> > > I'm with Roy -- no need to use a chain saw to cut butter.
> > >
> > > Out of curiosity, since the use case is an annotated bibliography, how
> > much
> > > stuff do you have? If you have only a few thousand entries in delimited
> > > text, flat files could be easier and more effective than other options.
> > >
> > > kyle
> > >
> >
>
>
>
> --
> Jean-Claude Dauphin
>
> jc.daup...@gmail.com
>
>
> http://kenai.com/projects/j-isis/
> http://www.unesco.org/isis/
> http://www.unesco.org/idams/
> http://www.greenstone.org
>


Re: [CODE4LIB] Good Database Software for a Digital Project?

2016-04-15 Thread Matt Sherman
Well, we've got one volume done, with about 1,250 bibliographies, but there
are 3 other volumes to convert. So at the end of the day probably about
5,000 entries.  Though the hope is to make it interactive via the web and
to let scholars in the field continue to add to the database
once it is online.

On Fri, Apr 15, 2016 at 7:38 PM, Kyle Banerjee 
wrote:

> On Fri, Apr 15, 2016 at 11:53 AM, Roy Tennant 
> wrote:
>
> > In my experience, for a number of use cases, including possibly this one,
> > a database is overkill. Often, flat files in a directory system indexed
> by
> > something like Solr is plenty and you avoid the inevitable headaches of
> > being a database administrator. Backup, for example, is a snap and easily
> > automated.
> >
>
> I'm with Roy -- no need to use a chain saw to cut butter.
>
> Out of curiosity, since the use case is an annotated bibliography, how much
> stuff do you have? If you have only a few thousand entries in delimited
> text, flat files could be easier and more effective than other options.
>
> kyle
>
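
To make the flat-file suggestion above concrete, here is a minimal sketch in Python of searching a few thousand delimited entries with a small in-memory word index. It is not Solr and not the project's actual setup: the pipe delimiter, the column names, and the sample rows are all assumptions made up for illustration.

```python
import csv
import io
import re
from collections import defaultdict

# Hypothetical sample standing in for the delimited bibliography file;
# the real delimiter and column names are not known.
SAMPLE = """id|author|title|annotation
1|Smith, A.|Early Printing|Surveys presses in the region.
2|Jones, B.|Paper Trade|Covers paper mills and printing supply.
"""


def load_entries(text, delimiter="|"):
    """Parse delimited text into a list of dicts, one per entry."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def build_index(entries):
    """Map each lowercase word to the set of entry ids that contain it."""
    index = defaultdict(set)
    for entry in entries:
        for value in entry.values():
            for word in re.findall(r"[a-z0-9]+", value.lower()):
                index[word].add(entry["id"])
    return index


def search(index, query):
    """Return the ids of entries that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


entries = load_entries(SAMPLE)
index = build_index(entries)
print(sorted(search(index, "printing")))
```

At roughly 5,000 entries an index like this fits comfortably in memory, which is the point of the "no chain saw for butter" comment; a real search server earns its keep once you want stemming, ranking, or faceting.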


Re: [CODE4LIB] Good Database Software for a Digital Project?

2016-04-15 Thread Matt Sherman
It is OCRed text that we've forced into a delimited text file format.  So
there are a lot of ways we can spin it.  I am just not as familiar with the
storage/query systems we could put it in.

On Fri, Apr 15, 2016 at 2:44 PM, Gregory Murray <gpmurra...@gmail.com>
wrote:

> Matt,
>
> If the annotated bibliography is already in XML form, or if the data is
> suited to a hierarchical structure, you may want to consider using a
> native XML database (the most common open-source ones are eXist and BaseX)
> and querying it with XQuery.
>
> Greg
>
>
>
> On 4/15/16, 2:18 PM, "Code for Libraries on behalf of Matt Sherman"
> <CODE4LIB@LISTSERV.ND.EDU on behalf of matt.r.sher...@gmail.com> wrote:
>
> >Hi all,
> >
> >I am looking to pick the group brain as to what might be the most useful
> >database software for a digital project I am collaborating on.  We are
> >working on converting an annotated bibliography to a searchable database.
> >While I have the data in a few structured formats, we need to figure out
> >now what to actually put it in so that it can be queried.  My default line
> >of thinking is to try MySQL since it is free and used ubiquitously
> >online, but I wanted to see if there were any other database or software
> >systems that we should also consider before investing a lot of time in one
> >approach.  Any advice and suggestions would be appreciated.
> >
> >Matt Sherman
>
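
For a feel of what querying hierarchical data looks like, here is a rough Python sketch using the standard library's ElementTree and its limited XPath support. This is a stand-in for the idea only; it is not XQuery and does not touch eXist or BaseX. The element names, attributes, and sample records are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure; the project's real element names are not known.
SAMPLE_XML = """
<bibliography>
  <section name="Printing">
    <entry id="1"><author>Smith, A.</author><title>Early Printing</title></entry>
    <entry id="2"><author>Jones, B.</author><title>Paper Trade</title></entry>
  </section>
  <section name="Binding">
    <entry id="3"><author>Smith, A.</author><title>Cloth Bindings</title></entry>
  </section>
</bibliography>
""".strip()


def titles_by_author(root, author):
    """Return titles of every entry, at any depth, with the given author."""
    return [
        entry.findtext("title")
        for entry in root.iter("entry")
        if entry.findtext("author") == author
    ]


def titles_in_section(root, section_name):
    """Return titles of the entries directly under one named section."""
    path = f"./section[@name='{section_name}']/entry"
    return [entry.findtext("title") for entry in root.findall(path)]


root = ET.fromstring(SAMPLE_XML)
print(titles_by_author(root, "Smith, A."))
print(titles_in_section(root, "Binding"))
```

The second function is where the hierarchy pays off: the section an entry belongs to is part of the query path rather than an extra column to join on.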


Re: [CODE4LIB] Good Database Software for a Digital Project?

2016-04-15 Thread Matt Sherman
Well, this is a side project with just 2 of us working on it, and I have
the tech skills, so it is more a question of what I need to learn to make it work.

On Fri, Apr 15, 2016 at 2:22 PM, Ethan Gruber <ewg4x...@gmail.com> wrote:

> There are countless ways to approach the problem, but I suggest beginning
> with tools that are within the area of expertise of your staff. Mapping
> disparate structured formats into a single Solr instance for fast search
> and retrieval is one possibility.
>
> On Fri, Apr 15, 2016 at 2:18 PM, Matt Sherman <matt.r.sher...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I am looking to pick the group brain as to what might be the most useful
> > database software for a digital project I am collaborating on.  We are
> > working on converting an annotated bibliography to a searchable database.
> > While I have the data in a few structured formats, we need to figure out
> > now what to actually put it in so that it can be queried.  My default
> line
> > of thinking is to try MySQL since it is free and used ubiquitously
> > online, but I wanted to see if there were any other database or software
> > systems that we should also consider before investing a lot of time in
> one
> > approach.  Any advice and suggestions would be appreciated.
> >
> > Matt Sherman
> >
>


[CODE4LIB] Good Database Software for a Digital Project?

2016-04-15 Thread Matt Sherman
Hi all,

I am looking to pick the group brain as to what might be the most useful
database software for a digital project I am collaborating on.  We are
working on converting an annotated bibliography to a searchable database.
While I have the data in a few structured formats, we need to figure out
now what to actually put it in so that it can be queried.  My default line
of thinking is to try MySQL since it is free and used ubiquitously
online, but I wanted to see if there were any other database or software
systems that we should also consider before investing a lot of time in one
approach.  Any advice and suggestions would be appreciated.

Matt Sherman


Re: [CODE4LIB] code4lib mailing list

2016-03-24 Thread Matt Sherman
Sounds like a reasonable plan, so we might as well give it a shot.  Thanks
for all your hard work on this.

On Thu, Mar 24, 2016 at 4:28 PM, Cary Gordon  wrote:

> You can get enough server for this from AWS for $5-10/mo.
>
> Cary
>
> > On Mar 24, 2016, at 1:13 PM, Thomas Krichel  wrote:
> >
> >  Paul Hoffman writes
> >
> >> If you're interested, Eric, I have some experience with Mailman (though
> >> not with Listserv) and would be happy if I can -- I have some scripts to
> >> do bulk operations (add or remove subscribers, etc.) and could also help
> >> to migrate the list archive.
> >
> >  I find that this is the most important contribution I have seen here
> >  in this thread.
> >
> >  I have run Mailman over ten years for NEP
> >
> > http://nep.repec.org
> >
> >  I am also running it for NYLUG
> >
> > http://mail.nylug.org/mailman/listinfo
> >
> >  It's not just a case of running a box that has Mailman on it.  It's
> >  also important to have an infrastructure that sends bulk email and
> >  that does not end up in spam filters. And it's a matter of
> >  spam filtering on the list email sending box. The NEP server has a
> >  sender score
> >
> > https://www.senderscore.org/
> >
> >  score of 99/100 last time I looked but you don't get there
> instantaneously.
> >
> >  You also need a hoster that is email friendly.
> >
> >  So the list of tasks as I see it is
> >
> > 1. Find a sponsor for a dedicated root server, have them pay for the
> >   server.  You can get a server for about $50 a month.
> >
> > 2. Decide on a domain and set up access for server admin
> >   to domain records, including SPF and DKIM.
> >
> > 3. Set up the server with Linux.
> >
> > 4. Set up email software (exim or postfix or ...) and mailman or sympa, as
> >   well as, say, SpamAssassin.
> >
> > 5. Migrate members and email archives.
> >
> >  For somebody who knows what (s)he is doing 2-4 is not a big deal
> >  but it needs a few hours of work and a commitment to some maintenance.
> >  5 is the job that dwarfs everything else. But if Paul is volunteering
> >  (or could be sponsored) to lead that forward then you have a realistic
> >  case to run it on a community and open-source base.
> >
> > --
> >
> >  Cheers,
> >
> >  Thomas Krichel  http://openlib.org/home/krichel
> >  skype:thomaskrichel
>
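
Since step 2 in the list above mentions SPF, here is a small Python sketch of what an SPF TXT record contains, as a way to sanity-check one before publishing it in DNS. It is a simplified reading of the record syntax (modifiers such as redirect= are skipped, macros are not handled), and the sample record with its 192.0.2.10 address is a made-up placeholder, not anything proposed for code4lib.org.

```python
# Meaning of the optional one-character qualifier in front of a mechanism.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record):
    """Split an SPF TXT record into (result, mechanism, value) tuples."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        if "=" in term:
            # Simplification: treat anything with "=" as a modifier and skip it.
            continue
        qualifier = "+"  # a mechanism with no qualifier defaults to pass
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        parsed.append((QUALIFIERS[qualifier], name.lower(), value))
    return parsed


# Placeholder record: one sending IP, the domain's MX hosts, reject the rest.
for result, mechanism, value in parse_spf("v=spf1 ip4:192.0.2.10 mx -all"):
    print(result, mechanism, value)
```

A record that ends in "-all" tells receivers to reject mail from any host not listed, so it is worth checking that the list server's address is in there before switching it on.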


Re: [CODE4LIB] code4lib mailing list

2016-03-24 Thread Matt Sherman
I have no technical answers to the questions you pose, but I second Option
#2.

On Thu, Mar 24, 2016 at 5:29 AM, Eric Lease Morgan  wrote:

> Alas, the Code4Lib mailing list software will most likely need to be
> migrated before the end of summer, and I’m proposing a number of possible
> options for the list’s continued existence.
>
> I have been managing the Code4Lib mailing list since its inception about
> twelve years ago. This work has been both a privilege and an honor. The
> list itself runs on top of the venerable LISTSERV application and is hosted
> by the University of Notre Dame. The list includes about 3,500 subscribers,
> and traffic very very rarely gets over fifty messages a day. But alas,
> University support for LISTSERV is going away, and I believe the University
> wants to migrate the whole kit and caboodle to Google Groups.
>
> Personally, I don’t like the idea of Code4Lib moving to Google Groups.
> Google knows enough about me (us), and I don’t feel the need for them to
> know more. Sure, moving to Google Groups includes a large convenience
> factor, but it also means we have less control over our own computing
> environment, let alone our data.
>
> So, what do we (I) do? I see three options:
>
>   0. Let the mailing list die — Not really an option, in my opinion
>   1. Use Google Groups - Feasible, (probably) reliable, but with less
> control
>   2. Host it ourselves - More difficult, more responsibility, all but
> absolute control
>
> Again, personally, I like Option #2, and I would probably be willing to
> host the list on one of my computers, (and after a bit of DNS trickery)
> complete with a code4lib.org domain.
>
> What do y’all think? If we go with Option #2, then where might we host the
> list, who might do the work, and what software might we use?
>
> —
> Eric Lease Morgan
> Artist- And Librarian-At-Large
>


Re: [CODE4LIB] Institutional repositories

2016-03-21 Thread Matt Sherman
Hi Katie,

We make use of DSpace at our institution.  It works pretty well, though you
do need to have some amount of IT support since it is open source. That
said, a lot of what you go with depends on what is going in.  If it is
largely text materials then DSpace or Digital Commons are both good
options.  If you are going to be putting in non-text materials like data
sets then seriously consider Fedora Hydra.  Actually, you might want to
look into that one anyway since a lot of places are moving to Fedora.

Matt Sherman

On Mon, Mar 21, 2016 at 9:41 AM, Knight, Kathryn E. <knigh...@ornl.gov>
wrote:

> Hello all,
>
> My institution is working on a massive overhaul of our current
> institutional repository. At this point we're still deciding what to choose
> (DSpace, Invenio, etc.). Since I don't have much experience with IRs and so
> far all I can do in meetings is wave my arms and crow about metadata a
> bunch, I thought I'd appeal to the collective Code4Lib brain for some
> repository input. If you have an IR at your institution, what do you like
> about it? Hate? What about the end users? What is your submission process
> like? Anything you wish it could do that it doesn't? Etc.
>
> Please feel free to contact me off list with your thoughts, if you care to
> share-I'll keep all information confidential.
>
> Thanks so much,
>
> Katie
>
> Kathryn Knight
> Metadata and Cataloging Librarian
> Oak Ridge National Laboratory Research Library
>


Re: [CODE4LIB] Anyone Doing Interesting Things With Digital Collection Systems?

2016-02-29 Thread Matt Sherman
Thanks for the plethora of links and responses.  There are some great
things here to look through.

On Mon, Feb 29, 2016 at 11:52 AM, Peter Murray  wrote:

> Nice!  I particularly like the indication of the content type in the lower
> right corner of the thumbnail...
>
>
> Peter
>
> > On Feb 29, 2016, at 11:12 AM, Erica FINDLEY  wrote:
> >
> > We just designed our own responsive site at Multnomah County Library for
> > digital collections that is also OAI-PMH compatible. We call it The
> > Gallery. https://gallery.multcolib.org/
> >
> > Erica
> >
> >
> > *Erica Findley*
> > Cataloging/Metadata Librarian
> > Multnomah County Library
> > Phone: 503.988.5466
> > eri...@multcolib.org
> > multcolib.org 
> >
> > On Mon, Feb 29, 2016 at 4:29 AM, Scancella, John  wrote:
> >
> >> Hi Matt,
> >>
> >> I work on the digital repository for the Library of Congress. We have a
> >> lot of our tools on our public github
> >> https://github.com/LibraryOfCongress
> >>
> >> Of particular interest would be the bagit-python, and bagit-java. Note
> >> that for bagit-java we are in the middle of a rewrite so if you plan on
> >> using it for more than the near term you should check out the
> >> https://github.com/LibraryOfCongress/bagit-java/tree/rewrite branch or
> >> BETA release
> >> http://search.maven.org/#artifactdetails|gov.loc|bagit|5.0.0-BETA|jar
> >>
> >> John
> >> Please note: all opinions expressed in this email are my own and do not
> >> reflect those of The Library Of Congress
> >>
> >> -Original Message-
> >> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> >> Erin Tripp
> >> Sent: Monday, February 29, 2016 7:19 AM
> >> To: CODE4LIB@LISTSERV.ND.EDU
> >> Subject: Re: [CODE4LIB] Anyone Doing Interesting Things With Digital
> >> Collection Systems?
> >>
> >> Hi Matt,
> >>
> >> The Islandora Community (http://islandora.ca/about) is releasing some
> >> lovely open source digital repositories. Islandora is interoperable and
> >> extensible through the Tuque API, the Islandora OAI module, and many
> other
> >> tools that are included in the software stack.
> >>
> >> Here are a few repositories to explore:
> >> http://dcmny.org/
> >> http://dlib.bc.edu/
> >> http://repository.lib.cuhk.edu.hk/
> >> http://arcabc.ca/
> >>
> >> We have monthly webinars on Islandora if you'd like to join and learn
> more.
> >>
> >> ~ Erin
> >>
> >> Erin Tripp, BJH MLIS
> >> Business Development Manager
> >> discoverygarden inc.
> >> e...@discoverygarden.ca
>
>
> --
> Peter Murray
> Dev/Ops Lead and Project Manager, Cherry Hill Company
> Blogger, Disruptive Library Technology Jester - http://dltj.org/
>
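
For anyone who has not met BagIt, mentioned in John's message above, the core idea is a payload directory plus a manifest of checksums. Here is a rough Python sketch of building such a manifest with only the standard library. It illustrates the concept and is not the bagit-python or bagit-java API; the file name and contents are made up.

```python
import hashlib
import tempfile
from pathlib import Path


def payload_manifest(bag_dir):
    """Return '<sha256>  data/<path>' lines for every file under data/."""
    bag = Path(bag_dir)
    lines = []
    for path in sorted((bag / "data").rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # Paths are recorded relative to the bag root, with forward slashes.
            lines.append(f"{digest}  {path.relative_to(bag).as_posix()}")
    return lines


# Build a throwaway bag-like directory with one hypothetical payload file.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "data"
    data.mkdir()
    (data / "hello.txt").write_bytes(b"hello\n")
    manifest = payload_manifest(tmp)

print("\n".join(manifest))
```

Recomputing the digests later and comparing them to the stored manifest is how you find out whether anything in the payload changed in transit or in storage.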


Re: [CODE4LIB] Anyone Doing Interesting Things With Digital Collection Systems?

2016-02-27 Thread Matt Sherman
I'm good with shameless plugs, I was hoping for some to see what awesome
stuff people are working on. This does look pretty cool. Just skimming it
on the train home I really appreciate the responsiveness. I could see where
you could crosswalk the bibliographic metadata without much trouble. The
content metadata would be harder, though find some folks who want to work with
TEI and you might have some fun with it. Thanks for sharing.
On Feb 27, 2016 5:14 PM, "Gregory Murray" <gpmurra...@gmail.com> wrote:

> Matt,
>
> Please have a look at the Theological Commons at Princeton Seminary:
>
> http://commons.ptsem.edu/
>
> It's responsive. Unfortunately we don't have OAI-PMH set up (someday).
> Currently the only "API" is that if you take a URL like
> http://commons.ptsem.edu/id/... and replace "id" with "xml" you get the
> underlying XML document (which is a home-grown schema, not a standard
> library one, embarrassingly).
>
> (End of shameless plug.)
>
> Thanks,
> Greg
>
> Gregory Murray
> Director of Academic Technology and Digital Scholarship Services
> Princeton Theological Seminary Library
> gregory.mur...@ptsem.edu
>
>
>
> On 2/27/16, 4:26 PM, "Code for Libraries on behalf of Matt Sherman"
> <CODE4LIB@LISTSERV.ND.EDU on behalf of matt.r.sher...@gmail.com> wrote:
>
> >Hi all,
> >
> >I am asking about interesting digital collection tech due to some personal
> >research I am doing.  I have looked at a bunch of digital collection sites
> >lately and outside of NYPL <http://digitalcollections.nypl.org/>, I have
> >mostly seen bland, non-responsive but functional CONTENTdm sites or old
> >late 90s early 2000s static HTML exhibit sites.  Given the kind of web
> >tools and UX methods we have now I am curious if people can point me to,
> >or
> >tell me about, more interesting user friendly designs/systems?  I see talk
> >of responsive design and data interoperability via OAI-PMH and APIs, but I
> >must be looking in the wrong places as I am seeing very little evidence of
> >it being put into action.  If anyone can point me to more interesting
> >pastures I would appreciate it.
> >
> >Matt Sherman
>


[CODE4LIB] Anyone Doing Interesting Things With Digital Collection Systems?

2016-02-27 Thread Matt Sherman
Hi all,

I am asking about interesting digital collection tech due to some personal
research I am doing.  I have looked at a bunch of digital collection sites
lately and outside of NYPL <http://digitalcollections.nypl.org/>, I have
mostly seen bland, non-responsive but functional CONTENTdm sites or old
late 90s early 2000s static HTML exhibit sites.  Given the kind of web
tools and UX methods we have now I am curious if people can point me to, or
tell me about, more interesting user friendly designs/systems?  I see talk
of responsive design and data interoperability via OAI-PMH and APIs, but I
must be looking in the wrong places as I am seeing very little evidence of
it being put into action.  If anyone can point me to more interesting
pastures I would appreciate it.

Matt Sherman


Re: [CODE4LIB] Code4lib 2016 Registration is Closed

2015-12-11 Thread Matt Sherman
We have round 2 to try and break it.

On Fri, Dec 11, 2015 at 11:05 AM, Becky Yoose  wrote:
> I, on the other hand, was very disappointed and sad that this year's
> code4lib has broken with the tradition of breaking a
> registration/reservation system during the rush.
>
> On Fri, Dec 11, 2015 at 7:56 AM, Collier, Aaron 
> wrote:
>
>> +1
>>
>> That was super smooth. Excellent work!
>>
>> --
>> Aaron Collier
>> Digital Repository Services Manager
>> Systemwide Digital Library Services, California State University
>> 
>> From: Code for Libraries  on behalf of Fox,
>> Bobbi 
>> Sent: Friday, December 11, 2015 6:17 AM
>> To: CODE4LIB@LISTSERV.ND.EDU
>> Subject: Re: [CODE4LIB] Code4lib 2016 Registration is Closed
>>
>> Dear 2016 Code4lib Planning Committee
>>
>> Kudos for the smoothest Code4Lib registration process *I've* ever
>> experienced!
>> Cheers,
>> Bobbi
>>
>> > -Original Message-
>> > From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
>> > David Lacy
>> > Sent: Thursday, December 10, 2015 8:20 PM
>> > To: CODE4LIB@LISTSERV.ND.EDU
>> > Subject: [CODE4LIB] Code4lib 2016 Registration is Closed
>> >
>> > Dear Code4lib Community,
>> >
>> > The first wave of registration for Code4lib 2016 is officially closed.
>> >
>> > In the next several weeks we will finalize the conference program and
>> > presenters. Shortly after the Holidays, all sponsors, presenters, and
>> > workshop facilitators will be notified privately regarding their
>> > registration status. Once all required attendees have been reconciled,
>> > the second wave of registration will be announced.
>> >
>> > Phew...
>> >
>> > - The 2016 Code4lib Planning Committee
>> >
>> > David Lacy
>> > Team Leader, Falvey Library Technology Development
>> > Villanova University
>> > library.villanova.edu
>> > 610-519-7361
>>


Re: [CODE4LIB] code4lib chicago

2015-09-02 Thread Matt Sherman
You realize that this now needs to be an event at the conference in
Philly next year.

On Wed, Sep 2, 2015 at 3:23 PM, Cary Gordon  wrote:

> Catalog that!
>
> On Wednesday, September 2, 2015, Eric Lease Morgan  wrote:
>
> > On Sep 2, 2015, at 12:04 PM, Cary Gordon wrote:
> >
> > > http://cod4lib.com
> >
> >  ROTFL!!! —Eric Morgan
> >
>
>
> --
> Cary Gordon
> The Cherry Hill Company
> http://chillco.com
>


Re: [CODE4LIB] "coders for libraries"

2015-09-01 Thread Matt Sherman
As one who doesn't spend their day neck deep in compilers I will also vote
that this is a good idea.

On Tue, Sep 1, 2015 at 10:24 AM, David Mayo  wrote:

> ++ as well from me.
>
> On an unrelated note: as long as someone's in there changing stuff,
> changing the favicon away from the default Drupal one would be nice.
>
> On Tue, Sep 1, 2015 at 9:46 AM, Eric Lease Morgan  wrote:
>
> > On Sep 1, 2015, at 9:42 AM, Eric Hellman  wrote:
> >
> > > As someone who feels that Code4Lib should welcome people who don't
> > > particularly identify as "coders", I would welcome a return to the
> > > previous title attribute.
> >
> >   1++ because I believe it is more about libraries than it is about code.
> > —ELM
> >
>


Re: [CODE4LIB] "coders for libraries"

2015-09-01 Thread Matt Sherman
Time to prepare for the classification system wars of 2075.

On Tue, Sep 1, 2015 at 11:41 AM, Jason Bengtson 
wrote:

> "Code4Lib | total world domination by libraries, courtesy of code peeps"
>
> Now that one, I like!
>
> Best regards,
> *Jason Bengtson, MLIS, MA*
> Innovation Architect
>
>
> *Houston Academy of MedicineThe Texas Medical Center Library*
> 1133 John Freeman Blvd
> Houston, TX   77030
> http://library.tmc.edu/
> www.jasonbengtson.com
>
> On Tue, Sep 1, 2015 at 10:39 AM, Cary Gordon  wrote:
>
> > Code4Lib | total world domination by libraries, courtesy of code peeps
> >
> > > On Sep 1, 2015, at 8:18 AM, Eric Hellman  wrote:
> > >
> > > Code4Lib | You can't spell 'Library' without 'x4C'
> > >> On Sep 1, 2015, at 10:58 AM, Mark A. Matienzo <mark.matie...@gmail.com> wrote:
> > >> How about if we turn this topic around and focus on thinking about
> > >> coming up with a tagline that emphasizes our goals for inclusivity
> > >> rather than identity?
> > >>
> > >> Mark
> > >>
> > >> --
> > >> Mark A. Matienzo  | http://anarchivi.st/
> >
>


Re: [CODE4LIB] Code4Lib 2016: any date(s) yet?

2015-08-10 Thread Matt Sherman
The save the date e-mail said:

The 2016 conference will be held from March 7 through March 10 in the Old
City District of Philadelphia

On Mon, Aug 10, 2015 at 12:28 PM, Ranti Junus ranti.ju...@gmail.com wrote:

 Hi,

 Any dates set for Code4Lib 2016 yet?  I'm working on professional
 development stuff and all that jazz for this fiscal year so knowing the
 dates would be helpful for the planning.

 I'm sorry if the dates have been announced and I missed it.

 thanks,
 ranti.



Re: [CODE4LIB] Looking for Ideas on Line Breaks in OCR Text

2015-08-04 Thread Matt Sherman
Hm, after doing a little looking on someone's suggestion, it turns out I was
wrong: they are not line breaks, they are paragraph marks.

On Tue, Aug 4, 2015 at 9:21 AM, Scancella, John j...@loc.gov wrote:
 Matt,

 A word document does funny things to the text since it is actually html (try 
 opening a .doc in a plain text editor and you will see it is html). I would 
 try and get the plain ASCII text instead, and then install Cygwin which 
 contains sed and a bunch of other useful Unix/Linux commands.
 see http://stackoverflow.com/a/127567/2896744 for more info.
 
 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Matt Sherman 
 [matt.r.sher...@gmail.com]
 Sent: Tuesday, August 04, 2015 9:09 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Looking for Ideas on Line Breaks in OCR Text

 I am on Windows machines, so I don't have quite the easy access to
 that useful command.  Someone had earlier put the OCR in a doc file so
 I've been playing with that more than with the raw PDF OCR.

 On Tue, Aug 4, 2015 at 8:19 AM, Scancella, John j...@loc.gov wrote:
 Matt,

 There are probably a dozen ways to do this, but it would be really helpful 
 to know what operating system you are on. For example, if you are using
 Linux, you can run it through sed using
   sed ':a;N;$!ba;s/\n/ /g' OCR_FILE > STRIPPED_OCR_FILE
 see http://stackoverflow.com/a/800644/2896744 for more info
 
 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Matt 
 Sherman [matt.r.sher...@gmail.com]
 Sent: Monday, August 03, 2015 10:29 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Looking for Ideas on Line Breaks in OCR Text

 Hi Code4Lib folks,

 I was wondering if anyone had some experience cleaning up OCR text.
 Particularly I am trying to figure out how I can deal with the random
 line breaks that come from OCR.  I am trying to parse out a
 bibliography with regex.  I think I've figured out which queries I
 need to run to break it up so I can make it into a tab delimited text
 file but I noticed that the text does the classic thing of OCR
 inserting line breaks where they physically are on the page.  This
 will obviously be a bit of an issue since it would break the
 annotation into a bunch of lines rather than leaving it one block so I
 can manipulate it into a database.  So I am wondering if anyone who
 has worked with OCR text before has a suggested way to clean up those
 line breaks without doing 300 + pages by hand?  Any thoughts would be
 welcome.

 Matt Sherman


Re: [CODE4LIB] Looking for Ideas on Line Breaks in OCR Text

2015-08-04 Thread Matt Sherman
I am on Windows machines, so I don't have quite the easy access to
that useful command.  Someone had earlier put the OCR in a doc file so
I've been playing with that more than with the raw PDF OCR.

On Tue, Aug 4, 2015 at 8:19 AM, Scancella, John j...@loc.gov wrote:
 Matt,

 There are probably a dozen ways to do this, but it would be really helpful to 
 know what operating system you are on. For example, if you are using Linux,
 you can run it through sed using
   sed ':a;N;$!ba;s/\n/ /g' OCR_FILE > STRIPPED_OCR_FILE
 see http://stackoverflow.com/a/800644/2896744 for more info
 
 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Matt Sherman 
 [matt.r.sher...@gmail.com]
 Sent: Monday, August 03, 2015 10:29 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Looking for Ideas on Line Breaks in OCR Text

 Hi Code4Lib folks,

 I was wondering if anyone had some experience cleaning up OCR text.
 Particularly I am trying to figure out how I can deal with the random
 line breaks that come from OCR.  I am trying to parse out a
 bibliography with regex.  I think I've figured out which queries I
 need to run to break it up so I can make it into a tab delimited text
 file but I noticed that the text does the classic thing of OCR
 inserting line breaks where they physically are on the page.  This
 will obviously be a bit of an issue since it would break the
 annotation into a bunch of lines rather than leaving it one block so I
 can manipulate it into a database.  So I am wondering if anyone who
 has worked with OCR text before has a suggested way to clean up those
 line breaks without doing 300 + pages by hand?  Any thoughts would be
 welcome.

 Matt Sherman


Re: [CODE4LIB] Looking for Ideas on Line Breaks in OCR Text

2015-08-04 Thread Matt Sherman
That worked pretty well.  There is still some cleanup I have to do,
but replacing [A-z]^p[A-z] with [A-z] [A-z] did a lot of the work.
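
For anyone working on plain text rather than in Word, the same letter/break/letter substitution can be sketched in Python. This is only an illustration under assumptions: the function name is made up, the input is assumed to be plain text with "\n" or "\r\n" line endings, and lookarounds are used so the letters on either side are left untouched. In ASCII-based regex engines the range A-z also takes in the punctuation that sits between Z and a, so the sketch spells out A-Za-z instead.

```python
import re

# A line break with a letter directly before and after it is treated as a
# wrap introduced by OCR. Lookarounds keep the letters out of the match,
# so consecutive short lines are all handled in a single pass.
SOFT_BREAK = re.compile(r"(?<=[A-Za-z])\r?\n(?=[A-Za-z])")


def join_wrapped_lines(text: str) -> str:
    """Replace letter/newline/letter with letter/space/letter."""
    return SOFT_BREAK.sub(" ", text)


if __name__ == "__main__":
    sample = "An annotated entry that was\nwrapped by the scanner.\n\nNext entry."
    print(join_wrapped_lines(sample))
```

Breaks that follow punctuation (such as the end of a sentence) are left alone, which is why some manual cleanup would still remain.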

On Tue, Aug 4, 2015 at 12:17 PM, Kyle Banerjee kyle.baner...@gmail.com wrote:
 On Tue, Aug 4, 2015 at 6:09 AM, Matt Sherman matt.r.sher...@gmail.com
 wrote:

 I am on Windows machines, so I don't have quite the easy access to
 that useful command.  Someone had earlier put the OCR in a doc file so
 I've been playing with that more than with the raw PDF OCR.


 Versions of the unix utilities that run on Windows are available, but you
 can just use Microsoft Word to do what you want. Just use the find/replace
 function. In Word, you can search for a paragraph marker by looking for
 ^p (caret p)

 Because you undoubtedly have real paragraphs in the document which you
 don't want to remove, I'd recommend substituting double paragraph marks
 with something unique (e.g. @ZZZ@) before replacing all the other
 paragraph marks with a space. Then replace your unique marker with a
 paragraph.

 HTH,

 kyle
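
The placeholder approach described above can also be sketched outside Word. This is a rough Python version, assuming the OCR text is plain text in which real paragraphs are separated by a blank line; the function name is hypothetical and @ZZZ@ is simply the marker suggested in the message.

```python
import re

MARKER = "@ZZZ@"  # any string that never occurs in the text would do


def unwrap_paragraphs(text: str) -> str:
    """Join wrapped lines while keeping real paragraph breaks."""
    text = text.replace("\r\n", "\n")
    # 1. Protect real paragraph breaks (two or more newlines in a row).
    text = re.sub(r"\n{2,}", MARKER, text)
    # 2. Every remaining newline is a line wrap, so turn it into a space.
    text = text.replace("\n", " ")
    # 3. Put the protected paragraph breaks back.
    return text.replace(MARKER, "\n\n")


if __name__ == "__main__":
    print(unwrap_paragraphs("first line\nsecond line\n\nnew paragraph\ncontinues"))
```

The order of the three steps matters: if the single breaks were replaced first, the paragraph breaks would be lost along with them.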


[CODE4LIB] Looking for Ideas on Line Breaks in OCR Text

2015-08-03 Thread Matt Sherman
Hi Code4Lib folks,

I was wondering if anyone had some experience cleaning up OCR text.
Particularly I am trying to figure out how I can deal with the random
line breaks that come from OCR.  I am trying to parse out a
bibliography with regex.  I think I've figured out which queries I
need to run to break it up so I can make it into a tab delimited text
file but I noticed that the text does the classic thing of OCR
inserting line breaks where they physically are on the page.  This
will obviously be a bit of an issue since it would break the
annotation into a bunch of lines rather than leaving it one block so I
can manipulate it into a database.  So I am wondering if anyone who
has worked with OCR text before has a suggested way to clean up those
line breaks without doing 300 + pages by hand?  Any thoughts would be
welcome.

Matt Sherman


Re: [CODE4LIB] Regex Question

2015-07-09 Thread Matt Sherman
Thanks for the advice everyone.  This is all helpful stuff that I need to
spend some time with.

On Thu, Jul 9, 2015 at 3:38 AM, Kool,Wouter wouter.k...@oclc.org wrote:

 I also recommend this site: http://www.regular-expressions.info/
 If you do not want to work inside MSWord and want to use only regexes not
 xpath, you could of course do something like:

 <italics>.*[A-Z ,;:]+.*</italics>

 But, depending on your environment, you might be troubled by newlines in
 the data (regex engines tend to chunk your data, and they tend to use
 newlines by default).

 If you just want to list the titles you could grab the title proper like:

 <italics>.*([A-Z ,;:]+).*</italics>. The part between ( and ) is then
 usually accessible as $1 (in a language like Perl) or \1 (in a text editor).

 Wouter
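
As a rough illustration of the capture-group idea above, here is a small Python sketch. It assumes the italic runs have been exported with literal <italics> and </italics> tags around them; the function name is made up, and a non-greedy group plus DOTALL is used here so that a title spanning a line break is still captured whole.

```python
import re

# The parenthesised part is the capture group; Python exposes it through
# findall() or match.group(1), much like $1 in Perl or \1 in an editor.
# re.DOTALL lets "." match newlines, so wrapped titles are not cut short.
ITALIC_RUN = re.compile(r"<italics>(.*?)</italics>", re.DOTALL)


def italic_titles(text: str) -> list[str]:
    """Return the text of every italic run with whitespace normalised."""
    return [" ".join(run.split()) for run in ITALIC_RUN.findall(text)]


if __name__ == "__main__":
    sample = "See <italics>MOBY\nDICK</italics> and <italics>Emma</italics>."
    print(italic_titles(sample))
```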



 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Harper, Cynthia
 Sent: woensdag 8 juli 2015 19:51
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Regex Question

 I like this regex add-in for Excel:
 http://www.codedawn.com/index/new-excel-add-in-regex-find-replace
 Cindy Harper

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Kyle Banerjee
 Sent: Tuesday, July 07, 2015 6:22 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Regex Question

 For clarity, Word does regex, not just wildcards.  It's not quite as
 complete as what you'd get with some other environments such as OpenOffice
 Writer since matching is lazy rather than greedy which can be a big deal
 depending on what you're doing and there are a couple other catches --
 notably no support for | -- but it's reasonably powerful. There is no
 regexp capability in Excel unless you're willing to use VBA.

 kyle

 On Tue, Jul 7, 2015 at 1:10 PM, Gordon, Bonnie bgor...@rockarch.org
 wrote:

  OpenOffice Writer (or a similar program) may be useful for this. It
  would allow you to search by format while using a more controlled
  regular expression than MS Word's wildcards.
 
  -Original Message-
  From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
  Of Matt Sherman
  Sent: Tuesday, July 07, 2015 12:45 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] Regex Question
 
  Thanks everyone, this really helps.  I'll have to work out the
  italicized stuff, but this gets me much closer.
 
  On Tue, Jul 7, 2015 at 12:43 PM, Kyle Banerjee
  kyle.baner...@gmail.com
  wrote:
 
   Y'all are doing this the hard way. Word allows regex replacements as
   well as format based criteria.
  
   For this particular use case:
  
   1. Open the find/replace dialog (Ctrl+H)
   2. In the Find what box, put (*) -- make sure the option for Use
   Wildcards is selected, and for the format, specify italic
   3. For the Replace box, just put \1 and specify All caps
  
   And you're done
  
   kyle
  
   On Tue, Jul 7, 2015 at 9:32 AM, Thomas Krichel kric...@openlib.org
   wrote:
  
  Eric Phetteplace writes
   
 You can match a string of all caps letters like [A-Z]
   
  This works if you are limited to English. But in a multilingual
  setting, you need to watch out for other uppercases, such as
  крихель vs КРИХЕЛЬ. It then depends on the Unicode implementation
  of your regex application. In Perl, for example, you would use
  [[:upper:]].
   
   
--
   
  Cheers,
   
  Thomas Krichel  http://openlib.org/home/krichel
  skype:thomaskrichel
   
  
 



Re: [CODE4LIB] Regex Question

2015-07-07 Thread Matt Sherman
Thanks everyone, this really helps.  I'll have to work out the italicized
stuff, but this gets me much closer.

On Tue, Jul 7, 2015 at 12:43 PM, Kyle Banerjee kyle.baner...@gmail.com
wrote:

 Y'all are doing this the hard way. Word allows regex replacements as well
 as format based criteria.

 For this particular use case:

1. Open the find/replace dialog (Ctrl+H)
2. In the Find what box, put (*) -- make sure the option for Use
Wildcards is selected, and for the format, specify italic
3. For the Replace box, just put \1 and specify All caps

 And you're done

 kyle

 On Tue, Jul 7, 2015 at 9:32 AM, Thomas Krichel kric...@openlib.org
 wrote:

Eric Phetteplace writes
 
   You can match a string of all caps letters like [A-Z]
 
This works if you are limited to English. But in a multilingual
setting, you need to watch out for other uppercases, such as
крихель vs КРИХЕЛЬ. It then depends on the Unicode implementation
of your regex application. In Perl, for example, you would use
[[:upper:]].
 
 
  --
 
Cheers,
 
Thomas Krichel  http://openlib.org/home/krichel
skype:thomaskrichel
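
The multilingual point above can be illustrated with a short Python sketch. Python's built-in re module does not accept the POSIX class [[:upper:]], so this version leans on str.isupper(), which is Unicode-aware; the helper names are invented for the example.

```python
import re

ASCII_CAPS = re.compile(r"[A-Z]+")


def is_ascii_caps(word: str) -> bool:
    """True only for words made up entirely of the letters A to Z."""
    return ASCII_CAPS.fullmatch(word) is not None


def is_unicode_caps(word: str) -> bool:
    """True when the word has cased letters and all of them are uppercase."""
    return word.isupper()


def all_caps_words(text: str) -> list[str]:
    """Whitespace-separated words that are uppercase in any script."""
    return [word for word in text.split() if is_unicode_caps(word)]


if __name__ == "__main__":
    print(is_ascii_caps("КРИХЕЛЬ"), is_unicode_caps("КРИХЕЛЬ"))
    print(all_caps_words("крихель vs КРИХЕЛЬ and KRICHEL"))
```

Note that isupper() ignores characters without case, so a word with trailing punctuation such as "TITLE," still counts as uppercase.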
 



[CODE4LIB] Definitional Question

2015-07-02 Thread Matt Sherman
Hi all,

This is a bit more of a philosophical question which might only apply to a few
people but I am trying to work out some definitions for my own
edification.  So for those in the digital scholarship and digital
humanities subset I would be interested in getting some thoughts on these
three questions:

1) How would you define digital scholarship?

2) How would you define digital humanities?

3) Are they the same thing and why or why not?

Any thoughts are appreciated as I am trying to think through this myself.

Matt Sherman


Re: [CODE4LIB] Desiring Advice for Converting OCR Text into Metadata and/or a Database

2015-06-18 Thread Matt Sherman
That is a pretty good summation of it, yes.  I appreciate the suggestions.
This is a bit of a new realm for me, and while I know what I want it to do
and the structure I want to put it in, the conversion process has been
eluding me, so thanks for giving me some tools to look into.

On Thu, Jun 18, 2015 at 1:04 PM, Eric Lease Morgan emor...@nd.edu wrote:

 On Jun 18, 2015, at 12:02 PM, Matt Sherman matt.r.sher...@gmail.com
 wrote:

  I am working with a colleague on a side project which involves some scanned
  bibliographies and making them more web searchable/sortable/browse-able.
  I am quite familiar with the metadata and organization aspects we
  need, but I am at a bit of a loss on how to automate the process of putting
  the bibliography in a more structured format so that we can avoid going
  through hundreds of pages by hand.  I am pretty sure regular expressions
  are needed, but I have not had an instance where I need to automate
  extracting data from one file type (PDF OCR or text extracted to Word doc)
  and placing it into another (either a database or an XML file) with some
  enrichment.  I would appreciate any suggestions for approaches or tools to
  look into.  Thanks for any help/thoughts people can give.


 If I understand your question correctly, then you have two problems to
 address: 1) converting PDF, Word, etc. files into plain text, and 2)
 marking up the result (which is a bibliography) into structured data.
 Correct?

 If so, then if your PDF documents have already been OCRed, or if you have
 other files, then you can probably feed them to TIKA to quickly and easily
 extract the underlying plain text. [1] I wrote a brain-dead shell script to
 run TIKA in server mode and then convert Word (.docx) files. [2]

 When it comes to marking up the result into structured data, well, good
 luck. I think such an application is something Library Land has sought for a
 long time. “Can you say Holy Grail?”

 [1] Tika - https://tika.apache.org
 [2] brain-dead script -
 https://gist.github.com/ericleasemorgan/c4e34ffad96c0221f1ff

 —
 Eric
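
As a rough companion to the shell script mentioned above, here is a Python sketch of sending one file to a Tika server and reading back plain text. It is an illustration under assumptions, not the script from the gist: it assumes a Tika server is already running locally at http://localhost:9998 and that a PUT to its /tika endpoint with an "Accept: text/plain" header returns the extracted text. The function names are made up.

```python
import urllib.request

# Assumption: a Tika server started separately and listening on this address.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(document: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Prepare a PUT request that asks Tika for the plain text of a document."""
    return urllib.request.Request(
        url,
        data=document,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(path: str, url: str = TIKA_URL) -> str:
    """Send the file at `path` to the Tika server and return the extracted text."""
    with open(path, "rb") as handle:
        request = build_tika_request(handle.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling extract_text("bibliography.docx") (a hypothetical file name) would only work while the server is up; building the request is kept separate so it can be checked without one.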



[CODE4LIB] Desiring Advice for Converting OCR Text into Metadata and/or a Database

2015-06-18 Thread Matt Sherman
Hi Code4Libbers,

I am working with a colleague on a side project which involves some scanned
bibliographies and making them more web searchable/sortable/browse-able.
I am quite familiar with the metadata and organization aspects we
need, but I am at a bit of a loss on how to automate the process of putting
the bibliography in a more structured format so that we can avoid going
through hundreds of pages by hand.  I am pretty sure regular expressions
are needed, but I have not had an instance where I need to automate
extracting data from one file type (PDF OCR or text extracted to Word doc)
and placing it into another (either a database or an XML file) with some
enrichment.  I would appreciate any suggestions for approaches or tools to
look into.  Thanks for any help/thoughts people can give.

Matt Sherman


Re: [CODE4LIB] Desiring Advice for Converting OCR Text into Metadata and/or a Database

2015-06-18 Thread Matt Sherman
The hope is to take these bibliographies and put them into more of a web
searchable/sortable format so researchers can make use of them.  My
colleague was taking some inspiration from the Marlowe Bibliography (
https://marlowebibliography.org/), though we are hoping to possibly get a
bit more robust with the bibliography we are working on.  The important
first step is to be able to parse the existing OCRed bibliography scans we
have into a database, possibly a custom XML format but a database will
probably be easier to append and expand down the road.
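
To make the database idea a little more concrete, here is a minimal Python sketch using the sqlite3 module from the standard library. Everything specific in it is an assumption for illustration: the table layout, the column names, and the placeholder rows are invented, not taken from the actual bibliography.

```python
import sqlite3

# Hypothetical schema: one row per bibliography entry.
SCHEMA = """
CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    author TEXT,
    title TEXT,
    year INTEGER,
    annotation TEXT
)
"""


def load_entries(entries):
    """Create an in-memory database and insert (author, title, year, annotation) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO entry (author, title, year, annotation) VALUES (?, ?, ?, ?)",
        entries,
    )
    conn.commit()
    return conn


def search(conn, term):
    """Return (author, title, year) for entries mentioning `term`, sorted by year."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT author, title, year FROM entry "
        "WHERE title LIKE ? OR annotation LIKE ? ORDER BY year, author",
        (pattern, pattern),
    )
    return rows.fetchall()


if __name__ == "__main__":
    # Placeholder rows only; real rows would come from the parsed OCR text.
    conn = load_entries([
        ("Author B", "Second Sample Title", 1950, "Placeholder note on staging."),
        ("Author A", "First Sample Title", 1901, "Placeholder note on sources."),
    ])
    print(search(conn, "Sample"))
```

Appending later is just another INSERT, which is part of why a database may be easier to expand than a custom XML file.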

On Thu, Jun 18, 2015 at 1:11 PM, Kyle Banerjee kyle.baner...@gmail.com
wrote:

 How you want to preprocess and structure the data depends on what you hope
 to achieve. Can you say more about what you want the end product to look
 like?

 kyle

 On Thu, Jun 18, 2015 at 10:08 AM, Matt Sherman matt.r.sher...@gmail.com
 wrote:

  That is a pretty good summation of it yes.  I appreciate the suggestions,
  this is a bit of a new realm for me and while I know what I want it to do
  and the structure I want to put it in, the conversion process has been
  eluding me so thanks for giving me some tools to look into.
 
  On Thu, Jun 18, 2015 at 1:04 PM, Eric Lease Morgan emor...@nd.edu
 wrote:
 
   On Jun 18, 2015, at 12:02 PM, Matt Sherman matt.r.sher...@gmail.com
   wrote:
  
I am working with a colleague on a side project which involves some scanned
bibliographies and making them more web searchable/sortable/browse-able.
I am quite familiar with the metadata and organization aspects we
need, but I am at a bit of a loss on how to automate the process of putting
the bibliography in a more structured format so that we can avoid going
through hundreds of pages by hand.  I am pretty sure regular expressions
are needed, but I have not had an instance where I need to automate
extracting data from one file type (PDF OCR or text extracted to Word doc)
and placing it into another (either a database or an XML file) with some
enrichment.  I would appreciate any suggestions for approaches or tools to
look into.  Thanks for any help/thoughts people can give.
  
  
   If I understand your question correctly, then you have two problems to
   address: 1) converting PDF, Word, etc. files into plain text, and 2)
   marking up the result (which is a bibliography) into structured data.
   Correct?

   If so, then if your PDF documents have already been OCRed, or if you have
   other files, then you can probably feed them to TIKA to quickly and easily
   extract the underlying plain text. [1] I wrote a brain-dead shell script to
   run TIKA in server mode and then convert Word (.docx) files. [2]

   When it comes to marking up the result into structured data, well, good
   luck. I think such an application is something Library Land has sought for a
   long time. “Can you say Holy Grail?”
  
   [1] Tika - https://tika.apache.org
   [2] brain-dead script -
   https://gist.github.com/ericleasemorgan/c4e34ffad96c0221f1ff
  
   —
   Eric
  
 



Re: [CODE4LIB] Desiring Advice for Converting OCR Text into Metadata and/or a Database

2015-06-18 Thread Matt Sherman
Thanks, that is interesting since we can export from the PDFs, and while
the OCR text is a little messy it is in decent shape.  I'll certainly look
into that.

On Thu, Jun 18, 2015 at 3:13 PM, Gordon, Bonnie bgor...@rockarch.org
wrote:

 We're actually also working on getting a bibliography from a Word Doc to a
 more structured format. We're using regular expressions in LibreOffice
 Writer to mark up the citations, then insert tabs between the elements,
 and then copy into a spreadsheet (similar to what's described in
 http://programminghistorian.org/lessons/understanding-regular-expressions).
 However, our bibliography was originally a Word Doc, not OCRed text. This
 method is pretty reliant on consistent formatting, though, so messy OCR
 could complicate things. Another thing to note is that it's easiest when
 you know what format the citation is for (e.g., a book or article), since
 that impacts how the citation is structured.  I'd be happy to provide a
 sample citation in each step of the process.

 All the best,
 Bonnie
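
The mark-up-then-tab-separate workflow described above might look roughly like this in Python. The citation layout is an assumption made for the sake of the example ("Author. Title. Place: Publisher, Year."), the sample citation is invented, and the function name is hypothetical; real bibliographies would need one pattern per citation type.

```python
import re

# Assumed book layout: "Author. Title. Place: Publisher, Year."
BOOK = re.compile(
    r"^(?P<author>.+?)\.\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<place>[^:.]+):\s+"
    r"(?P<publisher>.+?),\s+"
    r"(?P<year>\d{4})\.$"
)


def citation_to_tsv(citation: str):
    """Return the citation as one tab-delimited row, or None if it does not fit."""
    match = BOOK.match(citation.strip())
    if match is None:
        return None  # leave non-matching citations for manual review
    return "\t".join(match.group("author", "title", "place", "publisher", "year"))


if __name__ == "__main__":
    # Invented citation, used only to show the shape of the output.
    print(citation_to_tsv("Doe, Jane. A Sample Title. London: Example Press, 1999."))
```

Returning None for anything that does not fit the pattern keeps messy OCR lines from being silently mangled, which matters given how much the method depends on consistent formatting.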



 On 6/18/15, 1:52 PM, Matt Sherman matt.r.sher...@gmail.com wrote:

 The hope is to take these bibliographies and put them into more of a web
 searchable/sortable format so researchers can make use of them.  My
 colleague was taking some inspiration from the Marlowe Bibliography (
 https://marlowebibliography.org/), though we are hoping to possibly get a
 bit more robust with the bibliography we are working on.  The important
 first step is to be able to parse the existing OCRed bibliography scans we
 have into a database, possibly a custom XML format but a database will
 probably be easier to append and expand down the road.
 
 On Thu, Jun 18, 2015 at 1:11 PM, Kyle Banerjee kyle.baner...@gmail.com
 wrote:
 
  How you want to preprocess and structure the data depends on what you
  hope to achieve. Can you say more about what you want the end product to
  look like?
 
  kyle
 
  On Thu, Jun 18, 2015 at 10:08 AM, Matt Sherman
 matt.r.sher...@gmail.com
  wrote:
 
   That is a pretty good summation of it yes.  I appreciate the suggestions,
   this is a bit of a new realm for me and while I know what I want it to do
   and the structure I want to put it in, the conversion process has been
   eluding me so thanks for giving me some tools to look into.
  
   On Thu, Jun 18, 2015 at 1:04 PM, Eric Lease Morgan emor...@nd.edu
  wrote:
  
On Jun 18, 2015, at 12:02 PM, Matt Sherman
 matt.r.sher...@gmail.com
wrote:
   
 I am working with a colleague on a side project which involves some scanned
 bibliographies and making them more web searchable/sortable/browse-able.
 I am quite familiar with the metadata and organization aspects we
 need, but I am at a bit of a loss on how to automate the process of putting
 the bibliography in a more structured format so that we can avoid going
 through hundreds of pages by hand.  I am pretty sure regular expressions
 are needed, but I have not had an instance where I need to automate
 extracting data from one file type (PDF OCR or text extracted to Word doc)
 and placing it into another (either a database or an XML file) with some
 enrichment.  I would appreciate any suggestions for approaches or tools to
 look into.  Thanks for any help/thoughts people can give.
   
   
If I understand your question correctly, then you have two problems to
address: 1) converting PDF, Word, etc. files into plain text, and 2)
marking up the result (which is a bibliography) into structured data.
Correct?

If so, then if your PDF documents have already been OCRed, or if you have
other files, then you can probably feed them to TIKA to quickly and easily
extract the underlying plain text. [1] I wrote a brain-dead shell script to
run TIKA in server mode and then convert Word (.docx) files. [2]

When it comes to marking up the result into structured data, well, good
luck. I think such an application is something Library Land has sought for a
long time. “Can you say Holy Grail?”
   
[1] Tika - https://tika.apache.org
[2] brain-dead script -
https://gist.github.com/ericleasemorgan/c4e34ffad96c0221f1ff
   
--
Eric
   
  
 



Re: [CODE4LIB] XSLT Advice

2015-06-04 Thread Matt Sherman
Thanks for the help everyone, I've got it working great now.  Below is
the XSLT I ended up with and a few examples of the returns I get.  All
of the input was extremely helpful in figuring out the best solution.

XSLT Change:

<!-- dc.publication fields to dc.identifier -->
<xsl:for-each
select="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']">
<dc:identifier><xsl:value-of
select="doc:element[@name='name']/doc:element/doc:field[@name='value']"/>
<xsl:if
test="string(doc:element[@name='volume']/doc:element/doc:field[@name='value'])">
<xsl:text> Vol. </xsl:text>
<xsl:value-of
select="doc:element[@name='volume']/doc:element/doc:field[@name='value']"/>
</xsl:if>
<xsl:if
test="string(doc:element[@name='issue']/doc:element/doc:field[@name='value'])">
<xsl:text> Issue </xsl:text>
<xsl:value-of
select="doc:element[@name='issue']/doc:element/doc:field[@name='value']"/>
</xsl:if>
</dc:identifier>
</xsl:for-each>


Output Examples:

<dc:identifier>Advances in Computer Science and Engineering Vol. 1 Issue 1</dc:identifier>
<dc:identifier>Journal of Intelligent and Robotic Systems Vol. 24</dc:identifier>
<dc:identifier>Vol. 2</dc:identifier>
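
For readers who think more easily in a procedural language, the conditional logic of the stylesheet can be paraphrased as a small Python function. This is only a sketch of the same idea, not part of the DSpace crosswalk: the function name is invented, and unlike the XSLT it joins the parts with single spaces, so there is no leading space when the name is missing.

```python
def build_identifier(name: str = "", volume: str = "", issue: str = "") -> str:
    """Combine publication fields, adding each label only when its value exists."""
    parts = []
    if name:
        parts.append(name)
    if volume:  # plays the role of the first xsl:if test
        parts.append(f"Vol. {volume}")
    if issue:  # plays the role of the second xsl:if test
        parts.append(f"Issue {issue}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_identifier("Advances in Computer Science and Engineering", "1", "1"))
    print(build_identifier("Journal of Intelligent and Robotic Systems", "24"))
    print(build_identifier(volume="2"))
```

The three calls correspond to the three output examples above; the key point in both versions is that the label and its value are emitted together or not at all.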

On Wed, Jun 3, 2015 at 3:15 PM, Matt Sherman matt.r.sher...@gmail.com wrote:
 Thanks for all the suggestions folks, other things at work have
 prevented me from working on this quite yet but there is a lot of
 helpful advice and suggestions here so thanks.  As a note on the
 xsl:for-each that is actually the default dc_oai.xsl file for DSpace
 and not my own work so feel free to ding the folks over there for that
 choice.  This gives me a better feel for how XSLT handles conditionals
 and some ideas on how to tackle it.  Though I would be curious to ask
 if there is a reason to default to xsl:if vs. xsl:choose?  I'm not
 sure that I necessarily need a switch to fix this problem but I want
 to hear the thought process so I know how to better think over these
 options in the future.

 On Tue, Jun 2, 2015 at 5:21 PM, Boheemen, Peter van
 peter.vanbohee...@wur.nl wrote:
 You should use a template that is only applied when the specified field is 
 there. These templates in xslt are applied automatically only if the field 
 is there:

 <xsl:template
 match="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='volume']/doc:element/doc:field[@name='value']">
   <xsl:text>Vol. </xsl:text>
   <xsl:apply-templates/>
 </xsl:template>

 If the field is defined, but empty you should do:

 <xsl:template
 match="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='volume']/doc:element/doc:field[@name='value']">
   <xsl:if test="not(.='')">
     <xsl:text>Vol. </xsl:text>
     <xsl:apply-templates/>
   </xsl:if>
 </xsl:template>

 Xslt is not a procedural language, you should hardly ever use xsl:for-each

 Peter


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Matt 
 Sherman
 Sent: dinsdag 2 juni 2015 21:35
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] XSLT Advice

 Cool.  I talked to Ron via phone so I am getting a better picture, but I am 
 still happy to take more insights.

 So the larger context.  I inherited a DSpace instance with three custom
 metadata fields which actually have some useful publication information,
 though they were improperly named by associating them with a dc prefix,
 but there were too many to fix quickly and they haven't broken DSpace yet so
 we continue.  So I added to the XSL to pull the data within the custom
 fields to display publication name Vol. publication volume Issue
 publication issue.  That worked really well until I realized that there
 was no conditional so even when the fields are empty I still get:
 <dc:identifier>Vol.
 Issue</dc:identifier>

 So here are the Custom Metadata fields:

 dc.publication.issue
 dc.publication.name
 dc.publication.volume


 Here is the customized XSLT, with dc.identifier added for context of what 
 the rest of the sheet looks like.

 <!-- dc.identifier -->
 <xsl:for-each
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='identifier']/doc:element/doc:field[@name='value']">
 <dc:identifier><xsl:value-of select="." /></dc:identifier>
 </xsl:for-each>

 <!-- dc.identifier.* -->
 <xsl:for-each
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='identifier']/doc:element/doc:element/doc:field[@name='value']">
 <dc:identifier><xsl:value-of select="." /></dc:identifier>
 </xsl:for-each>

 <!-- dc.publication fields to dc.identifier --> <dc:identifier><xsl:value-of
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='name']/doc:element/doc:field[@name='value']"/><xsl:text>
 Vol. </xsl:text><xsl:value-of
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='volume']/doc:element/doc:field[@name='value']"/><xsl:text>
 Issue </xsl:text><xsl:value

Re: [CODE4LIB] XSLT Advice

2015-06-03 Thread Matt Sherman
Thanks for all the suggestions, folks.  Other things at work have
kept me from working on this so far, but there is a lot of helpful
advice here.  As a note on the xsl:for-each: that comes from the
default dc_oai.xsl file for DSpace and is not my own work, so feel
free to ding the folks over there for that choice.  This gives me a
better feel for how XSLT handles conditionals and some ideas on how to
tackle the problem.  I would be curious to ask, though: is there a
reason to default to xsl:if vs. xsl:choose?  I'm not sure that I
necessarily need a switch to fix this problem, but I want to hear the
thought process so I know how to think through these options in the
future.
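
For comparison, here is a minimal sketch of the two constructs side by
side.  The input shape (an item element with volume and issue
children) is hypothetical and chosen only to keep the example short;
it is not the DSpace metadata format.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input: <item><volume>12</volume><issue/></item> -->
  <xsl:template match="/item">

    <!-- xsl:if is a single test with no "else" branch -->
    <xsl:if test="normalize-space(volume) != ''">
      <xsl:text>Vol. </xsl:text>
      <xsl:value-of select="volume"/>
    </xsl:if>

    <!-- xsl:choose takes the first xsl:when whose test is true;
         xsl:otherwise plays the role of "else" -->
    <xsl:choose>
      <xsl:when test="normalize-space(issue) != ''">
        <xsl:text> Issue </xsl:text>
        <xsl:value-of select="issue"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text> (no issue)</xsl:text>
      </xsl:otherwise>
    </xsl:choose>

  </xsl:template>
</xsl:stylesheet>
```

As a rule of thumb, xsl:if is enough when output is simply present or
absent, and xsl:choose is the only option when an alternative has to
be emitted, since XSLT 1.0 has no else for xsl:if.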

On Tue, Jun 2, 2015 at 5:21 PM, Boheemen, Peter van
peter.vanbohee...@wur.nl wrote:
 You should use a template that is only applied when the specified field is 
 there. Templates in XSLT are applied automatically, and only if the field is 
 there:

 <xsl:template 
  match="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='volume']/doc:element/doc:field[@name='value']">
   <xsl:text>Vol. </xsl:text>
   <xsl:apply-templates/>
 </xsl:template>

 If the field is defined but empty, you should do:

 <xsl:template 
  match="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='volume']/doc:element/doc:field[@name='value']">
   <xsl:if test="not(.='')">
     <xsl:text>Vol. </xsl:text>
     <xsl:apply-templates/>
   </xsl:if>
 </xsl:template>

 XSLT is not a procedural language; you should hardly ever need xsl:for-each.

 Peter


[CODE4LIB] XSLT Advice

2015-06-02 Thread Matt Sherman
Hi all,

I am making a few corrections to an oai_dc.xslt file for our DSpace
instance, which I slightly botched while modifying it to integrate
some custom metadata into a dc.identifier citation in the OAI-PMH
harvest.  I need proper conditionals so that it displays and harvests
the metadata correctly and does not run when there is no data in those
fields.  I have a pretty good idea of what I need to do, and if this
were something like JavaScript or Python I could probably muddle
through.  The trouble is that I don't know the conditional syntax for
XSLT well enough to know what I can do, and thus what I need to do.
Plus, the online resources for learning and referencing XSLT are a bit
shallow for what I need, hence asking the group.  So if anyone who
knows XSLT really well would be willing to talk with me for a bit and
help me work out the syntax, I would appreciate it.  Thanks.

Matt Sherman


Re: [CODE4LIB] XSLT Advice

2015-06-02 Thread Matt Sherman
Cool.  I talked to Ron via phone so I am getting a better picture, but
I am still happy to take more insights.

So, the larger context: I inherited a DSpace instance with three
custom metadata fields which actually hold some useful publication
information.  They were improperly named, being associated with a dc
prefix, but there were too many to fix quickly and they haven't broken
DSpace yet, so we continue.  I added to the XSL to pull the data from
the custom fields and display <publication name> Vol. <publication
volume> Issue <publication issue>.  That worked really well until I
realized that there was no conditional, so even when the fields are
empty I still get:

<dc:identifier>Vol. Issue</dc:identifier>

So here are the Custom Metadata fields:

dc.publication.issue
dc.publication.name
dc.publication.volume


Here is the customized XSLT, with dc.identifier added for context of
what the rest of the sheet looks like.

<!-- dc.identifier -->
<xsl:for-each
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='identifier']/doc:element/doc:field[@name='value']">
  <dc:identifier><xsl:value-of select="." /></dc:identifier>
</xsl:for-each>

<!-- dc.identifier.* -->
<xsl:for-each
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='identifier']/doc:element/doc:element/doc:field[@name='value']">
  <dc:identifier><xsl:value-of select="." /></dc:identifier>
</xsl:for-each>

<!-- dc.publication fields to dc.identifier -->
<dc:identifier><xsl:value-of
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='name']/doc:element/doc:field[@name='value']"/><xsl:text>
 Vol. </xsl:text><xsl:value-of
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='volume']/doc:element/doc:field[@name='value']"/><xsl:text>
 Issue </xsl:text><xsl:value-of
 select="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']/doc:element[@name='issue']/doc:element/doc:field[@name='value']"/></dc:identifier>


Ron suggested using xsl:choose and xsl:when, and that does seem to
make the most sense.  The other trickiness is that I have found that
some of these fields are filled when others are blank, such as there
being a volume but not an issue.  So I need to figure out how to test
multiple fields so that it displays differently depending on which
fields have data, or displays nothing at all when none of the fields
are filled, which is the case for items such as posters.

So any thoughts would help.  Thanks.
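
One way the multi-field test described above could look is sketched
below.  It is a standalone stylesheet under stated assumptions, not a
drop-in patch: the doc namespace URI is assumed to be the one the
stock DSpace oai_dc stylesheet declares, the variable names (pub,
name, volume, issue) are made up for illustration, and in the real
file the dc:identifier would sit inside the existing oai_dc:dc
wrapper rather than at the top level.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:doc="http://www.lyncode.com/xoai"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="doc">
  <!-- Assumption: the doc namespace URI above matches your oai_dc file -->

  <xsl:template match="/">
    <!-- Hypothetical helper variables: read each custom field once.
         normalize-space() turns a missing or blank field into '' -->
    <xsl:variable name="pub"
        select="doc:metadata/doc:element[@name='dc']/doc:element[@name='publication']"/>
    <xsl:variable name="name"
        select="normalize-space($pub/doc:element[@name='name']/doc:element/doc:field[@name='value'])"/>
    <xsl:variable name="volume"
        select="normalize-space($pub/doc:element[@name='volume']/doc:element/doc:field[@name='value'])"/>
    <xsl:variable name="issue"
        select="normalize-space($pub/doc:element[@name='issue']/doc:element/doc:field[@name='value'])"/>

    <!-- Emit dc:identifier only when at least one field has data,
         so items such as posters produce nothing at all -->
    <xsl:if test="$name != '' or $volume != '' or $issue != ''">
      <dc:identifier>
        <xsl:value-of select="$name"/>
        <xsl:if test="$volume != ''">
          <!-- separator only when something precedes the volume -->
          <xsl:if test="$name != ''"><xsl:text> </xsl:text></xsl:if>
          <xsl:text>Vol. </xsl:text>
          <xsl:value-of select="$volume"/>
        </xsl:if>
        <xsl:if test="$issue != ''">
          <!-- separator only when something precedes the issue -->
          <xsl:if test="$name != '' or $volume != ''"><xsl:text> </xsl:text></xsl:if>
          <xsl:text>Issue </xsl:text>
          <xsl:value-of select="$issue"/>
        </xsl:if>
      </dc:identifier>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Each label is guarded by its own xsl:if, so a volume without an issue
yields only the Vol. part.  If a field can repeat, normalize-space()
only sees the first value, so that case would need separate handling.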

On Tue, Jun 2, 2015 at 2:50 PM, Wick, Ryan ryan.w...@oregonstate.edu wrote:
 I agree with Stuart, post the example here.

 Or if you want more real-time chat there's always #code4lib IRC.

 For an XSLT resource, Dave Pawson's site is great: 
 http://www.dpawson.co.uk/xsl/sect2/sect21.html

 Ryan Wick

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
 Stuart A. Yeates
 Sent: Tuesday, June 02, 2015 11:46 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] XSLT Advice

 There are a number of experienced xslt'ers here. Post your example to the 
 group so we can all learn.

 Cheers
 Stuart



 --
 --
 ...let us be heard from red core to black sky


Re: [CODE4LIB] Mac OS 9 emulator

2015-04-22 Thread Matt Sherman
Why would you not just run an instance in VirtualBox?

On Wed, Apr 22, 2015 at 2:50 PM, Schmitz Fuhrig, Lynda 
schmitzfuhr...@si.edu wrote:

 Hello all,
 Can anyone recommend a Mac OS 9 emulator that can run off 10.6.x machine
 or later?

 Thanks,
 Lynda

 Lynda Schmitz Fuhrig
 Electronic Records Archivist
 Digital Services Division
 Smithsonian Institution Archives
 Capital Gallery Building
 600 Maryland Ave SW
 Suite 3000
 MRC 507
 Washington, DC 20024-2520

 siarchives.si.edu <http://siarchives.si.edu/> | @SmithsonianArch
 <https://twitter.com/smithsonianarch> | Facebook
 <https://www.facebook.com/SmithsonianInstitutionArchives> | e-newsletter
 <http://visitor.r20.constantcontact.com/manage/optin/ea?v=0010Oqxbncv4WpyheEee3Q9DHdF_192SxMMIWgsXuMG1qJ5yKPErzu0TI5d4qyMxK4iLMccSoQG5ck%3D>

 A gift
 <http://siarchives.si.edu/about/donate-smithsonian-institution-archives>
 in support of the Archives will help make more of our collections
 accessible!



Re: [CODE4LIB] talking about digital collections vs electronic resources

2015-03-18 Thread Matt Sherman
I haven't done any testing on that, but your understanding is the
conventional one in the field.

On Wed, Mar 18, 2015 at 12:22 PM, Derek Merleaux derek.merle...@gmail.com
wrote:

 I've always been inclined to use "digital collections" to talk about a
 collection of things that have been digitized, perhaps including born-
 digital things that are part of a collection in an archival sort of way.
 I prefer the term "electronic resources" for the databases and other
 things...
 -Derek

 On Wed, Mar 18, 2015 at 12:04 PM, Jenn C jen...@gmail.com wrote:

  Hi-
 
  We're having a discussion about some web site labeling and navigation. We
  have a list of "digital collections", which are collections that contain
  items we've digitized. There was concern expressed that if we have
  something labeled "digital collections", patrons might think that
  includes databases and other items.
 
  Has anyone done user testing around this, or have any experience/ideas
  about how to handle the difference between these?
 
  Thanks!
  jenn
 



Re: [CODE4LIB] Code4Lib NE

2015-03-14 Thread Matt Sherman
As a quick follow up, there is now more information on the wiki:

http://wiki.code4lib.org/NECode4lib_2015_Home

We still have a few mechanics to work out, but if people are interested in
coming and speaking on something, you can sign up to talk on the wiki page.
We hope to have the registration and other things up soon as well.

Matt Sherman

On Wed, Mar 11, 2015 at 10:42 AM, Matt Bernhardt matt.j.bernha...@gmail.com
 wrote:

 To add to Matt's comment - that date and venue are now confirmed:

 Friday, May 29, 2015
 MIT Campus

 Other details are coming. Everything will be posted to the code4lib wiki,
 and the URL will be announced on this list and elsewhere.

 The planning group is coordinating our activities in a Google Group, which
 you can join here:
 https://groups.google.com/forum/#!forum/code4lib-ne

 Thanks,
 Matt Bernhardt



 On Wed, Mar 11, 2015 at 10:29 AM, Matthew Sherman 
 matt.r.sher...@gmail.com
 wrote:

  We just were discussing it this morning.  We will have more details on
  the wiki soon.
 
  On Wed, Mar 11, 2015 at 10:28 AM, Whitni Watkins
  whitni.watk...@gmail.com wrote:
   Wiki Update: A tentative date of Friday May 29, 2015 on the MIT campus
  in Cambridge, MA has been discussed. More details will be provided in
  February.
  
   Is there any further confirming information on Code4Lib NE this May?
  
   Thanks!
   Whitni Watkins