Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
David,

That sounds like a definite option. Thanks. Does S3 have an API for uploading so 
that the upload process could be scripted, or do you have to upload each file manually?
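
Roughly what I have in mind, if the API supports it, is something like this Python sketch (using the boto library; the bucket name and local path are made up for illustration):

    import os
    import boto

    # Made-up values for illustration only.
    BUCKET_NAME = 'example-digital-collections'
    LOCAL_ROOT = '/data/collections'

    conn = boto.connect_s3()              # credentials come from the environment
    bucket = conn.get_bucket(BUCKET_NAME)

    for dirpath, dirnames, filenames in os.walk(LOCAL_ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Use the path relative to the collection root as the S3 key,
            # so the object's URL mirrors its original storage location.
            key = bucket.new_key(os.path.relpath(path, LOCAL_ROOT))
            key.set_contents_from_filename(path)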

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
ddwigg...@historicnewengland.org
Sent: Thursday, January 10, 2013 4:29 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

We built our own solution for this by creating a plugin that works with our 
digital asset management system (ResourceSpace) to individually back up files to 
Amazon S3. Because S3 is replicated to multiple data centers, this provides a 
fairly high level of redundancy. And because it's an object-based web service, 
we can access any given object individually by using a URL related to the 
original storage URL within our system.
 
This also allows us to take advantage of S3 for images on our website. All of 
the images in our online collections database are being served straight 
from S3, which diverts the load from our public web server. When we launch 
zoomable images later this year, all of the tiles will also be generated 
locally in the DAM and then served to the public via the mirrored copy in S3.
 
The current pricing is around $0.08/GB/month for 1-50 TB, which I think is 
fairly reasonable for what we're getting. They just dropped the price 
substantially a few months ago.
 
DuraCloud http://www.duracloud.org/ supposedly offers a way to add another 
abstraction layer so you can build something like this that is portable between 
different cloud storage providers. But I haven't really looked into this as of 
yet.
 
-David

 
__
 
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 994-5948
ddwigg...@historicnewengland.org
http://www.historicnewengland.org
 >>> Joshua Welker <jwel...@sbuniv.edu> 1/10/2013 5:20 PM >>>
Hi everyone,

We are starting a digitization project for some of our special collections, and 
we are having a hard time setting up a backup system that meets the long-term 
preservation needs of digital archives. The backup mechanisms currently used by 
campus IT are short-term full-server backups. What we are looking for is more 
granular, file-level backup over the very long term. Does anyone have any 
recommendations of software or some service or technique? We are looking into 
LOCKSS but haven't dug too deeply yet. Can anyone who uses LOCKSS tell me a bit 
of their experiences with it?

Josh Welker
Electronic/Media Services Librarian
College Liaison
University Libraries
Southwest Baptist University
417.328.1624


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Gary McGath
Concerns have been raised about how expensive Glacier gets if you need
to recover a lot of files in a short time period.

http://www.wired.com/wiredenterprise/2012/08/glacier/

On 1/10/13 5:56 PM, Roy Tennant wrote:
 I'd also take a look at Amazon Glacier. Recently I parked about 50GB
 of data files in logical tar'd and gzip'd chunks and it's costing my
 employer less than 50 cents/month. Glacier, however, is best for "park
 it and forget" kinds of needs, as the real cost is in data flow.
 Storage is cheap, but must be considered "offline" or "near line" as
 you must first request to retrieve a file, wait for about a day, and
 then retrieve the file. And you're charged more for the download
 throughput than just about anything.
 
 I'm using a Unix client to handle all of the heavy lifting of
 uploading and downloading, as Glacier is meant to be used via an API
 rather than a web client.[1] If anyone is interested, I have local
 documentation on usage that I could probably genericize. And yes, I
 did round-trip a file to make sure it functioned as advertised.
 Roy
 
 [1] https://github.com/vsespb/mt-aws-glacier
 
 On Thu, Jan 10, 2013 at 2:29 PM,  ddwigg...@historicnewengland.org wrote:
 We built our own solution for this by creating a plugin that works with our 
 digital asset management system (ResourceSpace) to individually back up files 
 to Amazon S3. Because S3 is replicated to multiple data centers, this 
 provides a fairly high level of redundancy. And because it's an object-based 
 web service, we can access any given object individually by using a URL 
 related to the original storage URL within our system.

 This also allows us to take advantage of S3 for images on our website. All 
 of the images in our online collections database are being served 
 straight from S3, which diverts the load from our public web server. When we 
 launch zoomable images later this year, all of the tiles will also be 
 generated locally in the DAM and then served to the public via the mirrored 
 copy in S3.

 The current pricing is around $0.08/GB/month for 1-50 TB, which I think is 
 fairly reasonable for what we're getting. They just dropped the price 
 substantially a few months ago.

 DuraCloud http://www.duracloud.org/ supposedly offers a way to add another 
 abstraction layer so you can build something like this that is portable 
 between different cloud storage providers. But I haven't really looked into 
 this as of yet.


-- 
Gary McGath, Professional Software Developer
http://www.garymcgath.com


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
Glacier sounds even better than S3 for what we're looking for. We are only 
going to be retrieving the files in the case of corruption, so the 
pay-per-retrieval model would work well. I heard of Glacier in the past but 
forgot all about it. Thank you.
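
A rough sketch of how the upload side might be scripted in Python with the boto library (vault name, paths, and the tar step are made up for illustration; the mt-aws-glacier Perl client Roy mentions below presumably handles this sort of thing more robustly):

    import tarfile
    from boto.glacier.layer2 import Layer2

    # Made-up vault name and paths for illustration only.
    VAULT_NAME = 'digital-collections-backup'
    SOURCE_DIR = '/data/collections/collection-001'
    ARCHIVE = '/tmp/collection-001.tar.gz'

    # Bundle a logical chunk of the collection into a tar'd, gzip'd archive.
    with tarfile.open(ARCHIVE, 'w:gz') as tar:
        tar.add(SOURCE_DIR, arcname='collection-001')

    layer2 = Layer2()                     # credentials come from the environment
    vault = layer2.get_vault(VAULT_NAME)

    # Glacier has no browsable file listing, so the returned archive ID
    # must be recorded locally to retrieve the chunk later.
    archive_id = vault.upload_archive(ARCHIVE, description='collection-001')
    print(archive_id)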

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Roy 
Tennant
Sent: Thursday, January 10, 2013 4:56 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

I'd also take a look at Amazon Glacier. Recently I parked about 50GB of data 
files in logical tar'd and gzip'd chunks and it's costing my employer less than 
50 cents/month. Glacier, however, is best for "park it and forget" kinds of 
needs, as the real cost is in data flow.
Storage is cheap, but must be considered "offline" or "near line" as you must 
first request to retrieve a file, wait for about a day, and then retrieve the 
file. And you're charged more for the download throughput than just about 
anything.

I'm using a Unix client to handle all of the heavy lifting of uploading and 
downloading, as Glacier is meant to be used via an API rather than a web 
client.[1] If anyone is interested, I have local documentation on usage that I 
could probably genericize. And yes, I did round-trip a file to make sure it 
functioned as advertised.
Roy

[1] https://github.com/vsespb/mt-aws-glacier

On Thu, Jan 10, 2013 at 2:29 PM,  ddwigg...@historicnewengland.org wrote:
 We built our own solution for this by creating a plugin that works with our 
 digital asset management system (ResourceSpace) to individually back up files 
 to Amazon S3. Because S3 is replicated to multiple data centers, this 
 provides a fairly high level of redundancy. And because it's an object-based 
 web service, we can access any given object individually by using a URL 
 related to the original storage URL within our system.

 This also allows us to take advantage of S3 for images on our website. All of 
 the images in our online collections database are being served straight 
 from S3, which diverts the load from our public web server. When we launch 
 zoomable images later this year, all of the tiles will also be generated 
 locally in the DAM and then served to the public via the mirrored copy in S3.

 The current pricing is around $0.08/GB/month for 1-50 TB, which I think is 
 fairly reasonable for what we're getting. They just dropped the price 
 substantially a few months ago.

 DuraCloud http://www.duracloud.org/ supposedly offers a way to add another 
 abstraction layer so you can build something like this that is portable 
 between different cloud storage providers. But I haven't really looked into 
 this as of yet.

 -David


 __

 David Dwiggins
 Systems Librarian/Archivist, Historic New England
 141 Cambridge Street, Boston, MA 02114
 (617) 994-5948
 ddwigg...@historicnewengland.org
 http://www.historicnewengland.org
 >>> Joshua Welker <jwel...@sbuniv.edu> 1/10/2013 5:20 PM >>>
 Hi everyone,

 We are starting a digitization project for some of our special collections, 
 and we are having a hard time setting up a backup system that meets the 
 long-term preservation needs of digital archives. The backup mechanisms 
 currently used by campus IT are short-term full-server backups. What we are 
 looking for is more granular, file-level backup over the very long term. Does 
 anyone have any recommendations of software or some service or technique? We 
 are looking into LOCKSS but haven't dug too deeply yet. Can anyone who uses 
 LOCKSS tell me a bit of their experiences with it?

 Josh Welker
 Electronic/Media Services Librarian
 College Liaison
 University Libraries
 Southwest Baptist University
 417.328.1624


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
Good point. But since campus IT will be creating regular disaster-recovery 
backups, the odds that we'd ever need to retrieve more than a handful of 
files from Glacier at a time are pretty low. 

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Gary 
McGath
Sent: Friday, January 11, 2013 8:03 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Concerns have been raised about how expensive Glacier gets if you need to 
recover a lot of files in a short time period.

http://www.wired.com/wiredenterprise/2012/08/glacier/

On 1/10/13 5:56 PM, Roy Tennant wrote:
 I'd also take a look at Amazon Glacier. Recently I parked about 50GB 
 of data files in logical tar'd and gzip'd chunks and it's costing my 
 employer less than 50 cents/month. Glacier, however, is best for "park 
 it and forget" kinds of needs, as the real cost is in data flow.
 Storage is cheap, but must be considered "offline" or "near line" as 
 you must first request to retrieve a file, wait for about a day, and 
 then retrieve the file. And you're charged more for the download 
 throughput than just about anything.
 
 I'm using a Unix client to handle all of the heavy lifting of 
 uploading and downloading, as Glacier is meant to be used via an API 
 rather than a web client.[1] If anyone is interested, I have local 
 documentation on usage that I could probably genericize. And yes, I 
 did round-trip a file to make sure it functioned as advertised.
 Roy
 
 [1] https://github.com/vsespb/mt-aws-glacier
 
 On Thu, Jan 10, 2013 at 2:29 PM,  ddwigg...@historicnewengland.org wrote:
 We built our own solution for this by creating a plugin that works with our 
 digital asset management system (ResourceSpace) to individually back up files 
 to Amazon S3. Because S3 is replicated to multiple data centers, this 
 provides a fairly high level of redundancy. And because it's an object-based 
 web service, we can access any given object individually by using a URL 
 related to the original storage URL within our system.

 This also allows us to take advantage of S3 for images on our website. All 
 of the images in our online collections database are being 
 straight from S3, which diverts the load from our public web server. When we 
 launch zoomable images later this year, all of the tiles will also be 
 generated locally in the DAM and then served to the public via the mirrored 
 copy in S3.

 The current pricing is around $0.08/GB/month for 1-50 TB, which I think is 
 fairly reasonable for what we're getting. They just dropped the price 
 substantially a few months ago.

 DuraCloud http://www.duracloud.org/ supposedly offers a way to add another 
 abstraction layer so you can build something like this that is portable 
 between different cloud storage providers. But I haven't really looked into 
 this as of yet.


--
Gary McGath, Professional Software Developer http://www.garymcgath.com


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Matt Schultz
Hi Josh,

Glad you are looking into LOCKSS as a potential solution for your needs and
that you are thinking beyond simple backup solutions for more long-term
preservation. Here at MetaArchive Cooperative we make use of LOCKSS to
preserve a range of content/collections from our member institutions.

The nice thing (I think) about our approach and our use of LOCKSS as an
embedded technology is that you as an institution retain full control over
your collections in the preservation network and get to play an active and
on-going part in their preservation treatment over time. Storage costs in
MetaArchive are competitive ($1/GB/year), and with that you get up to 7
geographic replications. MetaArchive is international at this point and so
your collections really do achieve some safe distance from any disasters
that may hit close to home.

I'd be more than happy to talk with you further about your collection
needs, why we like LOCKSS, and any interest your institution may have in
being part of a collaborative approach to preserving your content above and
beyond simple backup. Feel free to contact me directly.

Matt Schultz
Program Manager
Educopia Institute, MetaArchive Cooperative
http://www.metaarchive.org
matt.schu...@metaarchive.org
616-566-3204

On Thu, Jan 10, 2013 at 5:20 PM, Joshua Welker jwel...@sbuniv.edu wrote:

 Hi everyone,

 We are starting a digitization project for some of our special
 collections, and we are having a hard time setting up a backup system that
 meets the long-term preservation needs of digital archives. The backup
 mechanisms currently used by campus IT are short-term full-server backups.
 What we are looking for is more granular, file-level backup over the very
 long term. Does anyone have any recommendations of software or some service
 or technique? We are looking into LOCKSS but haven't dug too deeply yet.
 Can anyone who uses LOCKSS tell me a bit of their experiences with it?

 Josh Welker
 Electronic/Media Services Librarian
 College Liaison
 University Libraries
 Southwest Baptist University
 417.328.1624




-- 
Matt Schultz
Program Manager
Educopia Institute, MetaArchive Cooperative
http://www.metaarchive.org
matt.schu...@metaarchive.org
616-566-3204


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Al Matthews
We use LOCKSS as part of MetaArchive. LOCKSS, as I understand it, is
typically spec'd for consumer hardware, and so, presumably as a result of
SE Asia flooding, there have been some drive failures and cache downtimes
and adjustments accordingly.

However, that is the worst of it, and I mention it first to get it out of the way.

LOCKSS is, to some perhaps even considerable degree, tamper-resistant, since
it relies on mechanisms of collective polling among multiple copies to
preserve integrity. This is in contrast to static checksums or some other
solution.
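
As a toy illustration of the polling idea (this is not the actual LOCKSS protocol, just the gist): each box hashes its copy, compares against its peers, and repairs itself from a peer when it disagrees with the majority. A minimal Python sketch, with made-up box names:

    import hashlib
    from collections import Counter

    def poll(copies):
        """copies maps a box name to the bytes that box holds for one file.
        Returns the winning hash and the boxes whose copies disagree."""
        hashes = {box: hashlib.sha256(data).hexdigest()
                  for box, data in copies.items()}
        winner, _ = Counter(hashes.values()).most_common(1)[0]
        losers = [box for box, h in hashes.items() if h != winner]
        return winner, losers

    copies = {'box1': b'page', 'box2': b'page', 'box3': b'pagX',
              'box4': b'page', 'box5': b'page'}
    winner, losers = poll(copies)
    print(losers)   # ['box3']; that box would repair its copy from a peer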

As such, it seems to me important to run a LOCKSS box with other LOCKSS
boxes; the MetaArchive cooperative specifies six or so distributed locations for each
cache.

The economic sustainability of such an enterprise is a valid question.
David S H Rosenthal at Stanford seems to lead the charge for this research.

e.g. http://blog.dshr.org/2012/08/amazons-announcement-of-glacier.html#more

I've heard mention from other players that they watch MA carefully for
such sustainability considerations, especially because MA uses LOCKSS for
non-journal content. In some sense this may extend LOCKSS beyond its
original design.

MetaArchive has in my opinion been extremely responsible in designating
succession scenarios and disaster recovery scenarios, going so far as to
fund, develop, and test services for migration out of the system, into an
iRODS repository in the initial case.


Al Matthews
AUC Robert W. Woodruff Library

On 1/11/13 9:10 AM, Joshua Welker jwel...@sbuniv.edu wrote:

Good point. But since campus IT will be creating regular
disaster-recovery backups, the odds that we'd ever need to retrieve
more than a handful of files from Glacier at a time are pretty low.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Gary McGath
Sent: Friday, January 11, 2013 8:03 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Concerns have been raised about how expensive Glacier gets if you need to
recover a lot of files in a short time period.

http://www.wired.com/wiredenterprise/2012/08/glacier/

On 1/10/13 5:56 PM, Roy Tennant wrote:
 I'd also take a look at Amazon Glacier. Recently I parked about 50GB
 of data files in logical tar'd and gzip'd chunks and it's costing my
 employer less than 50 cents/month. Glacier, however, is best for "park
 it and forget" kinds of needs, as the real cost is in data flow.
 Storage is cheap, but must be considered "offline" or "near line" as
 you must first request to retrieve a file, wait for about a day, and
 then retrieve the file. And you're charged more for the download
 throughput than just about anything.

 I'm using a Unix client to handle all of the heavy lifting of
 uploading and downloading, as Glacier is meant to be used via an API
 rather than a web client.[1] If anyone is interested, I have local
 documentation on usage that I could probably genericize. And yes, I
 did round-trip a file to make sure it functioned as advertised.
 Roy

 [1] https://github.com/vsespb/mt-aws-glacier

 On Thu, Jan 10, 2013 at 2:29 PM,  ddwigg...@historicnewengland.org
wrote:
 We built our own solution for this by creating a plugin that works
with our digital asset management system (ResourceSpace) to individually
back up files to Amazon S3. Because S3 is replicated to multiple data
centers, this provides a fairly high level of redundancy. And because
it's an object-based web service, we can access any given object
individually by using a URL related to the original storage URL within
our system.

 This also allows us to take advantage of S3 for images on our website.
All of the images in our online collections database are being
served straight from S3, which diverts the load from our public web
server. When we launch zoomable images later this year, all of the
tiles will also be generated locally in the DAM and then served to the
public via the mirrored copy in S3.

 The current pricing is around $0.08/GB/month for 1-50 TB, which I
think is fairly reasonable for what we're getting. They just dropped
the price substantially a few months ago.

 DuraCloud http://www.duracloud.org/ supposedly offers a way to add
another abstraction layer so you can build something like this that is
portable between different cloud storage providers. But I haven't
really looked into this as of yet.


--
Gary McGath, Professional Software Developer http://www.garymcgath.com



Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
Thanks, Al. I think we'd join a LOCKSS network rather than run multiple LOCKSS 
boxes ourselves. Does anyone have any experience with one of those, like the 
LOCKSS Global Alliance?

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Al 
Matthews
Sent: Friday, January 11, 2013 8:50 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

We use LOCKSS as part of MetaArchive. LOCKSS, as I understand it, is typically 
spec'd for consumer hardware, and so, presumably as a result of SE Asia 
flooding, there have been some drive failures and cache downtimes and 
adjustments accordingly.

However, that is the worst of it, and I mention it first to get it out of the way.

LOCKSS is, to some perhaps even considerable degree, tamper-resistant, since it 
relies on mechanisms of collective polling among multiple copies to preserve 
integrity. This is in contrast to static checksums or some other solution.

As such, it seems to me important to run a LOCKSS box with other LOCKSS boxes; 
the MetaArchive cooperative specifies six or so distributed locations for each cache.

The economic sustainability of such an enterprise is a valid question.
David S H Rosenthal at Stanford seems to lead the charge for this research.

e.g. http://blog.dshr.org/2012/08/amazons-announcement-of-glacier.html#more

I've heard mention from other players that they watch MA carefully for such 
sustainability considerations, especially because MA uses LOCKSS for 
non-journal content. In some sense this may extend LOCKSS beyond its original 
design.

MetaArchive has in my opinion been extremely responsible in designating 
succession scenarios and disaster recovery scenarios, going so far as to fund, 
develop, and test services for migration out of the system, into an iRODS 
repository in the initial case.


Al Matthews
AUC Robert W. Woodruff Library

On 1/11/13 9:10 AM, Joshua Welker jwel...@sbuniv.edu wrote:

Good point. But since campus IT will be creating regular 
disaster-recovery backups, the odds that we'd ever need to 
retrieve more than a handful of files from Glacier at a time are pretty low.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
Gary McGath
Sent: Friday, January 11, 2013 8:03 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Concerns have been raised about how expensive Glacier gets if you need 
to recover a lot of files in a short time period.

http://www.wired.com/wiredenterprise/2012/08/glacier/

On 1/10/13 5:56 PM, Roy Tennant wrote:
 I'd also take a look at Amazon Glacier. Recently I parked about 50GB 
 of data files in logical tar'd and gzip'd chunks and it's costing my 
 employer less than 50 cents/month. Glacier, however, is best for 
 "park it and forget" kinds of needs, as the real cost is in data flow.
 Storage is cheap, but must be considered "offline" or "near line" as 
 you must first request to retrieve a file, wait for about a day, and 
 then retrieve the file. And you're charged more for the download 
 throughput than just about anything.

 I'm using a Unix client to handle all of the heavy lifting of 
 uploading and downloading, as Glacier is meant to be used via an API 
 rather than a web client.[1] If anyone is interested, I have local 
 documentation on usage that I could probably genericize. And yes, I 
 did round-trip a file to make sure it functioned as advertised.
 Roy

 [1] https://github.com/vsespb/mt-aws-glacier

 On Thu, Jan 10, 2013 at 2:29 PM,  ddwigg...@historicnewengland.org
wrote:
 We built our own solution for this by creating a plugin that works 
with our digital asset management system (ResourceSpace) to 
individually back up files to Amazon S3. Because S3 is replicated to 
multiple data centers, this provides a fairly high level of 
redundancy. And because it's an object-based web service, we can 
access any given object individually by using a URL related to the 
original storage URL within our system.

 This also allows us to take advantage of S3 for images on our website.
All of the images in our online collections database are being 
served straight from S3, which diverts the load from our public web 
server. When we launch zoomable images later this year, all of the 
tiles will also be generated locally in the DAM and then served to 
the public via the mirrored copy in S3.

 The current pricing is around $0.08/GB/month for 1-50 TB, which I 
think is fairly reasonable for what we're getting. They just dropped 
the price substantially a few months ago.

 DuraCloud http://www.duracloud.org/ supposedly offers a way to add 
another abstraction layer so you can build something like this that 
is portable between different cloud storage providers. But I haven't 
really looked into this as of yet.


--
Gary McGath, Professional Software Developer http://www.garymcgath.com



Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Al Matthews
http://metaarchive.org/costs in our case. Interested to hear other
experiences. Al


On 1/11/13 10:01 AM, Joshua Welker jwel...@sbuniv.edu wrote:

Thanks, Al. I think we'd join a LOCKSS network rather than run multiple
LOCKSS boxes ourselves. Does anyone have any experience with one of
those, like the LOCKSS Global Alliance?

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Al Matthews
Sent: Friday, January 11, 2013 8:50 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

We use LOCKSS as part of MetaArchive. LOCKSS, as I understand it, is
typically spec'd for consumer hardware, and so, presumably as a result of
SE Asia flooding, there have been some drive failures and cache downtimes
and adjustments accordingly.

However, that is the worst of it, and I mention it first to get it out of the way.

LOCKSS is, to some perhaps even considerable degree, tamper-resistant,
since it relies on mechanisms of collective polling among multiple copies
to preserve integrity. This is in contrast to static checksums or some other
solution.

As such, it seems to me important to run a LOCKSS box with other LOCKSS
boxes; the MetaArchive cooperative specifies six or so distributed locations for each
cache.

The economic sustainability of such an enterprise is a valid question.
David S H Rosenthal at Stanford seems to lead the charge for this
research.

e.g.
http://blog.dshr.org/2012/08/amazons-announcement-of-glacier.html#more

I've heard mention from other players that they watch MA carefully for
such sustainability considerations, especially because MA uses LOCKSS for
non-journal content. In some sense this may extend LOCKSS beyond its
original design.

MetaArchive has in my opinion been extremely responsible in designating
succession scenarios and disaster recovery scenarios, going so far as to
fund, develop, and test services for migration out of the system, into an
iRODS repository in the initial case.


Al Matthews
AUC Robert W. Woodruff Library

On 1/11/13 9:10 AM, Joshua Welker jwel...@sbuniv.edu wrote:

Good point. But since campus IT will be creating regular
disaster-recovery backups, the odds that we'd ever need to
retrieve more than a handful of files from Glacier at a time are pretty
low.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Gary McGath
Sent: Friday, January 11, 2013 8:03 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Concerns have been raised about how expensive Glacier gets if you need
to recover a lot of files in a short time period.

http://www.wired.com/wiredenterprise/2012/08/glacier/

On 1/10/13 5:56 PM, Roy Tennant wrote:
 I'd also take a look at Amazon Glacier. Recently I parked about 50GB
 of data files in logical tar'd and gzip'd chunks and it's costing my
 employer less than 50 cents/month. Glacier, however, is best for
 "park it and forget" kinds of needs, as the real cost is in data flow.
 Storage is cheap, but must be considered "offline" or "near line" as
 you must first request to retrieve a file, wait for about a day, and
 then retrieve the file. And you're charged more for the download
 throughput than just about anything.

 I'm using a Unix client to handle all of the heavy lifting of
 uploading and downloading, as Glacier is meant to be used via an API
 rather than a web client.[1] If anyone is interested, I have local
 documentation on usage that I could probably genericize. And yes, I
 did round-trip a file to make sure it functioned as advertised.
 Roy

 [1] https://github.com/vsespb/mt-aws-glacier

 On Thu, Jan 10, 2013 at 2:29 PM,  ddwigg...@historicnewengland.org
wrote:
 We built our own solution for this by creating a plugin that works
with our digital asset management system (ResourceSpace) to
individually back up files to Amazon S3. Because S3 is replicated to
multiple data centers, this provides a fairly high level of
redundancy. And because it's an object-based web service, we can
access any given object individually by using a URL related to the
original storage URL within our system.

 This also allows us to take advantage of S3 for images on our website.
All of the images in our online collections database are being
served straight from S3, which diverts the load from our public web
server. When we launch zoomable images later this year, all of the
tiles will also be generated locally in the DAM and then served to
the public via the mirrored copy in S3.

 The current pricing is around $0.08/GB/month for 1-50 TB, which I
think is fairly reasonable for what we're getting. They just dropped
the price substantially a few months ago.

 DuraCloud http://www.duracloud.org/ supposedly offers a way to add
another abstraction layer so you can build something like this that
is portable between different cloud storage providers. But I haven't
really looked into this as of yet.


--
Gary McGath, Professional Software Developer 

Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread James Gilbert
Hi Josh,

I lurked on this thread, as I did not know the size of your institution.

Being a public library serving about 24,000 residents, we have the
small-institution issues for this type of project as well. We recently
tackled a similar situation, and here is our solution:

1) Purchase a 3TB Seagate external network storage device (residential drive
from Best Buy)
2) Burn archived materials to DVD
3) Copy files to external storage (on site in my server room)
4) DVDs reside off-site (we are still determining where this would be, as
the library does not have a Safe Deposit Box)

This removes external companies, and the data is a quick trip home and back.

I know it is not elaborate or fancy, and there is very little code... but it was $150
for the drive, plus the cost of DVDs. 

James Gilbert, BS, MLIS
Systems Librarian
Whitehall Township Public Library
3700 Mechanicsville Road
Whitehall, PA 18052
 
610-432-4330 ext: 203


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Joshua Welker
Sent: Friday, January 11, 2013 10:09 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Matt,

I appreciate the information. At that price, it looks like MetaArchive would
be a better option than most of the other services mentioned in this thread.
At this point, I think it is going to come down to a LOCKSS solution such as
what MetaArchive provides or Amazon Glacier. We anticipate our digital
collection growing to about 3TB in the first two years. With Glacier, that
would be $368 per year vs $3,072 per year for MetaArchive and LOCKSS. As
much as I would like to support library initiatives like LOCKSS, we are a
small institution with a very small budget, and the pricing of Glacier is
starting to look too good to pass up.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Matt
Schultz
Sent: Friday, January 11, 2013 8:49 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Hi Josh,

Glad you are looking into LOCKSS as a potential solution for your needs and
that you are thinking beyond simple backup solutions for more long-term
preservation. Here at MetaArchive Cooperative we make use of LOCKSS to
preserve a range of content/collections from our member institutions.

The nice thing (I think) about our approach and our use of LOCKSS as an
embedded technology is that you as an institution retain full control over
your collections in the preservation network and get to play an active and
on-going part in their preservation treatment over time. Storage costs in
MetaArchive are competitive ($1/GB/year), and with that you get up to 7
geographic replications. MetaArchive is international at this point and so
your collections really do achieve some safe distance from any disasters
that may hit close to home.

I'd be more than happy to talk with you further about your collection needs,
why we like LOCKSS, and any interest your institution may have in being part
of a collaborative approach to preserving your content above and beyond
simple backup. Feel free to contact me directly.

Matt Schultz
Program Manager
Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org
matt.schu...@metaarchive.org
616-566-3204

On Thu, Jan 10, 2013 at 5:20 PM, Joshua Welker jwel...@sbuniv.edu wrote:

 Hi everyone,

 We are starting a digitization project for some of our special 
 collections, and we are having a hard time setting up a backup system 
 that meets the long-term preservation needs of digital archives. The 
 backup mechanisms currently used by campus IT are short-term full-server
backups.
 What we are looking for is more granular, file-level backup over the 
 very long term. Does anyone have any recommendations of software or 
 some service or technique? We are looking into LOCKSS but haven't dug too
deeply yet.
 Can anyone who uses LOCKSS tell me a bit of their experiences with it?

 Josh Welker
 Electronic/Media Services Librarian
 College Liaison
 University Libraries
 Southwest Baptist University
 417.328.1624




--
Matt Schultz
Program Manager
Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org
matt.schu...@metaarchive.org
616-566-3204


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
James,

Definitely a simple and elegant solution, but that is not a viable long-term 
option for us. We currently have tons of old CDs and DVDs full of data, and one 
of our goals is to wean off those media completely.  Most consumer-grade CDs 
and DVDs are very poor in terms of long-term data integrity. Those discs have a 
shelf life of probably a decade or two tops. Plus we want more 
redundancy than what is offered by having the backups as a collection of discs 
in a single physical location. But if that works for you guys, power to you. 
Cheap is good.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of James 
Gilbert
Sent: Friday, January 11, 2013 9:34 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Hi Josh,

I lurked on this thread, as I did not know the size of your institution.

Being a public library serving about 24,000 residents, we have the 
small-institution issues for this type of project as well. We recently tackled 
a similar situation, and here is our solution:

1) Purchase a 3TB Seagate external network storage device (residential drive 
from Best Buy)
2) Burn archived materials to DVD
3) Copy files to external storage (on site in my server room)
4) DVDs reside off-site (we are still determining where this would be, as the 
library does not have a Safe Deposit Box)

This removes external companies, and the data is a quick trip home and back.

I know it is not elaborate or fancy, and there is very little code... but it was $150 for 
the drive, plus the cost of DVDs. 

James Gilbert, BS, MLIS
Systems Librarian
Whitehall Township Public Library
3700 Mechanicsville Road
Whitehall, PA 18052
 
610-432-4330 ext: 203


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Joshua 
Welker
Sent: Friday, January 11, 2013 10:09 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Matt,

I appreciate the information. At that price, it looks like MetaArchive would be 
a better option than most of the other services mentioned in this thread.
At this point, I think it is going to come down to a LOCKSS solution such as 
what MetaArchive provides or Amazon Glacier. We anticipate our digital 
collection growing to about 3TB in the first two years. With Glacier, that 
would be $368 per year vs $3,072 per year for MetaArchive and LOCKSS. As much 
as I would like to support library initiatives like LOCKSS, we are a small 
institution with a very small budget, and the pricing of Glacier is starting to 
look too good to pass up.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Matt 
Schultz
Sent: Friday, January 11, 2013 8:49 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Hi Josh,

Glad you are looking into LOCKSS as a potential solution for your needs and 
that you are thinking beyond simple backup solutions for more long-term 
preservation. Here at MetaArchive Cooperative we make use of LOCKSS to preserve 
a range of content/collections from our member institutions.

The nice thing (I think) about our approach and our use of LOCKSS as an 
embedded technology is that you as an institution retain full control over your 
collections in the preservation network and get to play an active and on-going 
part in their preservation treatment over time. Storage costs in MetaArchive 
are competitive ($1/GB/year), and with that you get up to 7 geographic 
replications. MetaArchive is international at this point and so your 
collections really do achieve some safe distance from any disasters that may 
hit close to home.

I'd be more than happy to talk with you further about your collection needs, 
why we like LOCKSS, and any interest your institution may have in being part of 
a collaborative approach to preserving your content above and beyond simple 
backup. Feel free to contact me directly.

Matt Schultz
Program Manager
Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org 
matt.schu...@metaarchive.org
616-566-3204

On Thu, Jan 10, 2013 at 5:20 PM, Joshua Welker jwel...@sbuniv.edu wrote:

 Hi everyone,

 We are starting a digitization project for some of our special 
 collections, and we are having a hard time setting up a backup system 
 that meets the long-term preservation needs of digital archives. The 
 backup mechanisms currently used by campus IT are short-term 
 full-server
backups.
 What we are looking for is more granular, file-level backup over the 
 very long term. Does anyone have any recommendations of software or 
 some service or technique? We are looking into LOCKSS but haven't dug 
 too
deeply yet.
 Can anyone who uses LOCKSS tell me a bit of their experiences with it?

 Josh Welker
 Electronic/Media Services Librarian
 College Liaison
 University Libraries
 Southwest Baptist University
 417.328.1624




--
Matt Schultz
Program 

Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Matt Schultz
Josh,

Totally understand the resource constraints and the price comparison
up-front. As Roy alluded to earlier, it pays with Glacier to envision what
your content retrieval scenarios might be, because that $368 up-front could
very easily balloon in situations where you need to restore a
collection(s) en masse at a later date. Amazon Glacier as a service makes
their money on that end. In MetaArchive there is currently no charge for
collection retrieval for the sake of a restoration. You are also subject
to Amazon's price hikes with Glacier over the long term, and powerless to prevent them.
Because we are a Cooperative, our members collaboratively work together
annually to determine technology preferences, vendors, pricing, cost
control, etc. You have a direct seat at the table to help steer the
solution in your direction.

On Fri, Jan 11, 2013 at 10:09 AM, Joshua Welker jwel...@sbuniv.edu wrote:

 Matt,

 I appreciate the information. At that price, it looks like MetaArchive
 would be a better option than most of the other services mentioned in this
 thread. At this point, I think it is going to come down to a LOCKSS
 solution such as what MetaArchive provides or Amazon Glacier. We anticipate
 our digital collection growing to about 3TB in the first two years. With
 Glacier, that would be $368 per year vs $3,072 per year for MetaArchive and
 LOCKSS. As much as I would like to support library initiatives like LOCKSS,
 we are a small institution with a very small budget, and the pricing of
 Glacier is starting to look too good to pass up.

 Josh Welker


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Matt Schultz
 Sent: Friday, January 11, 2013 8:49 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Digital collection backups

 Hi Josh,

 Glad you are looking into LOCKSS as a potential solution for your needs
 and that you are thinking beyond simple backup solutions for more long-term
 preservation. Here at MetaArchive Cooperative we make use of LOCKSS to
 preserve a range of content/collections from our member institutions.

 The nice thing (I think) about our approach and our use of LOCKSS as an
 embedded technology is that you as an institution retain full control over
 your collections in the preservation network and get to play an active and
 on-going part in their preservation treatment over time. Storage costs in
 MetaArchive are competitive ($1/GB/year), and with that you get up to 7
 geographic replications. MetaArchive is international at this point and so
 your collections really do achieve some safe distance from any disasters
 that may hit close to home.

 I'd be more than happy to talk with you further about your collection
 needs, why we like LOCKSS, and any interest your institution may have in
 being part of a collaborative approach to preserving your content above and
 beyond simple backup. Feel free to contact me directly.

 Matt Schultz
 Program Manager
 Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org
 matt.schu...@metaarchive.org
 616-566-3204

 On Thu, Jan 10, 2013 at 5:20 PM, Joshua Welker jwel...@sbuniv.edu wrote:

  Hi everyone,
 
  We are starting a digitization project for some of our special
  collections, and we are having a hard time setting up a backup system
  that meets the long-term preservation needs of digital archives. The
  backup mechanisms currently used by campus IT are short-term full-server
 backups.
  What we are looking for is more granular, file-level backup over the
  very long term. Does anyone have any recommendations of software or
  some service or technique? We are looking into LOCKSS but haven't dug
 too deeply yet.
  Can anyone who uses LOCKSS tell me a bit of their experiences with it?
 
  Josh Welker
  Electronic/Media Services Librarian
  College Liaison
  University Libraries
  Southwest Baptist University
  417.328.1624
 



 --
 Matt Schultz
 Program Manager
 Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org
 matt.schu...@metaarchive.org
 616-566-3204




-- 
Matt Schultz
Program Manager
Educopia Institute, MetaArchive Cooperative
http://www.metaarchive.org
matt.schu...@metaarchive.org
616-566-3204


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Cary Gordon
Restoring 3 TB from Glacier is about $370. Add about $90 if you use AWS
Import/Export (you provide the device).

Hopefully, this is not something that you would do often.
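
For what it's worth, that figure is roughly what standard data-transfer-out on the full 3 TB comes to (assuming about $0.12/GB, my estimate); Glacier's separate peak-retrieval fee can add to it, so treat it as a floor rather than an exact quote:

    gb = 3 * 1024
    transfer_out_per_gb = 0.12               # assumed data-transfer-out rate, $/GB
    print(round(gb * transfer_out_per_gb))   # ~369, in line with the $370 above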

Cary

On Fri, Jan 11, 2013 at 8:14 AM, Matt Schultz
matt.schu...@metaarchive.org wrote:
 Josh,

 Totally understand the resource constraints and the price comparison
 up-front. As Roy alluded to earlier, it pays with Glacier to envision what
 your content retrieval scenarios might be, because that $368 up-front could
 very easily balloon in situations where you need to restore a
 collection(s) en masse at a later date. Amazon Glacier as a service makes
 their money on that end. In MetaArchive there is currently no charge for
 collection retrieval for the sake of a restoration. You are also subject
 to Amazon's price hikes with Glacier over the long term, and powerless to prevent them.
 Because we are a Cooperative, our members collaboratively work together
 annually to determine technology preferences, vendors, pricing, cost
 control, etc. You have a direct seat at the table to help steer the
 solution in your direction.

 On Fri, Jan 11, 2013 at 10:09 AM, Joshua Welker jwel...@sbuniv.edu wrote:

 Matt,

 I appreciate the information. At that price, it looks like MetaArchive
 would be a better option than most of the other services mentioned in this
 thread. At this point, I think it is going to come down to a LOCKSS
 solution such as what MetaArchive provides or Amazon Glacier. We anticipate
 our digital collection growing to about 3TB in the first two years. With
 Glacier, that would be $368 per year vs $3,072 per year for MetaArchive and
 LOCKSS. As much as I would like to support library initiatives like LOCKSS,
 we are a small institution with a very small budget, and the pricing of
 Glacier is starting to look too good to pass up.

 Josh Welker


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Matt Schultz
 Sent: Friday, January 11, 2013 8:49 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Digital collection backups

 Hi Josh,

 Glad you are looking into LOCKSS as a potential solution for your needs
 and that you are thinking beyond simple backup solutions for more long-term
 preservation. Here at MetaArchive Cooperative we make use of LOCKSS to
 preserve a range of content/collections from our member institutions.

 The nice thing (I think) about our approach and our use of LOCKSS as an
 embedded technology is that you as an institution retain full control over
 your collections in the preservation network and get to play an active and
 on-going part in their preservation treatment over time. Storage costs in
 MetaArchive are competitive ($1/GB/year), and with that you get up to 7
 geographic replications. MetaArchive is international at this point and so
 your collections really do achieve some safe distance from any disasters
 that may hit close to home.

 I'd be more than happy to talk with you further about your collection
 needs, why we like LOCKSS, and any interest your institution may have in
 being part of a collaborative approach to preserving your content above and
 beyond simple backup. Feel free to contact me directly.

 Matt Schultz
 Program Manager
 Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org
 matt.schu...@metaarchive.org
 616-566-3204

 On Thu, Jan 10, 2013 at 5:20 PM, Joshua Welker jwel...@sbuniv.edu wrote:

  Hi everyone,
 
  We are starting a digitization project for some of our special
  collections, and we are having a hard time setting up a backup system
  that meets the long-term preservation needs of digital archives. The
  backup mechanisms currently used by campus IT are short-term full-server
 backups.
  What we are looking for is more granular, file-level backup over the
  very long term. Does anyone have any recommendations of software or
  some service or technique? We are looking into LOCKSS but haven't dug
 too deeply yet.
  Can anyone who uses LOCKSS tell me a bit of their experiences with it?
 
  Josh Welker
  Electronic/Media Services Librarian
  College Liaison
  University Libraries
  Southwest Baptist University
  417.328.1624
 



 --
 Matt Schultz
 Program Manager
 Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org
 matt.schu...@metaarchive.org
 616-566-3204




 --
 Matt Schultz
 Program Manager
 Educopia Institute, MetaArchive Cooperative
 http://www.metaarchive.org
 matt.schu...@metaarchive.org
 616-566-3204



-- 
Cary Gordon
The Cherry Hill Company
http://chillco.com


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Aaron Trehub
Hello Josh,

Auburn University is a member of two Private LOCKSS Networks: the MetaArchive 
Cooperative and the Alabama Digital Preservation Network (ADPNet).  Here's a 
link to a recent conference paper that describes both networks, including their 
current pricing structures:

http://conference.ifla.org/past/ifla78/216-trehub-en.pdf

LOCKSS has worked well for us so far, in part because supporting 
community-based solutions is important to us.  As you point out, however, 
Glacier is an attractive alternative, especially for institutions that may be 
more interested in low-cost, low-throughput storage and less concerned about 
entrusting their content to a commercial outfit or having to pay extra to get 
it back out.  As with most things, you pay your money--more or less, 
depending--and make your choice.  And take your risks.

Good luck with whatever solution(s) you decide on.  They need not be mutually 
exclusive.

Best,

Aaron

Aaron Trehub
Assistant Dean for Technology and Technical Services
Auburn University Libraries
231 Mell Street, RBD Library
Auburn, AL 36849-5606
Phone: (334) 844-1716
Skype: ajtrehub
E-mail: treh...@auburn.edu
URL: http://lib.auburn.edu/

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@listserv.nd.edu] On Behalf Of Joshua 
Welker
Sent: Friday, January 11, 2013 9:09 AM
To: CODE4LIB@listserv.nd.edu
Subject: Re: [CODE4LIB] Digital collection backups

Matt,

I appreciate the information. At that price, it looks like MetaArchive would be 
a better option than most of the other services mentioned in this thread. At 
this point, I think it is going to come down to a LOCKSS solution such as what 
MetaArchive provides or Amazon Glacier. We anticipate our digital collection 
growing to about 3TB in the first two years. With Glacier, that would be $368 
per year vs $3,072 per year for MetaArchive and LOCKSS. As much as I would like 
to support library initiatives like LOCKSS, we are a small institution with a 
very small budget, and the pricing of Glacier is starting to look too good to 
pass up.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Matt 
Schultz
Sent: Friday, January 11, 2013 8:49 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Hi Josh,

Glad you are looking into LOCKSS as a potential solution for your needs and 
that you are thinking beyond simple backup solutions for more long-term 
preservation. Here at MetaArchive Cooperative we make use of LOCKSS to preserve 
a range of content/collections from our member institutions.

The nice thing (I think) about our approach and our use of LOCKSS as an 
embedded technology is that you as an institution retain full control over your 
collections in the preservation network and get to play an active and on-going 
part in their preservation treatment over time. Storage costs in MetaArchive 
are competitive ($1/GB/year), and with that you get up to 7 geographic 
replications. MetaArchive is international at this point and so your 
collections really do achieve some safe distance from any disasters that may 
hit close to home.

I'd be more than happy to talk with you further about your collection needs, 
why we like LOCKSS, and any interest your institution may have in being part of 
a collaborative approach to preserving your content above and beyond simple 
backup. Feel free to contact me directly.

Matt Schultz
Program Manager
Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org 
matt.schu...@metaarchive.org
616-566-3204

On Thu, Jan 10, 2013 at 5:20 PM, Joshua Welker jwel...@sbuniv.edu wrote:

 Hi everyone,

 We are starting a digitization project for some of our special 
 collections, and we are having a hard time setting up a backup system 
 that meets the long-term preservation needs of digital archives. The 
 backup mechanisms currently used by campus IT are short-term full-server 
 backups.
 What we are looking for is more granular, file-level backup over the 
 very long term. Does anyone have any recommendations of software or 
 some service or technique? We are looking into LOCKSS but haven't dug too 
 deeply yet.
 Can anyone who uses LOCKSS tell me a bit of their experiences with it?

 Josh Welker
 Electronic/Media Services Librarian
 College Liaison
 University Libraries
 Southwest Baptist University
 417.328.1624


--
Matt Schultz
Program Manager
Educopia Institute, MetaArchive Cooperative http://www.metaarchive.org 
matt.schu...@metaarchive.org
616-566-3204


[CODE4LIB] code4lib 2013 location

2013-01-11 Thread Erik Hetzner
Hi all,

Apparently code4lib 2013 is going to be held at the UIC Forum

  http://www.uic.edu/depts/uicforum/

I assumed it would be at the conference hotel. This is just a note so
that others do not make the same assumption, since nowhere in the
information about the conference is the location made clear.

Since the conference hotel is 1 mile from the venue, I assume
transportation will be available.

best, Erik Hetzner
Sent from my free software system http://fsf.org/.




Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
Thanks, I missed the part about DuraCloud as an abstraction layer. I might look 
into hosting an install of it on the primary server running the digitization 
platform.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Tim 
Donohue
Sent: Friday, January 11, 2013 12:39 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Hi all,

Just wanted to add some additional details about DuraCloud (mentioned earlier 
in this thread), in case it is of interest to anyone.

DuraCloud essentially provides an abstraction layer (as previously
mentioned) above several cloud storage providers.  DuraCloud also provides 
additional preservation services to help manage your content in the cloud (e.g. 
integrity checks, replication across several storage providers, migration 
between storage providers, various health/status reports).

The currently supported cloud storage providers include:
- Amazon S3
- Rackspace
- SDSC

There are several other cloud storage providers that are beta-level or in 
development. These include:
- Amazon Glacier (in development)
- Chronopolis (in development)
- Azure (beta)
- iRODS (beta)
- HP Cloud (beta)

DuraCloud is open source (so you could run it on your own server), but it is 
also offered as a hosted service (through DuraSpace, my employer). 
You can also try out the hosted service for free for two months.

For much more info, see:
- http://www.duracloud.org
- Pricing for hosted service: http://duracloud.org/content/pricing
* The pricing has dropped recently to reflect market changes
- More technical info / documentation: 
https://wiki.duraspace.org/display/DURACLOUD/DuraCloud

If it's of interest, I can put folks in touch with the DuraCloud team for more 
info (or you can email i...@duracloud.org).

- Tim

--
Tim Donohue
Technical Lead for DSpace Project
DuraSpace.org


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
The only scenario I can think of where we'd need to do a full restore is if the 
server crashes, and for those cases, we are going to have typical short-term 
imaging setups in place. Our needs beyond that are to make sure our original 
files are backed up redundantly in some non-volatile location so that in the 
event a file on the local server becomes corrupt, we have a high fidelity copy 
of the original on hand to use to restore it. Since data decay, I assume, happens 
rather infrequently and over a long period of time, it's not important for us 
to be able to restore all the files at once. Like I said, if the server catches 
on fire and crashes, we have regular off-site tape-based storage to fix those 
short-term problems.
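
The checking side could be a simple script along these lines (Python, with made-up paths; the restore step would be whatever pulls the single damaged file back from the backup copy):

    import csv
    import hashlib
    import os

    # Made-up locations: the live files and a manifest of relative_path,md5 rows.
    LOCAL_ROOT = '/data/collections'
    MANIFEST = '/data/checksums.csv'

    def md5_of(path):
        h = hashlib.md5()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(1 << 20), b''):
                h.update(chunk)
        return h.hexdigest()

    damaged = []
    with open(MANIFEST) as f:
        for rel_path, expected in csv.reader(f):
            path = os.path.join(LOCAL_ROOT, rel_path)
            if not os.path.exists(path) or md5_of(path) != expected:
                damaged.append(rel_path)

    # These are the individual files to pull back from the backup copy.
    for rel_path in damaged:
        print(rel_path)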

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cary 
Gordon
Sent: Friday, January 11, 2013 10:27 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Restoring 3 TB from Glacier is about $370. Add about $90 if you use AWS Import/Export 
(you provide the device).

Hopefully, this is not something that you would do often.

Cary

On Fri, Jan 11, 2013 at 8:14 AM, Matt Schultz matt.schu...@metaarchive.org 
wrote:
 Josh,

 Totally understand the resource constraints and the price comparison 
 up-front. As Roy alluded to earlier, it pays with Glacier to envision 
 what your content retrieval scenarios might be, because that $368 
 up-front could very easily balloon in situations where you need 
 to restore a
 collection(s) en masse at a later date. Amazon Glacier as a service 
 makes their money on that end. In MetaArchive there is currently no 
 charge for collection retrieval for the sake of a restoration. You are 
 also subject to Amazon's price hikes with Glacier over the long term, and 
 powerless to prevent them.
 Because we are a Cooperative, our members collaboratively work 
 together annually to determine technology preferences, vendors, 
 pricing, cost control, etc. You have a direct seat at the table to 
 help steer the solution in your direction.

 On Fri, Jan 11, 2013 at 10:09 AM, Joshua Welker jwel...@sbuniv.edu wrote:

 Matt,

 I appreciate the information. At that price, it looks like 
 MetaArchive would be a better option than most of the other services 
 mentioned in this thread. At this point, I think it is going to come 
 down to a LOCKSS solution such as what MetaArchive provides or Amazon 
 Glacier. We anticipate our digital collection growing to about 3TB in 
 the first two years. With Glacier, that would be $368 per year vs 
 $3,072 per year for MetaArchive and LOCKSS. As much as I would like 
 to support library initiatives like LOCKSS, we are a small 
 institution with a very small budget, and the pricing of Glacier is starting 
 to look too good to pass up.

 Josh Welker


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf 
 Of Matt Schultz
 Sent: Friday, January 11, 2013 8:49 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Digital collection backups

 Hi Josh,

 Glad you are looking into LOCKSS as a potential solution for your 
 needs and that you are thinking beyond simple backup solutions for 
 more long-term preservation. Here at MetaArchive Cooperative we make 
 use of LOCKSS to preserve a range of content/collections from our member 
 institutions.

 The nice thing (I think) about our approach and our use of LOCKSS as 
 an embedded technology is that you as an institution retain full 
 control over your collections in the preservation network and get to 
 play an active and on-going part in their preservation treatment over 
 time. Storage costs in MetaArchive are competitive ($1/GB/year), and 
 with that you get up to 7 geographic replications. MetaArchive is 
 international at this point and so your collections really do achieve 
 some safe distance from any disasters that may hit close to home.

 I'd be more than happy to talk with you further about your collection 
 needs, why we like LOCKSS, and any interest your institution may have 
 in being part of a collaborative approach to preserving your content 
 above and beyond simple backup. Feel free to contact me directly.

 Matt Schultz
 Program Manager
 Educopia Institute, MetaArchive Cooperative 
 http://www.metaarchive.org matt.schu...@metaarchive.org
 616-566-3204

 On Thu, Jan 10, 2013 at 5:20 PM, Joshua Welker jwel...@sbuniv.edu wrote:

  Hi everyone,
 
  We are starting a digitization project for some of our special 
  collections, and we are having a hard time setting up a backup 
  system that meets the long-term preservation needs of digital 
  archives. The backup mechanisms currently used by campus IT are 
  short-term full-server
 backups.
  What we are looking for is more granular, file-level backup over 
  the very long term. Does anyone have any recommendations of 
  software or some service or technique? We are looking into LOCKSS 
  but haven't dug
 

Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
Thanks for bringing up the issue of the cost of making sure the data is 
consistent. We will be using DSpace for now, and I know DSpace has some 
checksum functionality built in out-of-the-box. It shouldn't be too difficult 
to write a script that loops through DSpace's checksum data and compares it 
against the files in Glacier. Reading the Glacier FAQ on Amazon's site, it 
looks like they provide an archive inventory (updated daily) that can be 
downloaded as JSON. I read some users saying that this inventory includes 
checksum data. So hopefully it will just be a matter of comparing the local 
checksum to the Glacier checksum, and that would be easy enough to script.
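
For what it's worth, a rough sketch of what that comparison script might look
like in Python. It assumes a local manifest of filename-to-checksum pairs
exported from DSpace, that the archive description set at upload time matches
those filenames, and that the inventory's checksum field ("SHA256TreeHash" in
the inventory JSON) was computed with the same algorithm as the local values;
all of those assumptions need checking, and note that Glacier's tree hash is
SHA-256-based while DSpace's checksum checker defaults to MD5, so the two may
not be directly comparable without storing your own checksums somewhere.

import json

# local_manifest.json: {"filename": "checksum", ...} -- hypothetical format,
# produced by whatever loops over DSpace's checksum data.
with open("local_manifest.json") as f:
    local = json.load(f)

# Glacier inventory JSON, as retrieved via an inventory-retrieval job.
with open("glacier_inventory.json") as f:
    inventory = json.load(f)

remote = {a.get("ArchiveDescription"): a.get("SHA256TreeHash")
          for a in inventory.get("ArchiveList", [])}

for name, checksum in sorted(local.items()):
    if name not in remote:
        print("MISSING FROM GLACIER: %s" % name)
    elif remote[name] != checksum:
        print("CHECKSUM MISMATCH: %s" % name)

for name in sorted(set(remote) - set(local)):
    print("IN GLACIER BUT NOT IN LOCAL MANIFEST: %s" % name)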

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ryan Eby
Sent: Friday, January 11, 2013 11:37 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

As Aaron alludes to your decision should base off your real needs and they 
might not be exclusive.

LOCKSS/MetaArchive might be worth the money if it is the community archival 
aspect you are going for. Depending on your institution being a participant 
might make political/mission sense regardless of the storage needs and it could 
just be a specific collection that makes sense.

Glacier is a great choice if you are looking for spreading a backup across 
regions. S3 similarly if you also want to benefit from CloudFront (the CDN
setup) to take load off your institutions server (you can now use cloudfront 
off your own origin server as well). Depending on your bandwidth this might be 
worth the money regardless of LOCKSS participation (which can be more dark). 
Amazon also tends to be dropping prices over time vs raising but as any 
outsource you have to plan that it might not exist in the future. Also look 
more at Glacier prices in terms of checking your data for consistency. There 
have been a few papers on the costs of making sure Amazon really has the proper 
data depending on how often your requirements want you to check.

Another option if you are just looking for more geo placement is finding an 
institution or service provider that will colocate. There may be another small 
institution that would love to shove a cheap box with hard drives on your 
network in exchange for the same. Not as involved/formal as LOCKSS but gives 
you something you control to satisfy your requirements. It could also be as low 
tech as shipping SSDs to another institution who then runs some bagit checksums 
on the drive, etc.

All of the above should be scriptable in your workflow. Just need to decide 
what you really want out of it.

Eby


On Fri, Jan 11, 2013 at 11:52 AM, Aaron Trehub treh...@auburn.edu wrote:

 Hello Josh,

 Auburn University is a member of two Private LOCKSS Networks: the 
 MetaArchive Cooperative and the Alabama Digital Preservation Network 
 (ADPNet).  Here's a link to a recent conference paper that describes 
 both networks, including their current pricing structures:

 http://conference.ifla.org/past/ifla78/216-trehub-en.pdf

 LOCKSS has worked well for us so far, in part because supporting 
 community-based solutions is important to us.  As you point out, 
 however, Glacier is an attractive alternative, especially for 
 institutions that may be more interested in low-cost, low-throughput 
 storage and less concerned about entrusting their content to a 
 commercial outfit or having to pay extra to get it back out.  As with 
 most things, you pay your money--more or less, depending--and make your 
 choice.  And take your risks.

 Good luck with whatever solution(s) you decide on.  They need not be 
 mutually exclusive.

 Best,

 Aaron

 Aaron Trehub
 Assistant Dean for Technology and Technical Services Auburn University 
 Libraries
 231 Mell Street, RBD Library
 Auburn, AL 36849-5606
 Phone: (334) 844-1716
 Skype: ajtrehub
 E-mail: treh...@auburn.edu
 URL: http://lib.auburn.edu/




Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Francis Kayiwa
On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
 Hi all,
 
 Apparently code4lib 2013 is going to be held at the UIC Forum
 
   http://www.uic.edu/depts/uicforum/
 
 I assumed it would be at the conference hotel. This is just a note so
 that others do not make the same assumption, since nowhere in the
 information about the conference is the location made clear.
 
 Since the conference hotel is 1 mile from the venue, I assume
 transportation will be available.

That's a good assumption to make. As to the confusion, it is as I said to you
when you asked me about this a couple of days ago.

http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
proposal. If you look at the document it also suggests that we were
going to have the conference registration staggered by timezones. We
have elected not to update that because that was our proposal. When
preparing our proposal we borrowed heavily from Yale's and IU's proposal
and if someone would like to steal from us I think it is fair to leave
that as is.

If you want the conference page use the lanyrd.com link below. I can't
even take credit for doing that. All of that goes to @pberry

http://lanyrd.com/2013/c4l13/

Cheers,
./fxk



 
 best, Erik Hetzner

 Sent from my free software system http://fsf.org/.




-- 
Speed is subsittute fo accurancy.


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Thomas Kula
On Fri, Jan 11, 2013 at 07:45:21PM +, Joshua Welker wrote:
 Thanks for bringing up the issue of the cost of making sure the data is 
 consistent. We will be using DSpace for now, and I know DSpace has some 
 checksum functionality built in out-of-the-box. It shouldn't be too difficult 
 to write a script that loops through DSpace's checksum data and compares it 
 against the files in Glacier. Reading the Glacier FAQ on Amazon's site, it 
 looks like they provide an archive inventory (updated daily) that can be 
 downloaded as JSON. I read some users saying that this inventory includes 
 checksum data. So hopefully it will just be a matter of comparing the local 
 checksum to the Glacier checksum, and that would be easy enough to script.

An important question to ask here, though, is if that included checksum
data is the same that Amazon uses to perform the systematic data
integrity checks they mention in the Glacier FAQ, or if it's just
catalog data --- here's the checksum when we put it in. This is always
the question we run into when we consider services like this, can we
tease enough information out to convince ourselves that their checking
is sufficient. 

--
Thomas L. Kula | tlk2...@columbia.edu
Systems Engineer | Library Information Technology Office
The Libraries, Columbia University in the City of New York


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Tim Donohue

Hi Josh,

Now that you bring up DSpace as being part of the equation...

You might want to look at the newly released Replication Task Suite 
plugin/addon for DSpace (supports DSpace versions 1.8.x and 3.0):


https://wiki.duraspace.org/display/DSPACE/ReplicationTaskSuite

This DSpace plugin does essentially what you are talking about...

It allows you to backup (i.e. replicate) DSpace content files and 
metadata (in the form of a set of AIPs, Archival Information Packages) 
to a local filesystem/drive or to cloud storage.  Plus it provides an 
auditing tool to audit changes between DSpace and the cloud storage 
provider.  Currently, for the Replication Task Suite, that only cloud 
storage plugin we have created is for DuraCloud. But, it wouldn't be too 
hard to create a new plugin for Glacier (if you wanted to send DSpace 
content directly to Glacier without DuraCloud in between).


The code is in GitHub at:
https://github.com/DSpace/dspace-replicate

If you decide to use it and create anything cool, feel free to send us a 
pull request.


Good luck,

- Tim

--
Tim Donohue
Technical Lead for DSpace Project
DuraSpace.org

On 1/11/2013 1:45 PM, Joshua Welker wrote:

Thanks for bringing up the issue of the cost of making sure the data is 
consistent. We will be using DSpace for now, and I know DSpace has some 
checksum functionality built in out-of-the-box. It shouldn't be too difficult 
to write a script that loops through DSpace's checksum data and compares it 
against the files in Glacier. Reading the Glacier FAQ on Amazon's site, it 
looks like they provide an archive inventory (updated daily) that can be 
downloaded as JSON. I read some users saying that this inventory includes 
checksum data. So hopefully it will just be a matter of comparing the local 
checksum to the Glacier checksum, and that would be easy enough to script.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ryan Eby
Sent: Friday, January 11, 2013 11:37 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

As Aaron alludes to your decision should base off your real needs and they 
might not be exclusive.

LOCKSS/MetaArchive might be worth the money if it is the community archival 
aspect you are going for. Depending on your institution being a participant 
might make political/mission sense regardless of the storage needs and it could 
just be a specific collection that makes sense.

Glacier is a great choice if you are looking for spreading a backup across 
regions. S3 similarly if you also want to benefit from CloudFront (the CDN
setup) to take load off your institutions server (you can now use cloudfront 
off your own origin server as well). Depending on your bandwidth this might be 
worth the money regardless of LOCKSS participation (which can be more dark). 
Amazon also tends to be dropping prices over time vs raising but as any 
outsource you have to plan that it might not exist in the future. Also look 
more at Glacier prices in terms of checking your data for consistency. There 
have been a few papers on the costs of making sure Amazon really has the proper 
data depending on how often your requirements want you to check.

Another option if you are just looking for more geo placement is finding an 
institution or service provider that will colocate. There may be another small 
institution that would love to shove a cheap box with hard drives on your 
network in exchange for the same. Not as involved/formal as LOCKSS but gives 
you something you control to satisfy your requirements. It could also be as low 
tech as shipping SSDs to another institution who then runs some bagit checksums 
on the drive, etc.

All of the above should be scriptable in your workflow. Just need to decide 
what you really want out of it.

Eby


On Fri, Jan 11, 2013 at 11:52 AM, Aaron Trehub treh...@auburn.edu wrote:


Hello Josh,

Auburn University is a member of two Private LOCKSS Networks: the
MetaArchive Cooperative and the Alabama Digital Preservation Network
(ADPNet).  Here's a link to a recent conference paper that describes
both networks, including their current pricing structures:

http://conference.ifla.org/past/ifla78/216-trehub-en.pdf

LOCKSS has worked well for us so far, in part because supporting
community-based solutions is important to us.  As you point out,
however, Glacier is an attractive alternative, especially for
institutions that may be more interested in low-cost, low-throughput
storage and less concerned about entrusting their content to a
commercial outfit or having to pay extra to get it back out.  As with
most things, you pay your money--more or less, depending--and make your choice. 
 And take your risks.

Good luck with whatever solution(s) you decide on.  They need not be
mutually exclusive.

Best,

Aaron

Aaron Trehub
Assistant Dean for Technology and Technical Services Auburn University
Libraries
231 Mell Street, RBD Library

[CODE4LIB] Job Posting / Metadata Specialist / Washington, DC

2013-01-11 Thread Suzanne Richards
Apologies for the cross postings  . . . . . . .

LAC Group is seeking a Metadata Specialist to work on a long-term contract for 
a prestigious government agency located in Washington, DC.   This position 
includes reconciling existing schemas and vocabularies to create an enterprise 
schema and vocabulary using appropriate software tools.  A successful candidate 
will have experience in crisp execution of projects; understand the role of 
standards, structure, content analysis, context, user roles; and information 
lifecycle management. The candidate is expected to understand and help shape 
the organization's content strategy, including findability, discovery, 
usability, dissemination, etc., as well as basic information management 
principles.

Responsibilities:

* Develop schemas to standardize the business semantics of the agency;

* Assist in developing content strategies to facilitate interoperability, 
exchange, findability, discovery, usability, etc.;

* Evaluate software tools to manage the agency's business schema;

* Ability to manage projects crisply and produce deliverables on time and 
within budget.

Qualifications:

* Master's Degree in Library and Information Science, or equivalent work 
experience in business or non-profit sectors;

* Significant expertise in developing/managing business semantics, business 
vocabularies, content strategies, business process analysis, systems analysis, 
etc.;

* Experience in the areas of content modeling, content analysis, business 
process analysis, etc.;

* Understanding of emerging information services/technology trends a plus;

* Ability to balance business requests with users' growing needs in the 
development and growth of system taxonomies and metadata;

* Experience with controlled reference sources;

* Business analysis experience - attending requirement-gathering sessions with 
stakeholders, extracting information, and managing the requirements process;

* Builds and maintains strong relationships with team members to meet 
organizational goals, along with a strong sense of urgency and excellent 
organizational and time management skills;

* Excellent analytical and communication skills;

* Creative problem-solving abilities;

* Ability to work effectively in a multicultural, multi-project environment 
and to respond immediately to often-changing business priorities;

* Knowledge of a second language is a plus.

Immediate consideration, apply at http://goo.gl/Bd0YB
LAC Group is an Equal Opportunity/Affirmative Action employer and values 
diversity in the workforce.
LAC Group is a premier provider of recruiting and consultancy services for 
information professionals at U.S. and global organizations including Fortune 
100 companies, law firms, pharmaceutical companies, large academic institutions 
and prominent government agencies.


Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Joshua Welker
Awesome! Thanks. I will look into this for sure.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Tim 
Donohue
Sent: Friday, January 11, 2013 2:30 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

Hi Josh,

Now that you bring up DSpace as being part of the equation...

You might want to look at the newly released Replication Task Suite 
plugin/addon for DSpace (supports DSpace versions 1.8.x and 3.0):

https://wiki.duraspace.org/display/DSPACE/ReplicationTaskSuite

This DSpace plugin does essentially what you are talking about...

It allows you to backup (i.e. replicate) DSpace content files and metadata (in 
the form of a set of AIPs, Archival Information Packages) to a local 
filesystem/drive or to cloud storage.  Plus it provides an auditing tool to 
audit changes between DSpace and the cloud storage provider.  Currently, for 
the Replication Task Suite, the only cloud storage plugin we have created is 
for DuraCloud. But, it wouldn't be too hard to create a new plugin for Glacier 
(if you wanted to send DSpace content directly to Glacier without DuraCloud in 
between).

The code is in GitHub at:
https://github.com/DSpace/dspace-replicate

If you decide to use it and create anything cool, feel free to send us a pull 
request.

Good luck,

- Tim

--
Tim Donohue
Technical Lead for DSpace Project
DuraSpace.org

On 1/11/2013 1:45 PM, Joshua Welker wrote:
 Thanks for bringing up the issue of the cost of making sure the data is 
 consistent. We will be using DSpace for now, and I know DSpace has some 
 checksum functionality built in out-of-the-box. It shouldn't be too difficult 
 to write a script that loops through DSpace's checksum data and compares it 
 against the files in Glacier. Reading the Glacier FAQ on Amazon's site, it 
 looks like they provide an archive inventory (updated daily) that can be 
 downloaded as JSON. I read some users saying that this inventory includes 
 checksum data. So hopefully it will just be a matter of comparing the local 
 checksum to the Glacier checksum, and that would be easy enough to script.

 Josh Welker


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf 
 Of Ryan Eby
 Sent: Friday, January 11, 2013 11:37 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Digital collection backups

 As Aaron alludes to your decision should base off your real needs and they 
 might not be exclusive.

 LOCKSS/MetaArchive might be worth the money if it is the community archival 
 aspect you are going for. Depending on your institution being a participant 
 might make political/mission sense regardless of the storage needs and it 
 could just be a specific collection that makes sense.

 Glacier is a great choice if you are looking for spreading a backup 
 across regions. S3 similarly if you also want to benefit from 
 CloudFront (the CDN
 setup) to take load off your institutions server (you can now use cloudfront 
 off your own origin server as well). Depending on your bandwidth this might 
 be worth the money regardless of LOCKSS participation (which can be more 
 dark). Amazon also tends to be dropping prices over time vs raising but as 
 any outsource you have to plan that it might not exist in the future. Also 
 look more at Glacier prices in terms of checking your data for consistency. 
 There have been a few papers on the costs of making sure Amazon really has 
 the proper data depending on how often your requirements want you to check.

 Another option if you are just looking for more geo placement is finding an 
 institution or service provider that will colocate. There may be another 
 small institution that would love to shove a cheap box with hard drives on 
 your network in exchange for the same. Not as involved/formal as LOCKSS but 
 gives you something you control to satisfy your requirements. It could also 
 be as low tech as shipping SSDs to another institution who then runs some 
 bagit checksums on the drive, etc.

 All of the above should be scriptable in your workflow. Just need to decide 
 what you really want out of it.

 Eby


 On Fri, Jan 11, 2013 at 11:52 AM, Aaron Trehub treh...@auburn.edu wrote:

 Hello Josh,

 Auburn University is a member of two Private LOCKSS Networks: the 
 MetaArchive Cooperative and the Alabama Digital Preservation Network 
 (ADPNet).  Here's a link to a recent conference paper that describes 
 both networks, including their current pricing structures:

 http://conference.ifla.org/past/ifla78/216-trehub-en.pdf

 LOCKSS has worked well for us so far, in part because supporting 
 community-based solutions is important to us.  As you point out, 
 however, Glacier is an attractive alternative, especially for 
 institutions that may be more interested in low-cost, low-throughput 
 storage and less concerned about entrusting their content to a 
 commercial outfit or having to pay extra to get 

Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Patrick Berry
I'll take this opportunity to remind folks that if you spot anything amiss,
please let me know (or sign up and fix it!) and I will clean it up.  Thanks!


On Fri, Jan 11, 2013 at 11:51 AM, Francis Kayiwa kay...@uic.edu wrote:

 On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
  Hi all,
 
  Apparently code4lib 2013 is going to be held at the UIC Forum
 
http://www.uic.edu/depts/uicforum/
 
  I assumed it would be at the conference hotel. This is just a note so
  that others do not make the same assumption, since nowhere in the
  information about the conference is the location made clear.
 
  Since the conference hotel is 1 mile from the venue, I assume
  transportation will be available.

 That's a good assumption to make. As to the confusion  I said to you
 when you asked me about this a couple of days ago.

 http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
 proposal. If you look at the document it also suggests that we were
 going to have the conference registration staggered by timezones. We
 have elected not to update that because as that was our proposal. When
 preparing our proposal we borrowed heavily from Yale's and IU's proposal
 and if someone would like to steal from us I think it is fair to leave
 that as is.

 If you want the conference page use the lanyrd.com link below. I can't
 even take credit for doing that. All of that goes to @pberry

 http://lanyrd.com/2013/c4l13/

 Cheers,
 ./fxk



 
  best, Erik Hetzner

  Sent from my free software system http://fsf.org/.




 --
 Speed is subsittute fo accurancy.



Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread ddwiggins
Be careful about assuming too much on this.
 
When I started working with S3, the system required an MD5 sum to upload, and 
would respond to requests with this etag in the header as well. I therefor 
assumed that this was integral to the system, and was a good way to compare 
local files against the remote copies.
 
Then, maybe a year or two ago, Amazon introduced chunked uploads, so that you 
could send files in pieces and reassemble them once they got to S3. This was 
good, because it eliminated problems with huge files failing to upload due to 
network hiccups. I went ahead and implemented it in my scripts. Then, all of a 
sudden I started getting invalid checksums. Turns out that for multipart file 
uploads, they now create etag identifiers that are not the md5 sum of the 
underlying files. 
 
I now store the checksum as a separate piece of header metadata. And my sync 
script does periodically compare against this. But since this is just metadata, 
checking it doesn't really prove anything about the underlying file that Amazon 
has. To do this I would need to write a script that would actually retrieve the 
file and rerun the checksum. I have not done this yet, although it is on my 
to-do list at some point. This would ideally happen on an Amazon server so that 
I wouldn't have to send the file back and forth.
 
In any case, my main point is: don't assume that you can just check against a 
checksum from the API to verify a file for digital preservation purposes.
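
To make the pattern concrete, here is a rough sketch of storing your own MD5
as user-defined metadata at upload time and reading it back later, written
against boto3 purely for illustration; the bucket and key names are
placeholders. As noted above, comparing against that metadata only verifies
what you recorded, not the bytes Amazon is actually holding.

import hashlib

import boto3

s3 = boto3.client("s3")
BUCKET = "example-preservation-bucket"  # placeholder


def md5_of(path, chunk_size=1024 * 1024):
    """Stream the file and return its MD5 hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_with_checksum(path, key):
    # Record our own checksum as user metadata; the ETag S3 returns is not
    # a reliable MD5 once multipart uploads are involved.
    with open(path, "rb") as f:
        s3.put_object(Bucket=BUCKET, Key=key, Body=f,
                      Metadata={"md5": md5_of(path)})


def metadata_matches_local(path, key):
    # Compares the stored metadata value against a fresh local digest.
    head = s3.head_object(Bucket=BUCKET, Key=key)
    return head["Metadata"].get("md5") == md5_of(path)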
 
-David
 
 
 
 
 
__
 
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 994-5948
ddwigg...@historicnewengland.org
http://www.historicnewengland.org
 Joshua Welker jwel...@sbuniv.edu 1/11/2013 2:45 PM 
Thanks for bringing up the issue of the cost of making sure the data is 
consistent. We will be using DSpace for now, and I know DSpace has some 
checksum functionality built in out-of-the-box. It shouldn't be too difficult 
to write a script that loops through DSpace's checksum data and compares it 
against the files in Glacier. Reading the Glacier FAQ on Amazon's site, it 
looks like they provide an archive inventory (updated daily) that can be 
downloaded as JSON. I read some users saying that this inventory includes 
checksum data. So hopefully it will just be a matter of comparing the local 
checksum to the Glacier checksum, and that would be easy enough to script.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ryan Eby
Sent: Friday, January 11, 2013 11:37 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digital collection backups

As Aaron alludes to your decision should base off your real needs and they 
might not be exclusive.

LOCKSS/MetaArchive might be worth the money if it is the community archival 
aspect you are going for. Depending on your institution being a participant 
might make political/mission sense regardless of the storage needs and it could 
just be a specific collection that makes sense.

Glacier is a great choice if you are looking for spreading a backup across 
regions. S3 similarly if you also want to benefit from CloudFront (the CDN
setup) to take load off your institutions server (you can now use cloudfront 
off your own origin server as well). Depending on your bandwidth this might be 
worth the money regardless of LOCKSS participation (which can be more dark). 
Amazon also tends to be dropping prices over time vs raising but as any 
outsource you have to plan that it might not exist in the future. Also look 
more at Glacier prices in terms of checking your data for consistency. There 
have been a few papers on the costs of making sure Amazon really has the proper 
data depending on how often your requirements want you to check.

Another option if you are just looking for more geo placement is finding an 
institution or service provider that will colocate. There may be another small 
institution that would love to shove a cheap box with hard drives on your 
network in exchange for the same. Not as involved/formal as LOCKSS but gives 
you something you control to satisfy your requirements. It could also be as low 
tech as shipping SSDs to another institution who then runs some bagit checksums 
on the drive, etc.

All of the above should be scriptable in your workflow. Just need to decide 
what you really want out of it.

Eby


On Fri, Jan 11, 2013 at 11:52 AM, Aaron Trehub treh...@auburn.edu wrote:

 Hello Josh,

 Auburn University is a member of two Private LOCKSS Networks: the 
 MetaArchive Cooperative and the Alabama Digital Preservation Network 
 (ADPNet).  Here's a link to a recent conference paper that describes 
 both networks, including their current pricing structures:

 http://conference.ifla.org/past/ifla78/216-trehub-en.pdf

 LOCKSS has worked well for us so far, in part because supporting 
 community-based solutions is important to us.  As you point out, 
 

Re: [CODE4LIB] Digital collection backups

2013-01-11 Thread Randy Fischer
On Fri, Jan 11, 2013 at 2:45 PM, Joshua Welker jwel...@sbuniv.edu wrote:

 Reading the Glacier FAQ on Amazon's site, it looks like they provide an
 archive inventory (updated daily) that can be downloaded as JSON. I read
 some users saying that this inventory includes checksum data. So hopefully
 it will just be a matter of comparing the local checksum to the Glacier
 checksum, and that would be easy enough to script.



One could also occasionally spin up local EC2 instances to do the checksums
in the same data center, and ship just that metadata down - you would not
incur any bulk transfer costs in that case (if memory serves).   DAITSS
uses both md5 and sha1 checksums in combination, other preservation systems
might require similar.
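
Computing both digests in a single pass over each file, wherever the script
ends up running, is straightforward; a small sketch:

import hashlib
import sys


def md5_and_sha1(path, chunk_size=1024 * 1024):
    """Return (md5, sha1) hex digests, reading the file only once."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


if __name__ == "__main__":
    for filename in sys.argv[1:]:
        m, s = md5_and_sha1(filename)
        print("%s\t%s\t%s" % (filename, m, s))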

-Randy Fischer


Re: [CODE4LIB] XMP Metadata to tab-delemited file

2013-01-11 Thread Misty De Meo
Hi, Andrea,


XMP is natively an RDF-based format, so getting out XML isn't hard at all.
You have a couple of XML-based options with exiftool:

exiftool -X foo.jpg # prints the metadata in exiftool's own RDF/XML schema
to stdout
exiftool -tagsfromfile foo.jpg -o foo.xmp # writes the metadata in an XMP
XML file
exiftool -tagsfromfile foo.jpg -o -.xmp # writes the metadata in XMP XML
to stdout; only works in recentish versions of exiftool

exiftool also has a CSV output that might be helpful to you; check
`exiftool --help` for details on how that works.
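
If a bit of glue code is acceptable, one way to get to a tab-delimited file is
to ask exiftool for JSON output (-j) and reshape it; a rough Python sketch,
where the column list is just an example and would need to match whatever
fields your repository expects:

import csv
import glob
import json
import subprocess

FIELDS = ["SourceFile", "Title", "Description", "Creator", "Subject"]  # example columns

files = sorted(glob.glob("*.tif"))
# exiftool -j emits one JSON object per input file.
records = json.loads(subprocess.check_output(["exiftool", "-j"] + files))

with open("metadata.tab", "w", newline="") as out:
    writer = csv.writer(out, delimiter="\t")
    writer.writerow(FIELDS)
    for rec in records:
        row = []
        for field in FIELDS:
            value = rec.get(field, "")
            if isinstance(value, list):  # some XMP tags (e.g. Subject) come back as lists
                value = "; ".join(str(v) for v in value)
            row.append(value)
        writer.writerow(row)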

Misty

On 13-01-10 11:32 AM, Medina-Smith, Andrea
andrea.medina-sm...@nist.gov wrote:

I can get the data out, and I can even get a single file created w/ all
the metadata for all the images in the collection. It's just that it is
unstructured and not useful as such. Anything xml would also be useful,
but I haven't found a product that does that.

I was really trying not to just call you up ;)

-a 

__
_
Andrea Medina-Smith
Metadata Librarian
NIST Gaithersburg
andrea.medina-sm...@nist.gov
301-975-2592

Be Green! Think before you print this email.


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
ddwigg...@historicnewengland.org
Sent: Thursday, January 10, 2013 12:02 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] XMP Metadata to tab-delemited file

ResourceSpace does this internally to extract metadata. I think it's as
simple as 
 
exiftool -t -s imagefile.tif > metadatafile.tab
 
Does this do what you want?
 
-DD

 
__
 
David Dwiggins
Systems Librarian/Archivist, Historic New England
141 Cambridge Street, Boston, MA 02114
(617) 994-5948
ddwigg...@historicnewengland.org
http://www.historicnewengland.org
 Medina-Smith, Andrea andrea.medina-sm...@nist.gov 1/10/2013 10:57
AM 
Hello,

I need to take xmp metadata that is imbedded in tif images and pull it
out into a tab delimited text file for ingest into our digital repository
(CONTENTdm). Has anyone done this using exiftool or the like?

Thanks,
A

__
_
Andrea Medina-Smith
Metadata Librarian
NIST Gaithersburg
andrea.medina-sm...@nist.gov
301-975-2592

Be Green! Think before you print this email.


Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Cynthia Ng
I'm sorry, but that doesn't actually clear up anything for me. The
location on the lanyrd page just says Chicago. So, is the conference
still happening at UIC? Since the conference hotel isn't super close,
does that mean there will be transportation provided?

While we're on the subject, are the pre-conferences happening at the
same location?

On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu wrote:
 On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
 Hi all,

 Apparently code4lib 2013 is going to be held at the UIC Forum

   http://www.uic.edu/depts/uicforum/

 I assumed it would be at the conference hotel. This is just a note so
 that others do not make the same assumption, since nowhere in the
 information about the conference is the location made clear.

 Since the conference hotel is 1 mile from the venue, I assume
 transportation will be available.

 That's a good assumption to make. As to the confusion  I said to you
 when you asked me about this a couple of days ago.

 http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
 proposal. If you look at the document it also suggests that we were
 going to have the conference registration staggered by timezones. We
 have elected not to update that because as that was our proposal. When
 preparing our proposal we borrowed heavily from Yale's and IU's proposal
 and if someone would like to steal from us I think it is fair to leave
 that as is.

 If you want the conference page use the lanyrd.com link below. I can't
 even take credit for doing that. All of that goes to @pberry

 http://lanyrd.com/2013/c4l13/

 Cheers,
 ./fxk




 best, Erik Hetzner

 Sent from my free software system http://fsf.org/.




 --
 Speed is subsittute fo accurancy.


Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Patrick Berry
So, the location field on Lanyrd is not super specific.  It likes big
things, like cities.  On the reg page for code4lib the UIC Forum is listed
as the location though.

http://www.regonline.com/builder/site/Default.aspx?EventID=1167723

I'll see if I can put that somewhere in the event on Lanyrd.

Cheers,
Pat


On Fri, Jan 11, 2013 at 3:41 PM, Cynthia Ng cynthia.s...@gmail.com wrote:

 I'm sorry, but that doesn't actually clear up anything for me. The
 location on the layrd page just says Chicago. So, is the conference
 still happening at UIC? Since the conference hotel isn't super close,
 does that mean there will be transportation provided?

 While we're on the subject, are the pre-conferences happening at the
 same location?

 On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu wrote:
  On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
  Hi all,
 
  Apparently code4lib 2013 is going to be held at the UIC Forum
 
http://www.uic.edu/depts/uicforum/
 
  I assumed it would be at the conference hotel. This is just a note so
  that others do not make the same assumption, since nowhere in the
  information about the conference is the location made clear.
 
  Since the conference hotel is 1 mile from the venue, I assume
  transportation will be available.
 
  That's a good assumption to make. As to the confusion  I said to you
  when you asked me about this a couple of days ago.
 
  http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
  proposal. If you look at the document it also suggests that we were
  going to have the conference registration staggered by timezones. We
  have elected not to update that because as that was our proposal. When
  preparing our proposal we borrowed heavily from Yale's and IU's proposal
  and if someone would like to steal from us I think it is fair to leave
  that as is.
 
  If you want the conference page use the lanyrd.com link below. I can't
  even take credit for doing that. All of that goes to @pberry
 
  http://lanyrd.com/2013/c4l13/
 
  Cheers,
  ./fxk
 
 
 
 
  best, Erik Hetzner
 
  Sent from my free software system http://fsf.org/.
 
 
 
 
  --
  Speed is subsittute fo accurancy.



[CODE4LIB] Job: Tools Lab Operations Engineer (Contractor) at Wikimedia Foundation

2013-01-11 Thread jobs
**Background Information**  
  
The Wikimedia Foundation, Inc. is a nonprofit charitable organization
dedicated to the growth, development and distribution of free, multilingual
content, and to providing the full content of these wiki-based projects to the
public free of charge. The Wikimedia Foundation operates some of the largest
collaboratively edited reference projects in the world, including Wikipedia, a
top-ten internet property.

  
  
**Statement of Purpose**  
  
The Technical Operations team of the Wikimedia Foundation is embarking on a
new project to build a flexible and scalable lab infrastructure for our
community and volunteers, to support their effort to prototype, develop, test
and deploy their tools and extensions. Some of the uses of the Wikimedia Labs
infrastructure are for:

  

  * Deployment of volunteer-created tools which are independent of MediaWiki, 
e.g. edit counters, mentoring database, geographic information about articles 
etc. (essentially the kind of things currently running on the Toolserver)
  * Prototyping and staging of WMF-developed MediaWiki code
  * Prototyping and staging of volunteer or chapter-developed MediaWiki code
  * Development and deployment of new site architecture by staff and volunteers 
in a code-reviewed, devops-oriented environment
  * Access for researchers (WMF or external) to live database replication or 
large datasets, as well as computing resources, for the purpose of running 
analyses
  * Serving as an execution and hosting space for bots, so that bots can be 
more systematically developed and tracked
Two full-time Wikimedia Foundation operations engineers are currently building
Wikimedia Labs.

  
  
**Scope of Work**  
  
Wikimedia is looking for a contractor whose primary focus will be to assist
the community developers to migrate their tools to this new Labs
infrastructure, especially those residing in Toolserver today. In addition,
this person will:

  * Support enhancement and perform operational duties of the Labs 
Virtualization Project using OpenStack and LAMP-stack technology. Duties 
include developing, deploying and supporting tools to provision and manage 
large networks of virtual machines, creating a redundant and scalable cloud 
computing platform
  * Set up monitoring systems
  * Provide system and database administration duties for the Labs environment
  
**Outcome and Performance Standards**  
  
You are expected to work about 40 hours a week, on average. During these
(flexible) hours you are required to be available online for collaboration
with the (international) Foundation team. Outside these hours, you may
incidentally be contacted for emergencies (e.g. during system outages). You
will report to the Director of Operations, and will work closely with
Operations staff, the Engineering Community Team, and the Toolserver
community. Besides maintaining regular communication with your point of
contact, you may need to participate in bi-weekly online Operations meetings
with the rest of the team. There will be milestone check-ins with the
Foundation to discuss progress and activities. You must be willing to travel
occasionally for international meetings, as well as to perform your duties.

  
  
**Term of Contract**  
  
Your initial contract will be for a duration of 6 months, and will commence as
soon as possible. Renegotiation at the termination of the contract is
optional.

  
  
**Payments, Incentives, and Penalties**  
  
Rate will be determined by level of experience and expertise.

  
  
**Contractual Terms and Conditions and Required Qualifications**  
  
Respondent parties are expected to:

  

  * Have 5+ years of hands-on and strong knowledge of LAMP-stack system 
administration
  * Be competent in programming and scripting languages like PHP, Python and 
bash
  * Be able to work independently where needed, and work remotely as part of a 
globally distributed team
  * Be comfortable in a highly collaborative, consensus-oriented environment
  * Be a proficient speaker in the English language
Furthermore:

  

  * Prior work experience in creating provisioning tools is a plus
  * Prior work experience integrating different types of services together, 
e.g., LDAP, Puppet and MediaWiki is a plus
  * Experience with virtualization technologies such as OpenStack or Ganeti is 
a plus
  * Experience with clustered filesystems such as GlusterFS or Swift is a plus
  * Experience with high-traffic web site operations is a plus
  * Experience with MySQL database administration is a plus
  * Experience with the Solaris UNIX operating system and Sun Grid Engine is a 
plus
  * Understanding of the free culture movement, especially Wikimedia, is a plus
The ideal candidate will be creative, highly motivated, and able to operate
effectively in multiple cultural contexts.

  
Candidates do not have to live in the San Francisco Bay Area or the USA;
remote candidates are welcome.



Brought to you by code4lib jobs: 

Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Francis Kayiwa
On Fri, Jan 11, 2013 at 06:41:26PM -0500, Cynthia Ng wrote:
 I'm sorry, but that doesn't actually clear up anything for me. The
 location on the layrd page just says Chicago. So, is the conference
 still happening at UIC? Since the conference hotel isn't super close,
 does that mean there will be transportation provided?

The entire conference and pre-conference are at UIC. The Forum is a
revenue-generating part of UIC. The pre-conferences will be at the
University Libraries on Monday, with the exception of the Drupal one.

The hotel is a mile or thereabouts from the UIC Forum. Here is the problem
with us natives doing the planning: it never crossed our minds that walking
a mile, while at the *upper limit* of our own shuttling to and from work, is
not the norm for everyone. This was brought to our attention, and we will
have a shuttle from the hotel to the conference venue.

 
 While we're on the subject, are the pre-conferences happening at the
 same location?


See above.

./fxk

 
 On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu wrote:
  On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
  Hi all,
 
  Apparently code4lib 2013 is going to be held at the UIC Forum
 
http://www.uic.edu/depts/uicforum/
 
  I assumed it would be at the conference hotel. This is just a note so
  that others do not make the same assumption, since nowhere in the
  information about the conference is the location made clear.
 
  Since the conference hotel is 1 mile from the venue, I assume
  transportation will be available.
 
  That's a good assumption to make. As to the confusion  I said to you
  when you asked me about this a couple of days ago.
 
  http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
  proposal. If you look at the document it also suggests that we were
  going to have the conference registration staggered by timezones. We
  have elected not to update that because as that was our proposal. When
  preparing our proposal we borrowed heavily from Yale's and IU's proposal
  and if someone would like to steal from us I think it is fair to leave
  that as is.
 
  If you want the conference page use the lanyrd.com link below. I can't
  even take credit for doing that. All of that goes to @pberry
 
  http://lanyrd.com/2013/c4l13/
 
  Cheers,
  ./fxk
 
 
 
 
  best, Erik Hetzner
 
  Sent from my free software system http://fsf.org/.
 
 
 
 
  --
  Speed is subsittute fo accurancy.
 

-- 
Speed is subsittute fo accurancy.


Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Bill Dueber
Because it seems like it might be useful, I've started a publicly-editable
google map at

http://goo.gl/maps/LWqay

Right now, it has two points: the hotel and the conference location. Please
add stuff as appropriate if the urge strikes you.




On Fri, Jan 11, 2013 at 7:54 PM, Francis Kayiwa kay...@uic.edu wrote:

 On Fri, Jan 11, 2013 at 06:41:26PM -0500, Cynthia Ng wrote:
  I'm sorry, but that doesn't actually clear up anything for me. The
  location on the layrd page just says Chicago. So, is the conference
  still happening at UIC? Since the conference hotel isn't super close,
  does that mean there will be transportation provided?

 The entire conference and pre-conference is at UIC. The Forum is a
 revenue generating part of UIC. The pre-conference will be at the
 University Libraries on Monday with the exception of the Drupal one.

 The hotel is a mile or thereabouts from UIC Forum. Here is the problem
 with us natives planning. It never crossed our minds that walking a mile
 while on the *upper limit* of our shuttling to and from work is not the
 norm for everyone. This was brought to our attention and we will have a
 shuttle from the Hotel to the Conference venue.

 
  While we're on the subject, are the pre-conferences happening at the
  same location?


 See above.

 ./fxk

 
  On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu wrote:
   On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
   Hi all,
  
   Apparently code4lib 2013 is going to be held at the UIC Forum
  
 http://www.uic.edu/depts/uicforum/
  
   I assumed it would be at the conference hotel. This is just a note so
   that others do not make the same assumption, since nowhere in the
   information about the conference is the location made clear.
  
   Since the conference hotel is 1 mile from the venue, I assume
   transportation will be available.
  
   That's a good assumption to make. As to the confusion  I said to you
   when you asked me about this a couple of days ago.
  
   http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
   proposal. If you look at the document it also suggests that we were
   going to have the conference registration staggered by timezones. We
   have elected not to update that because as that was our proposal. When
   preparing our proposal we borrowed heavily from Yale's and IU's
 proposal
   and if someone would like to steal from us I think it is fair to leave
   that as is.
  
   If you want the conference page use the lanyrd.com link below. I can't
   even take credit for doing that. All of that goes to @pberry
  
   http://lanyrd.com/2013/c4l13/
  
   Cheers,
   ./fxk
  
  
  
  
   best, Erik Hetzner
  
   Sent from my free software system http://fsf.org/.
  
  
  
  
   --
   Speed is subsittute fo accurancy.
 

 --
 Speed is subsittute fo accurancy.




-- 
Bill Dueber
Library Systems Programmer
University of Michigan Library


Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Wilhelmina Randtke
It takes about 15 minutes to walk a mile.  It's really not that far for
people without health problems that affect mobility.  In most cases,
driving, then parking will take more time than walking to cover such a
short distance.  Just saying...

-Wilhelmina Randtke

On Fri, Jan 11, 2013 at 7:12 PM, Bill Dueber b...@dueber.com wrote:

 Because it seems like it might be useful, I've started a publicly-editable
 google map at

 http://goo.gl/maps/LWqay

 Right now, it has two points: the hotel and the conference location. Please
 add stuff as appropriate if the urge strikes you.




 On Fri, Jan 11, 2013 at 7:54 PM, Francis Kayiwa kay...@uic.edu wrote:

  On Fri, Jan 11, 2013 at 06:41:26PM -0500, Cynthia Ng wrote:
   I'm sorry, but that doesn't actually clear up anything for me. The
   location on the layrd page just says Chicago. So, is the conference
   still happening at UIC? Since the conference hotel isn't super close,
   does that mean there will be transportation provided?
 
  The entire conference and pre-conference is at UIC. The Forum is a
  revenue generating part of UIC. The pre-conference will be at the
  University Libraries on Monday with the exception of the Drupal one.
 
  The hotel is a mile or thereabouts from UIC Forum. Here is the problem
  with us natives planning. It never crossed our minds that walking a mile
  while on the *upper limit* of our shuttling to and from work is not the
  norm for everyone. This was brought to our attention and we will have a
  shuttle from the Hotel to the Conference venue.
 
  
   While we're on the subject, are the pre-conferences happening at the
   same location?
 
 
  See above.
 
  ./fxk
 
  
   On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu
 wrote:
On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
Hi all,
   
Apparently code4lib 2013 is going to be held at the UIC Forum
   
  http://www.uic.edu/depts/uicforum/
   
I assumed it would be at the conference hotel. This is just a note
 so
that others do not make the same assumption, since nowhere in the
information about the conference is the location made clear.
   
Since the conference hotel is 1 mile from the venue, I assume
transportation will be available.
   
That's a good assumption to make. As to the confusion  I said to you
when you asked me about this a couple of days ago.
   
http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
proposal. If you look at the document it also suggests that we were
going to have the conference registration staggered by timezones. We
have elected not to update that because as that was our proposal.
 When
preparing our proposal we borrowed heavily from Yale's and IU's
  proposal
and if someone would like to steal from us I think it is fair to
 leave
that as is.
   
If you want the conference page use the lanyrd.com link below. I
 can't
even take credit for doing that. All of that goes to @pberry
   
http://lanyrd.com/2013/c4l13/
   
Cheers,
./fxk
   
   
   
   
best, Erik Hetzner
   
Sent from my free software system http://fsf.org/.
   
   
   
   
--
Speed is subsittute fo accurancy.
  
 
  --
  Speed is subsittute fo accurancy.
 



 --
 Bill Dueber
 Library Systems Programmer
 University of Michigan Library



Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Cary Gordon
FWIW, the # 8 bus runs every 10 min.

Cary

On Fri, Jan 11, 2013 at 5:12 PM, Bill Dueber b...@dueber.com wrote:
 Because it seems like it might be useful, I've started a publicly-editable
 google map at

 http://goo.gl/maps/LWqay

 Right now, it has two points: the hotel and the conference location. Please
 add stuff as appropriate if the urge strikes you.




 On Fri, Jan 11, 2013 at 7:54 PM, Francis Kayiwa kay...@uic.edu wrote:

 On Fri, Jan 11, 2013 at 06:41:26PM -0500, Cynthia Ng wrote:
  I'm sorry, but that doesn't actually clear up anything for me. The
  location on the layrd page just says Chicago. So, is the conference
  still happening at UIC? Since the conference hotel isn't super close,
  does that mean there will be transportation provided?

 The entire conference and pre-conference is at UIC. The Forum is a
 revenue generating part of UIC. The pre-conference will be at the
 University Libraries on Monday with the exception of the Drupal one.

 The hotel is a mile or thereabouts from UIC Forum. Here is the problem
 with us natives planning. It never crossed our minds that walking a mile
 while on the *upper limit* of our shuttling to and from work is not the
 norm for everyone. This was brought to our attention and we will have a
 shuttle from the Hotel to the Conference venue.

 
  While we're on the subject, are the pre-conferences happening at the
  same location?


 See above.

 ./fxk

 
  On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu wrote:
   On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
   Hi all,
  
   Apparently code4lib 2013 is going to be held at the UIC Forum
  
 http://www.uic.edu/depts/uicforum/
  
   I assumed it would be at the conference hotel. This is just a note so
   that others do not make the same assumption, since nowhere in the
   information about the conference is the location made clear.
  
   Since the conference hotel is 1 mile from the venue, I assume
   transportation will be available.
  
   That's a good assumption to make. As to the confusion  I said to you
   when you asked me about this a couple of days ago.
  
   http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
   proposal. If you look at the document it also suggests that we were
   going to have the conference registration staggered by timezones. We
   have elected not to update that because as that was our proposal. When
   preparing our proposal we borrowed heavily from Yale's and IU's
 proposal
   and if someone would like to steal from us I think it is fair to leave
   that as is.
  
   If you want the conference page use the lanyrd.com link below. I can't
   even take credit for doing that. All of that goes to @pberry
  
   http://lanyrd.com/2013/c4l13/
  
   Cheers,
   ./fxk
  
  
  
  
   best, Erik Hetzner
  
   Sent from my free software system http://fsf.org/.
  
  
  
  
   --
   Speed is subsittute fo accurancy.
 

 --
 Speed is subsittute fo accurancy.




 --
 Bill Dueber
 Library Systems Programmer
 University of Michigan Library



-- 
Cary Gordon
The Cherry Hill Company
http://chillco.com


Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Francis Kayiwa
On Fri, Jan 11, 2013 at 05:51:17PM -0800, Cary Gordon wrote:
 FWIW, the # 8 bus runs every 10 min.

Good point. It may be worth your while getting the 3 day pass for $US 14

http://www.transitchicago.com/travel_information/fares/unlimitedridecards.aspx

Not for traveling to the conference but any other travel that you may
want to do while in town.

./fxk

 

-- 
Speed is subsittute fo accurancy.


Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Jon Gorman
Gah, I think I forgot to announce this on the list, but there's also
this google map:
https://maps.google.com/maps/ms?msid=213549257652679418473.0004ce6c25e6cdeb0319d&msa=0

which I put on the social page
http://wiki.code4lib.org/index.php/2013_social_activities

I'll go ahead and add the hotel and conference site to that as well if
it's not already there.

On Fri, Jan 11, 2013 at 7:12 PM, Bill Dueber b...@dueber.com wrote:
 Because it seems like it might be useful, I've started a publicly-editable
 google map at

 http://goo.gl/maps/LWqay

 Right now, it has two points: the hotel and the conference location. Please
 add stuff as appropriate if the urge strikes you.




 On Fri, Jan 11, 2013 at 7:54 PM, Francis Kayiwa kay...@uic.edu wrote:

 On Fri, Jan 11, 2013 at 06:41:26PM -0500, Cynthia Ng wrote:
  I'm sorry, but that doesn't actually clear up anything for me. The
  location on the Lanyrd page just says Chicago. So, is the conference
  still happening at UIC? Since the conference hotel isn't super close,
  does that mean there will be transportation provided?

 The entire conference and pre-conference are at UIC. The Forum is a
 revenue-generating part of UIC. The pre-conference will be at the
 University Libraries on Monday, with the exception of the Drupal one.

 The hotel is a mile or thereabouts from the UIC Forum. Here is the
 problem with us natives doing the planning: it never crossed our minds
 that walking a mile, though at the *upper limit* of our own commute to
 and from work, is not the norm for everyone. This was brought to our
 attention, and we will have a shuttle from the hotel to the conference
 venue.

 
  While we're on the subject, are the pre-conferences happening at the
  same location?


 See above.

 ./fxk

 
  On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu wrote:
   On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
   Hi all,
  
   Apparently code4lib 2013 is going to be held at the UIC Forum
  
 http://www.uic.edu/depts/uicforum/
  
   I assumed it would be at the conference hotel. This is just a note so
   that others do not make the same assumption, since nowhere in the
   information about the conference is the location made clear.
  
   Since the conference hotel is 1 mile from the venue, I assume
   transportation will be available.
  
   That's a good assumption to make. As to the confusion, it's what I said
   to you when you asked me about this a couple of days ago.
  
   http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
   proposal. If you look at the document it also suggests that we were
   going to have the conference registration staggered by timezones. We
   have elected not to update that because that was our proposal. When
   preparing our proposal we borrowed heavily from Yale's and IU's proposals,
   and if someone would like to steal from us I think it is fair to leave
   that as is.
  
   If you want the conference page use the lanyrd.com link below. I can't
   even take credit for doing that. All of that goes to @pberry
  
   http://lanyrd.com/2013/c4l13/
  
   Cheers,
   ./fxk
  
  
  
  
   best, Erik Hetzner
  
   Sent from my free software system http://fsf.org/.
  
  
  
  
   --
   Speed is subsittute fo accurancy.
 

 --
 Speed is subsittute fo accurancy.




 --
 Bill Dueber
 Library Systems Programmer
 University of Michigan Library