Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-09-21 Thread emijrp
Hi all;

Just like the scripts to preserve wikis[1], I'm working on a new script to
download all Wikimedia Commons images packed by day. But I have limited
spare time. It is sad that volunteers have to do this without any help from
the Wikimedia Foundation.

I also started an effort on Meta (with low activity so far) to mirror the
XML dumps.[2] If you know of universities or research groups that work with
Wiki[pm]edia XML dumps, they would be good candidates to mirror them.

If you want to download the texts to your PC, you only need 100 GB of free
space and this Python script.[3]

I heard that the Internet Archive saves XML dumps quarterly or so, but there
has been no official announcement. I also heard that the Library of Congress
wanted to mirror the dumps, but there has been no news for a long time.

L'Encyclopédie has an uptime[4] of 260 years[5] and growing. Will
Wiki[pm]edia projects reach that?

Regards,
emijrp

[1] http://code.google.com/p/wikiteam/
[2] http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps
[3] http://code.google.com/p/wikiteam/source/browse/trunk/wikipediadownloader.py
[4] http://en.wikipedia.org/wiki/Uptime
[5] http://en.wikipedia.org/wiki/Encyclop%C3%A9die


2011/6/2 Fae fae...@gmail.com

 Hi,

 I'm taking part in an images discussion workshop with a number of
 academics tomorrow and could do with a statement about the WMF's long
 term commitment to supporting Wikimedia Commons (and other projects)
 in terms of the public availability of media. Is there an official
 published policy I can point to that includes, say, a 10 year or 100
 year commitment?

 If it exists, this would be a key factor for researchers choosing
 where to share their images with the public.

 Thanks,
 Fae
 --
 http://enwp.org/user_talk:fae
 Guide to email tags: http://j.mp/faetags


___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-09-20 Thread Kim Bruning
On Thu, Jun 02, 2011 at 02:24:37PM +0200, Ziko van Dijk wrote:
 Hello Fae,
 
 There should be no explicit statement because the WMF holds it
 self-evident to preserve. 

That reminds me of something O:-)

Perhaps something like this?

We, the Wikimedia movement, hold these truths to be self-evident:
* That neutrality is the path to knowledge
* That all knowledge should be available to all people, whenever and
wherever they want it, and be free to study, free to share, free to improve
* And that this state of affairs should hold in perpetuity, so that our
children, and their children, and so on, can benefit.

Might need some editing to make it perfect. Meta someplace?

sincerely,
Kim Bruning

-- 



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-03 Thread Ryan Kaldari
We already have a policy covering data preservation and recovery under 
any foreseeable disaster scenarios:
http://en.wikipedia.org/wiki/WP:TERMINAL

;)

Ryan Kaldari

On 6/2/11 4:44 PM, Mark Wagner wrote:
 On Thu, Jun 2, 2011 at 16:11, Neil Harris n...@tonal.clara.co.uk wrote:
 Tape is -- still -- your friend here. Flip the write-protect after
 writing, have two sets of off-site tapes, one copy of each in each of
 two secure and widely separated off-site locations run by two different
 organizations, and you're sorted.
 The mechanics of the backup are largely irrelevant.  What matters are
 the *policies*: what data do you back up, when do you back it up, how
 often do you test your backups, and so on.  Once you've got that
 sorted out, it doesn't really matter whether you're storing the
 backups on tape, remote servers, or magic pixie dust.
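
The policy questions quoted above (what data is backed up, how often, and
how often the backups are tested) can be written down as a small check. This
is only a minimal sketch in Python; the class, field, and function names are
hypothetical and do not describe any actual WMF system.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class BackupPolicy:
    """Hypothetical policy record; names and fields are illustrative only."""
    dataset: str                      # what data is backed up
    backup_interval: timedelta        # how often a backup must be taken
    restore_test_interval: timedelta  # how often a restore must be tested


def overdue_checks(policy, last_backup, last_restore_test, today):
    """Return a list of human-readable policy violations as of `today`."""
    problems = []
    if today - last_backup > policy.backup_interval:
        problems.append(f"{policy.dataset}: backup overdue")
    if today - last_restore_test > policy.restore_test_interval:
        problems.append(f"{policy.dataset}: restore test overdue")
    return problems
```

The storage medium does not appear anywhere in the check, which is the point
being made: the policy can be stated and audited independently of whether
the copies live on tape or on disk.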




Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-03 Thread George Herbert
On Thu, Jun 2, 2011 at 5:17 PM, Neil Harris n...@tonal.clara.co.uk wrote:
 On 03/06/11 00:44, Mark Wagner wrote:
 On Thu, Jun 2, 2011 at 16:11, Neil Harris n...@tonal.clara.co.uk wrote:
 Tape is -- still -- your friend here. Flip the write-protect after
 writing, have two sets of off-site tapes, one copy of each in each of
 two secure and widely separated off-site locations run by two different
 organizations, and you're sorted.
 The mechanics of the backup are largely irrelevant.  What matters are
 the *policies*: what data do you back up, when do you back it up, how
 often do you test your backups, and so on.  Once you've got that
 sorted out, it doesn't really matter whether you're storing the
 backups on tape, remote servers, or magic pixie dust.


 Not quite.

 You're right about procedures, but you can't begin defining procedures
 until you have something concrete to aim at.

 Tape is the One True Way for large scale backup, even today (ask
 Google), and I thought it might be useful to give an illustration of
 just how cheap it would be to use. Tape is a great simplifier, and
 eliminates a lot of the fanciness and feature-bloat associated with more
 sophisticated systems -- more sophisticated is not necessarily better.

I have done large enterprise scale backup (not Google-scale, but there
really isn't anyone else at Google's scale...) entirely without tape,
just using nearline disk.  These days it's in fact not unreasonable to
do it that way.  Moving the backups offsite over the network is pretty
much equivalent to physically moving tapes here.

That is neither here nor there to the policy question, however.

As a technical domain expert, I wish I knew more about the WMF
operations staff's detailed implementation and plans in this area; but
the staff are competent folks and I don't know of any actual gaps from
reasonable industry practice.

If the community is sufficiently concerned that there may be a gap,
then the board should perhaps either request staff to be more open, or
get an independent consultant in to review if operational details are
thought to be sensitive.


-- 
-george william herbert
george.herb...@gmail.com



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-03 Thread Fae
Hi,

As a follow-up to my original question, my brief presentation today at
the all day Wellcome Trust research images workshop went down well and
everyone was happy to see "in perpetuity" as a commitment. Thanks for
the comments made in this thread, they did influence the nature of my
discussions.

In practice if WM-UK are able to partner with either the Wellcome
Library (I'm hopeful after my behind the scenes chat with their Head
of Publishing) or some of the sponsored research projects with large
image assets, the WMF may find it usefully pre-emptive to go the next
step and be able to show how this is planned for in operational practice
if it has yet to be detailed (i.e. not just a backup procedure but
strategically planning for perpetuity).

Cheers,
Fae
--
http://enwp.org/user_talk:fae
Guide to email tags: http://j.mp/faetags



[Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Fae
Hi,

I'm taking part in an images discussion workshop with a number of
academics tomorrow and could do with a statement about the WMF's long
term commitment to supporting Wikimedia Commons (and other projects)
in terms of the public availability of media. Is there an official
published policy I can point to that includes, say, a 10 year or 100
year commitment?

If it exists, this would be a key factor for researchers choosing
where to share their images with the public.

Thanks,
Fae
--
http://enwp.org/user_talk:fae
Guide to email tags: http://j.mp/faetags



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Gerard Meijssen
Hoi,
It is the explicit goal of the Wikimedia Foundation to make information
available for as long as it exists. In addition to this, there are several
copies at the Internet Archive.

If there is no statement that satisfies your need, it will not be hard for
the WMF board to come up with one. Having such a statement by tomorrow is a
bit much to ask for.
Thanks,
  GerardM



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Ziko van Dijk
Hello Fae,

There should be no explicit statement because the WMF holds it
self-evident to preserve. The bigger problem might be the project
scope. I don't know what kind of images your academic partners wish
to upload.

Kind regards
Ziko



-- 
Ziko van Dijk
The Netherlands
http://zikoblog.wordpress.com/



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread David Gerard
On 2 June 2011 13:24, Ziko van Dijk zvand...@googlemail.com wrote:

 There should be no explicit statement because the WMF holds it
 self-evident to preserve. The bigger problem might be the project
 scope. I don't know what kind of images your academic partners wish
 to upload.


There's also the matter that no particular image can be promised to be
safe from local deletion processes.

(except, e.g., WMF logos etc. But even then there's a perennial policy
suggestion to move those to a non-free repo not on Commons instead.)


- d.



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Federico Leva (Nemo)
Fae, 02/06/2011 13:29:
 I'm taking part in an images discussion workshop with a number of
 academics tomorrow and could do with a statement about the WMF's long
 term commitment to supporting Wikimedia Commons (and other projects)
 in terms of the public availability of media. Is there an official
 published policy I can point to that includes, say, a 10 year or 100
 year commitment?

The only thing I can remember is 
http://en.wikipedia.org/wiki/Wikipedia:Ten_things_you_may_not_know_about_Wikipedia#You_can.27t_actually_change_anything_in_Wikipedia.E2.80.A6

Nemo



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Fae
Briefly responding to a couple of points raised so far:

Yes, there is a need for a policy as otherwise the WMF would have no
long term operational archive plan. "Self-evident" is not enough to
budget and plan in a credible way. If as the planned outcome
of a research project I had a large image donation to make and such a
commitment was absent, I would prefer to mass donate images of public
interest to an organization that had one, and assume that at some
point e-volunteers at Wikimedia Commons would take the initiative and
port in what they fancied.

The people I'm workshopping with tomorrow have research roles within a
number of leading universities along with a number of research
organizations under the umbrella of the Wellcome Trust (the largest
charity in the UK) and a variety of semi-associated organizations such
as Cancer Research UK, Open Research Computation, Bioinformatics
Training Network and FlyBase. All these folks have large image assets
to discuss and are keen to move forward with an open solution to
recommend on their personal networks for the long, long term public
good.

I appreciate the image deletion issue, but what we are talking about here
are planned batch uploads of high quality donations. Part of that
planning would be to discuss the relevance to the public of large
numbers of research images and compliance with existing Commons
guidelines. There may well be cases, for example many thousands of
similar images of mutant drosophila, where Wikimedia Commons is not
the right place for a full donation and a more specialized database
host is needed.

Cheers,
Fae
--
http://enwp.org/user_talk:fae
Guide to email tags: http://j.mp/faetags



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Fred Bauder
 Briefly responding to a couple of points raised so far:

 Yes, there is a need for a policy as otherwise the WMF would have no
 long term operational archive plan. "Self-evident" is not enough to
 budget and plan in a credible way. If as the planned outcome
 of a research project I had a large image donation to make and such a
 commitment was absent, I would prefer to mass donate images of public
 interest to an organization that had one, and assume that at some
 point e-volunteers at Wikimedia Commons would take the initiative and
 port in what they fancied.

 The people I'm workshopping with tomorrow have research roles within a
 number of leading universities along with a number of research
 organizations under the umbrella of the Wellcome Trust (the largest
 charity in the UK) and a variety of semi-associated organizations such
 as Cancer Research UK, Open Research Computation, Bioinformatics
 Training Network and FlyBase. All these folks have large image assets
 to discuss and are keen to move forward with an open solution to
 recommend on their personal networks for the long, long term public
 good.

 I appreciate the image deletion issue, but what we are talking about here
 are planned batch uploads of high quality donations. Part of that
 planning would be to discuss the relevance to the public of large
 numbers of research images and compliance with existing Commons
 guidelines. There may well be cases, for example many thousands of
 similar images of mutant drosophila, where Wikimedia Commons is not
 the right place for a full donation and a more specialized database
 host is needed.

 Cheers,
 Fae
 --
 http://enwp.org/user_talk:fae
 Guide to email tags: http://j.mp/faetags

Compared to many institutions, undoubtedly including some of those you
will be communicating with, the Wikimedia Foundation has very limited
assets and little or no endowment. And, of course, essentially no staff
other than our volunteers.

I think what needs to happen is to explore ways to cooperate using each
institution's relative assets. That might include, for example, endowing
Commons with assets sufficient to support long term archival services as
well as a corporate commitment on the Foundation's part to fulfill such
services on an institutional basis, read centuries...

I'm sure there are other ways the Foundation could cooperate for public
benefit and other partners who could participate in such consortiums. The
threshold requirement is a commitment to accessible free public access
under a fully featured open source license.

Fred





Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread David Gerard
On 2 June 2011 15:19, Fred Bauder fredb...@fairpoint.net wrote:

 I think what needs to happen is to explore ways to cooperate using each
 institution's relative assets. That might include, for example, endowing
 Commons with assets sufficient to support long term archival services as
 well as a corporate commitment on the Foundation's part to fulfill such
 services on an institutional basis, read centuries...


One important plus point for WMF is that unlike, e.g. Flickr, we are
not subject to corporate whims. This means that our supply of a
service is not contingent on it turning a profit.


- d.



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Milos Rancic
On 06/02/2011 02:27 PM, David Gerard wrote:
 On 2 June 2011 13:24, Ziko van Dijk zvand...@googlemail.com wrote:
 There should be no explicit statement because the WMF holds it
 self-evident to preserve. The bigger problem might be the project
 scope. I don't know what kind of images your academic partners wish
 to upload.
 
 There's also the matter that no particular image can be promised to be
 safe from local deletion processes.
 
 (except, e.g., WMF logos etc. But even then there's a perennial policy
 suggestion to move those to a non-free repo not on Commons instead.)

A full version could always be kept at archive.org.



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread FT2
There are two caveats: nobody can tell the future of human cultural
history or any individual legal organization, and while the repository and
wikis as a whole, and virtually all legally hostable media of genuine value,
are preserved indefinitely, obviously no guarantee can be given concerning
any specific individual image or article.

Beyond that, the best guarantee is the license it's under.  The Foundation
licenses all its data and content (with the sole exception of non-free
images used to illustrate articles on local wikis) under a license that
allows anyone to use, copy, amend, or distribute them. The explicit purpose
of doing so is so that anyone wishing to can not only redistribute it, but
if they are unhappy with its prospects in WMF's custodianship, they can take
all of it and archive it or fork from it - that is, start their own version
based on all content, descriptions, data and articles they wish to take and
use.

That right is enshrined on Wikipedia in policy and license - it's known as
the *right to fork* [i.e., to create derivatives and copies]. Our forking
FAQ (http://en.wikipedia.org/wiki/Wikipedia:FAQ/Forking) expands on this,
giving details of where data can be downloaded, as well as Wikipedia holding
a list of websites that mirror its content
(http://en.wikipedia.org/wiki/Category:Websites_which_use_Wikipedia) for
anyone's use.

As the financial market crash proved, promises made by one organization are
only useful insofar as that organization can promise to endure and meet
them. Our approach is to spread our content and make sure others know we
actively support re-archiving and reuse of it, ensuring that copies and
archives will always exist.

At worst I cannot be sure if all data is routinely provided - a staff member
can comment on this - but the policy, rights, traditions, choice of license,
and endorsement of other sites doing so in practice, is our way of ensuring
a practical commitment is made.

FT2




Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Fred Bauder
OH, I see: Don't put your eggs all in one basket.

Fred



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Fred Bauder
 OH, I see: Don't put your eggs all in one basket.

 Fred

Actually, that is the benefit of giving Commons access to archives of
images. Access by Commons under its license conventions gives access to
everyone and results in all interesting images going viral.

There are images of little or no value, as anyone who has viewed an old
photo album of family photos that has become disassociated from its
family knows.

Fred




Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Thomas Dalton
On 2 June 2011 14:21, Fae fae...@gmail.com wrote:
 Briefly responding to a couple of points raised so far:

 Yes, there is a need for a policy as otherwise the WMF would have no
 long term operational archive plan.

Why would we have an archive plan? Archives are for things that aren't
expected to be needed on a regular basis any more but may need to be
referred to in the future. We're not going to archive things on
Commons, they'll just stay on Commons indefinitely.



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Fae
Sure Tom, here's a SciFi user story:

In 2016 San Francisco has a major earthquake and the servers and
operational facilities for the WMF are damaged beyond repair. The
emergency hot switchover to Hong Kong is delayed due to an ongoing DoS
attack from Eastern European countries. The switchover eventually
appears successful and data is synchronized with Hong Kong for the
next 3 weeks. At the end of 3 weeks, with a massive raft of escalating
complaints about images disappearing, it is realized that this is a
result of local data caches expiring. The DoS attack covered the
tracks of a passive data worm that only activates during back-up
cycles and the loss is irrecoverable due to backups aged over 2 weeks
being automatically deleted. Due to no archive strategy it is
estimated that the majority of digital assets have been permanently
lost and estimates for 60% partial reconstruction from remaining cache
snapshots and independent global archive sites run to over 2 years of
work.

Cheers,
Fae
--
http://enwp.org/user_talk:fae
Guide to email tags: http://j.mp/faetags



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Thomas Dalton
On 2 June 2011 18:48, Fae fae...@gmail.com wrote:
 Sure Tom, here's a SciFi user story:

 In 2016 San Francisco has a major earthquake and the servers and
 operational facilities for the WMF are damaged beyond repair. The
 emergency hot switchover to Hong Kong is delayed due to an ongoing DoS
 attack from Eastern European countries. The switchover eventually
 appears successful and data is synchronized with Hong Kong for the
 next 3 weeks. At the end of 3 weeks, with a massive raft of escalating
 complaints about images disappearing, it is realized that this is a
 result of local data caches expiring. The DoS attack covered the
 tracks of a passive data worm that only activates during back-up
 cycles and the loss is irrecoverable due to backups aged over 2 weeks
 being automatically deleted. Due to no archive strategy it is
 estimated that the majority of digital assets have been permanently
 lost and estimates for 60% partial reconstruction from remaining cache
 snapshots and independent global archive sites run to over 2 years of
 work.

Ah, you don't mean archive. You mean backup. They are very
different things and serve very different purposes. The backing up of
images is an issue. The text exists in loads of places, but there is a
risk of losing the images. I know it has been discussed numerous times
before, so hopefully the WMF is working on it (or may have recently put
something in place that I'm not aware of).
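
The failure mode in the scenario quoted above (damage noticed after three
weeks, backups deleted after two) comes down to the retention window being
shorter than the time it takes to notice a problem. Below is a minimal
Python sketch of that check; the function name and parameters are
hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta


def clean_backup_survives(corruption_start, detected_on, backup_dates, retention):
    """Return True if a backup taken before the corruption began is still
    retained on the day the corruption is detected."""
    oldest_kept = detected_on - retention  # anything older has been deleted
    return any(oldest_kept <= taken < corruption_start for taken in backup_dates)


# Daily backups for 30 days before and 21 days after the corruption begins.
start = date(2016, 1, 1)
backups = [start + timedelta(days=offset) for offset in range(-30, 21)]
detected = start + timedelta(days=21)

two_weeks = clean_backup_survives(start, detected, backups, timedelta(days=14))
thirty_days = clean_backup_survives(start, detected, backups, timedelta(days=30))
```

With a 14 day retention window every surviving backup postdates the
corruption, so `two_weeks` is False; with a 30 day window `thirty_days` is
True. A longer-lived archival copy is what closes that gap.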



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread David Gerard
On 2 June 2011 18:48, Fae fae...@gmail.com wrote:

 In 2016 San Francisco has a major earthquake and the servers and
 operational facilities for the WMF are damaged beyond repair. The
 emergency hot switchover to Hong Kong is delayed due to an ongoing DoS
 attack from Eastern European countries. The switchover eventually
 appears successful and data is synchronized with Hong Kong for the
 next 3 weeks. At the end of 3 weeks, with a massive raft of escalating
 complaints about images disappearing, it is realized that this is a
 result of local data caches expiring. The DoS attack covered the
 tracks of a passive data worm that only activates during back-up
 cycles and the loss is irrecoverable due to backups aged over 2 weeks
 being automatically deleted. Due to no archive strategy it is
 estimated that the majority of digital assets have been permanently
 lost and estimates for 60% partial reconstruction from remaining cache
 snapshots and independent global archive sites run to over 2 years of
 work.


This sort of scenario is why some of us have a thing about the backups :-)

(Is there a good image backup of Commons and of the larger wikis, and
- and this one may be trickier - has anyone ever downloaded said
backups?)


- d.



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Fred Bauder
 On 2 June 2011 14:21, Fae fae...@gmail.com wrote:
 Briefly responding to a couple of points raised so far:

 Yes, there is a need for a policy as otherwise the WMF would have no
 long term operational archive plan.

 Why would we have an archive plan? Archives are for things that aren't
 expected to needed on a regular basis any more but may need to be
 referred to in the future. We're not going to archive things on
 Commons, they'll just stay on Commons indefinitely.

If an image is hosted on Commons for 100 years and NEVER used by any
other Wikimedia project would we, or why should we, retain it?

Fred





Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Wjhonson
Because Commons is to be used by the world, not just sister projects.
If the New York Times Online links in a picture from Commons (and credits it
properly), are we going to make their later-historical story useless by deleting
the picture?

 

 


 

 



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread phoebe ayers
On Thu, Jun 2, 2011 at 6:21 AM, Fae fae...@gmail.com wrote:
 Briefly responding to a couple of points raised so far:

 Yes, there is a need for a policy as otherwise the WMF would have no
 long term operational archive plan. Self evident is insufficient in
 order to budget and plan in a credible way. If as the planned outcome
 of a research project I had a large image donation to make and such a
 commitment was absent, I would prefer to mass donate images of public
 interest to an organization that had one, and assume that at some
 point e-volunteers at Wikimedia Commons would take the initiative and
 port in what they fancied.

Fae,

There is no explicit, official operational archive plan of the type
you are referring to. I am familiar with the type of plan you mean --
archives and libraries in particular often have explicit retention
plans that specify a date range. This kind of plan would likely be
developed by the board as part of our long-range operational planning.
There are difficulties, as others have pointed out, because unlike an
archive we cannot guarantee retention of any particular item --
individual curation and editorial decisions are done by the community.

However, long-term preservation and dissemination of knowledge is an
inherent and explicit part of our mission. You could point to:
* Our mission statement, which says we will retain useful information
from our projects on the Internet, free of charge, in perpetuity
(http://wikimediafoundation.org/wiki/Mission_statement)
* the fact that a free license enables redistribution and longer-term
preservation support than copyright does, because others have the
ability to preserve our collections even if the WMF itself fails
(dumps are noted as a value, in our values statement:
http://wikimediafoundation.org/wiki/Values).

As you note, the key part of this is free licensing under a compatible
license. We are interested in supporting the ecosystem of free
knowledge, so that if an organization wanted their primary archive to
be someplace else (but accessible to Commons technically and through
licensing) that's fine; we can upload. However, as an organization, we
are absolutely committed to preserving free knowledge for the long
term.

For this presentation, your preparation turn-around time is pretty
short here, and I personally don't have time to pull together other
community documents on this subject right now (maybe others do), but
you can certainly tell the organizations about our long-term
commitment. Whether Commons is appropriate for them, however, depends
on what they are looking for. The biggest argument for uploading
collections to Wikimedia is not our function as an archival service
(since we don't fulfill all of the requirements of a traditional
archive), but rather the immense distribution and visibility our
projects can give such collections, far exceeding any other online
service, because of our global reach.

best,
Phoebe (speaking as a member of the Board)



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread teun spaans
A lot of questions here.
If an image is hosted and not used in 100 years, it would be up to the
people 100 years from now to decide. Any guarantee we try to make for such
periods is absolutely useless. Every rule we make can be re-discussed and
changed in such a period.

If an organization such as the NYT uses an image from Commons by inline
linking, then we could indeed invalidate their historical research by
deleting that image if it contains a copyright violation. Copyright
violations are the main reason for deletion. Other reasons include bad
quality and duplicate images.

Teun Spaans
Everybody knew it was impossible, until someone turned up who didnt know
that




Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Fae
Thanks Phoebe, for my presentation I'll highlight long term
preservation in perpetuity as the key point of interest and reflect
some of the other issues raised on this thread about the suitability
for certain types of donation.

I'm not expecting a WMF policy overnight, just thought that there
might be something in existence. It sounds like an area of the mission
that would be reasonable to translate into direct operational targets
(say, a pragmatic 10 or 20 year plan).

Cheers,
Fae
--
http://enwp.org/user_talk:fae
Guide to email tags: http://j.mp/faetags





Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread George Herbert
On Thu, Jun 2, 2011 at 10:55 AM, David Gerard dger...@gmail.com wrote:
 On 2 June 2011 18:48, Fae fae...@gmail.com wrote:

 In 2016 San Francisco has a major earthquake and the servers and
 operational facilities for the WMF are damaged beyond repair. The
 emergency hot switchover to Hong Kong is delayed due to an ongoing DoS
 attack from Eastern European countries. The switchover eventually
 appears successful and data is synchronized with Hong Kong for the
 next 3 weeks. At the end of 3 weeks, with a massive raft of escalating
 complaints about images disappearing, it is realized that this is a
 result of local data caches expiring. The DoS attack covered the
 tracks of a passive data worm that only activates during back-up
 cycles and the loss is irrecoverable due backups aged over 2 weeks
 being automatically deleted. Due to no archive strategy it is
 estimated that the majority of digital assets have been permanently
 lost and estimates for 60% partial reconstruction from remaining cache
 snapshots and independent global archive sites run to over 2 years of
 work.


 This sort of scenario is why some of us have a thing about the backups :-)

 (Is there a good image backup of Commons and of the larger wikis, and
 - and this one may be trickier - has anyone ever downloaded said
 backups?)


 - d.

I've floated this to Erik a couple of times, but if the Foundation
would like an IT disaster response / business continuity audit, I can
do those.


-- 
-george william herbert
george.herb...@gmail.com



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread phoebe ayers
On Thu, Jun 2, 2011 at 11:52 AM, George Herbert
george.herb...@gmail.com wrote:
 On Thu, Jun 2, 2011 at 10:55 AM, David Gerard dger...@gmail.com wrote:
 On 2 June 2011 18:48, Fae fae...@gmail.com wrote:

 In 2016 San Francisco has a major earthquake and the servers and
 operational facilities for the WMF are damaged beyond repair. The
 emergency hot switchover to Hong Kong is delayed due to an ongoing DoS
 attack from Eastern European countries. The switchover eventually
 appears successful and data is synchronized with Hong Kong for the
 next 3 weeks. At the end of 3 weeks, with a massive raft of escalating
 complaints about images disappearing, it is realized that this is a
 result of local data caches expiring. The DoS attack covered the
 tracks of a passive data worm that only activates during back-up
 cycles and the loss is irrecoverable due backups aged over 2 weeks
 being automatically deleted. Due to no archive strategy it is
 estimated that the majority of digital assets have been permanently
 lost and estimates for 60% partial reconstruction from remaining cache
 snapshots and independent global archive sites run to over 2 years of
 work.


 This sort of scenario is why some of us have a thing about the backups :-)

 (Is there a good image backup of Commons and of the larger wikis, and
 - and this one may be trickier - has anyone ever downloaded said
 backups?)


 - d.

 I've floated this to Erik a couple of times, but if the Foundation
 would like an IT disaster response / business continuity audit, I can
 do those.

Right, when Fae asked her question I was thinking of the more
philosophical type of planning for storage that archives often do (as
a matter of course we retain documents for 10 years, or in perpetuity,
or whatever); but disaster and backup planning are also relevant.
That's documented as a part of technical operations rather than as
board-level policies; I think we're all on the same page about caring
about this issue though. It is also relevant that the WMF is a
financially stable non-profit, and thus unlikely to go out of business
through the vagaries of the market.

-- phoebe



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Neil Harris
On 02/06/11 19:52, George Herbert wrote:
 On Thu, Jun 2, 2011 at 10:55 AM, David Gerard dger...@gmail.com wrote:
 On 2 June 2011 18:48, Fae fae...@gmail.com wrote:

 In 2016 San Francisco has a major earthquake and the servers and
 operational facilities for the WMF are damaged beyond repair. The
 emergency hot switchover to Hong Kong is delayed due to an ongoing DoS
 attack from Eastern European countries. The switchover eventually
 appears successful and data is synchronized with Hong Kong for the
 next 3 weeks. At the end of 3 weeks, with a massive raft of escalating
 complaints about images disappearing, it is realized that this is a
 result of local data caches expiring. The DoS attack covered the
 tracks of a passive data worm that only activates during back-up
 cycles and the loss is irrecoverable due backups aged over 2 weeks
 being automatically deleted. Due to no archive strategy it is
 estimated that the majority of digital assets have been permanently
 lost and estimates for 60% partial reconstruction from remaining cache
 snapshots and independent global archive sites run to over 2 years of
 work.

 This sort of scenario is why some of us have a thing about the backups :-)

 (Is there a good image backup of Commons and of the larger wikis, and
 - and this one may be trickier - has anyone ever downloaded said
 backups?)


 - d.
 I've floated this to Erik a couple of times, but if the Foundation
 would like an IT disaster response / business continuity audit, I can
 do those.



Tape is -- still -- your friend here. Flip the write-protect after 
writing, have two sets of off-site tapes, one copy of each in each of 
two secure and widely separated off-site locations run by two different 
organizations, and you're sorted.

Tape is the dumb backstop that will keep the data even when your 
supposedly infallible replicated and redundant systems fail. For 
example, it got Google out of a hole quite recently when they had to 
restore a significant number of Gmail accounts from tape. (see 
http://www.talkincloud.com/the-solution-to-the-gmail-glitch-tape-backup/ )

And, unlike other long-term storage media, there is a long history of 
tape storage, an understanding of its practical lifespan and risks, and 
well-understood procedures for making and verifying duplicate sub-master 
copies to new tape technologies over time to extend archive life, etc. etc.

If we say that Wikimedia Commons currently has ~10M images, and if we allow
1 MB per image, that's only 10 TB: that will fit nicely on seven LTO-5
tapes. If you use LTFS, you can also make data access and long-term
data robustness easier. If you like, you can slip in a complete dump of
the MediaWiki source and Commons database on each tape, as well.

Even if I'm wrong by an order of magnitude, and 140 tapes are needed
instead of 14 (two sets of seven), that's still less than $10k of media --
and I wouldn't be surprised if tape storage companies were eager to vie to
be the company that can claim it donates the media and drives which provide
Wikipedia's long-term backup system.

With two tape drives being run at once at an optimal 140 MB/s each, the 
whole backup would take less than a day. Even if I was wrong about both 
the writing speed and archive size by an order of magnitude each, this 
would still be less than three months.
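
As a rough check on the arithmetic above, here is a minimal Python sketch. It is not part of the original message. It assumes LTO-5's native (uncompressed) capacity of 1.5 TB per cartridge, decimal units (1 TB = 1,000,000 MB), and the 140 MB/s per-drive rate quoted above; the function names are made up for illustration.

```python
import math

LTO5_NATIVE_TB = 1.5     # native (uncompressed) capacity of one LTO-5 cartridge
DRIVE_MB_PER_S = 140.0   # optimal per-drive write rate quoted in the message


def tapes_needed(archive_tb, copies=1, tape_tb=LTO5_NATIVE_TB):
    """Cartridges needed to hold `copies` complete copies of the archive."""
    return copies * math.ceil(archive_tb / tape_tb)


def write_hours(archive_tb, drives=2, mb_per_s=DRIVE_MB_PER_S, copies=1):
    """Hours to stream the archive to tape with `drives` running in parallel."""
    total_mb = archive_tb * 1_000_000 * copies   # decimal units: 1 TB = 10^6 MB
    return total_mb / (drives * mb_per_s) / 3600


if __name__ == "__main__":
    archive_tb = 10.0   # ~10M images at ~1 MB each
    print("tapes for one copy:  ", tapes_needed(archive_tb))
    print("tapes for two copies:", tapes_needed(archive_tb, copies=2))
    print("hours, two drives:    %.1f" % write_hours(archive_tb))
    # Pessimistic case: archive 10x larger and drives 10x slower, two copies.
    worst = write_hours(archive_tb * 10, mb_per_s=DRIVE_MB_PER_S / 10, copies=2)
    print("worst case, days:     %.0f" % (worst / 24))
```

Changing `archive_tb` or `copies` shows how sensitive the estimate is to the guesses about image count and average size.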

The same tape systems could also, trivially, be used to back up all the
other WMF sites, on similar lines.

-- Neil




Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Mark Wagner
On Thu, Jun 2, 2011 at 16:11, Neil Harris n...@tonal.clara.co.uk wrote:

 Tape is -- still -- your friend here. Flip the write-protect after
 writing, have two sets of off-site tapes, one copy of each in each of
 two secure and widely separated off-site locations run by two different
 organizations, and you're sorted.

The mechanics of the backup are largely irrelevant.  What matters are
the *policies*: what data do you back up, when do you back it up, how
often do you test your backups, and so on.  Once you've got that
sorted out, it doesn't really matter whether you're storing the
backups on tape, remote servers, or magic pixie dust.
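
To make "how often do you test your backups" concrete, here is a minimal Python sketch of one possible test: record a checksum for every file at backup time, then restore to a scratch directory and compare. It is an illustration only, not a description of any WMF procedure; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum at backup time."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest, restored_root):
    """Return (path, problem) pairs for files that are missing or differ."""
    restored_root = Path(restored_root)
    problems = []
    for rel_path, expected in manifest.items():
        candidate = restored_root / rel_path
        if not candidate.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(candidate) != expected:
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

An empty list from `verify_restore` means every file recorded in the manifest came back intact. The check is independent of the storage medium, which is the point being made here.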

-- 
Mark Wagner



Re: [Foundation-l] Request: WMF commitment as a long term cultural archive?

2011-06-02 Thread Neil Harris
On 03/06/11 00:44, Mark Wagner wrote:
 On Thu, Jun 2, 2011 at 16:11, Neil Harris n...@tonal.clara.co.uk wrote:
 Tape is -- still -- your friend here. Flip the write-protect after
 writing, have two sets of off-site tapes, one copy of each in each of
 two secure and widely separated off-site locations run by two different
 organizations, and you're sorted.
 The mechanics of the backup are largely irrelevant.  What matters are
 the *policies*: what data do you back up, when do you back it up, how
 often do you test your backups, and so on.  Once you've got that
 sorted out, it doesn't really matter whether you're storing the
 backups on tape, remote servers, or magic pixie dust.


Not quite.

You're right about procedures, but you can't begin defining procedures 
until you have something concrete to aim at.

Tape is the One True Way for large scale backup, even today (ask 
Google), and I thought it might be useful to give an illustration of 
just how cheap it would be to use. Tape is a great simplifier, and 
eliminates a lot of the fanciness and feature-bloat associated with more 
sophisticated systems -- more sophisticated is not necessarily better.

Here's a straw man proposal for procedures:

I'd suggest backing up _everything_ -- cluster servers, local office IT 
servers, staff PCs, the lot -- for WMF internal archive and disaster 
recovery purposes. Something like monthly incremental backups, filing 
away media to the remote sites after verification, and yearly or 
six-monthly total backups to a complete new set of fresh media. For only
a month's worth of work, replicated disk copies are fine: the tape
archive is a back-stop, for when the replicated disks fail.

Dumps for external archives could also be made using the same drives, 
but to different media, and with a much more restrictive policy about 
what is saved.
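
The straw-man schedule can be written down as a small Python sketch. It is an illustration, not an agreed procedure. The message does not say which day backups run or which months get the full backup, so running on the 1st of each month with full backups in January and July is an assumption, as are the function names.

```python
from datetime import date


def backup_kind(day, full_months=(1, 7)):
    """Classify the tape backup due on `day` under the straw-man schedule.

    Assumed for illustration: tape backups run on the 1st of each month,
    and the six-monthly full backups fall in January and July.
    """
    if day.day != 1:
        return None             # no tape run; replicated disks cover the gap
    if day.month in full_months:
        return "full"           # complete backup to a fresh set of media
    return "incremental"        # monthly incremental, filed off-site once verified


def schedule(year):
    """List (date, kind) for every tape backup in `year`."""
    days = [date(year, month, 1) for month in range(1, 13)]
    return [(day, backup_kind(day)) for day in days]


if __name__ == "__main__":
    for day, kind in schedule(2011):
        print(day.isoformat(), kind)
```

Passing `full_months=(1,)` gives the yearly variant mentioned above.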

-- Neil

