Why not just upload it to proteindiffraction.org? Or the SBGrid data
bank (https://data.sbgrid.org/)? Or both for "redundancy"?
Yes, I did once do some calculations on what it would take to preserve
data for tens of thousands of years, and the only proven storage medium
for that timescale is clay tablets. Assuming 1 mm^3 is all you need to
store one bit, it comes to about $3000/GB.
Hard drives, however, are now down to $33/TB, which is comparable to a
box of pipette tips, and takes up less space. LTO-6 tapes are $3/TB.
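To put those per-unit prices side by side, here is a rough sketch of the arithmetic. The 10 TB archive size and the two-copy choice are made-up figures for illustration only, and I am assuming 1 TB = 1000 GB for the clay conversion.

```python
# Media prices in dollars per terabyte, taken from the figures quoted above.
# The clay figure is converted from $3000/GB assuming 1 TB = 1000 GB.
PRICE_PER_TB = {
    "clay tablets": 3000 * 1000,
    "hard drive": 33,
    "LTO-6 tape": 3,
}


def media_cost(terabytes, copies=1):
    """Return the raw media cost in dollars for each storage medium."""
    return {
        name: price * terabytes * copies
        for name, price in PRICE_PER_TB.items()
    }


if __name__ == "__main__":
    # ARCHIVE_TB is a hypothetical lab archive size, not a number from this thread.
    ARCHIVE_TB = 10
    for name, dollars in media_cost(ARCHIVE_TB, copies=2).items():
        print(f"{name:>12}: ${dollars:,.0f}")
```

This covers only the price of the media, not the time spent filling and looking after it.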
So I don't think the cost of storage is any real burden; it's the cost of
managing that storage. If you buy a box of 12 TB bare drives, then you
need to spend a lot of time and effort getting your data onto them, and
then wondering if they will still work after a few years. Modern drives
are much more reliable than they used to be, but maybe you want two
copies? Or a parity disk? What you pay for when you buy a NAS,
particularly a high-end NAS like NetApp, is the cost and quality of
management. Rolled into the price of the product is not just redundant
bits and the wires to connect them, but a team of people who get paid to
make sure your data are always safe and available.
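If you do manage bare drives yourself, one small piece of that management can be sketched as a checksum manifest: record a hash of every file when you write the archive, then re-check each copy later. This is only a minimal sketch, not what any NAS vendor actually does. The function names are hypothetical, and the choice of SHA-256 is my assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large image files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest, copy_root):
    """Return the relative paths that are missing or altered in a copy."""
    copy_root = Path(copy_root)
    bad = []
    for rel, expected in manifest.items():
        candidate = copy_root / rel
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once from the original data, keep it somewhere other than the drive itself, and run verify_copy against each copy every so often. An empty list means every file still reads back intact.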
The question then always comes down to cost/benefit. What is the
consequence of data loss? What is the probability of data loss? And are
you feeling lucky?
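One crude way to frame that trade-off is expected loss: the probability of losing the data times what the loss would cost you. The sketch below assumes that copies fail independently, which is optimistic if they sit in the same room. Every number you feed it is a guess, and the function names are hypothetical.

```python
def expected_loss(probability_of_loss, cost_of_loss, copies=1):
    """Expected cost of losing the data when all copies must fail.

    Assumes each copy fails independently with the same probability,
    so the chance of losing every copy is probability_of_loss ** copies.
    """
    return (probability_of_loss ** copies) * cost_of_loss


def extra_copy_is_worth_it(probability_of_loss, cost_of_loss, copy_cost, copies=1):
    """True if one more copy lowers expected loss by more than it costs."""
    now = expected_loss(probability_of_loss, cost_of_loss, copies)
    after = expected_loss(probability_of_loss, cost_of_loss, copies + 1)
    return (now - after) > copy_cost
```

For example, extra_copy_is_worth_it(0.05, 100000, 660) asks whether a second copy costing $660 is justified if a single copy has a 5% chance of failing and the loss would cost $100,000. Both of those figures are invented for illustration.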
A few years ago I got a panicked email from a user whom I will not name,
but this user had just been "Rupp-ed". As in, Bernhard had found a
deposition of theirs that looked a lot like a fake structure, and asked
about it. This deposition had been made ten years earlier, and the student
who did it had left science and could not be reached. This left the PI
holding the bag. Turns out the student had made a mistake and deposited
Fcalc instead of Fobs. But how do you prove that? This user was VERY
happy to find out that I still had their images on DVD. I was able to
restore them and re-process them in about an hour.
Lucky? Perhaps. Not every beamline at every synchrotron backs up data,
and not every DVD I've written can be read back. About 3000 images are
still unrecoverable from those days. On the other hand, there are other
beamlines that make a point of destroying any traces of user data as part
of their data protection plan. Most, I think, are middle-of-the-road
with a data retention policy like "we'll do what we can, but can't
promise anything". Even at the same synchrotron, policies can vary from
beamline to beamline. So again: do you feel lucky? Do you?
-James Holton
MAD Scientist
On 7/13/2018 2:30 AM, Sergei Strelkov wrote:
Dear All,
I believe this question may be of some interest.
In the past, we always stored all raw data ever collected by the lab.
With the recent advances, such as
(a) automated/on-the-fly processing offered by some (European)
synchrotrons, and
(b) an ongoing discussion on centralized raw data archiving,
I wonder if it is time to revise the strict policy of keeping all data
(before we invest in a new NAS system... )
Best wishes,
Sergei
Prof. Sergei V. Strelkov
Laboratory for Biocrystallography
Department of Pharmaceutical Sciences, KU Leuven
------------------------------------------------------------------------
To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1