Re: [ccp4bb] raw data deposition

Jacob Keller Thu, 27 Oct 2011 15:10:18 -0700

Since this hasn't been brought up: there is the consideration that in
10 or more years x-ray crystallography may be entirely a thing of the
past, with some massively superior modality taking over. Of course
there is no way to bank on this, but I am
wondering whether this is something to consider or not. Do we really
think people will still be crystallizing proteins in 50 years, much
less looking up structures determined in 2011? Has anybody recently
used the original myoglobin structure?


JPK


On Thu, Oct 27, 2011 at 4:55 PM, Michel Fodje
<michel.fo...@lightsource.ca> wrote:
> We store raw data for two main reasons:
> a)  We currently use only a fraction of the information actually contained in 
> raw images and extraction of that fraction can be improved. Destroying the 
> data means
> - we lose the extra information, and make future research in some areas 
> either impossible or more costly
> - we make it more difficult to improve current data reduction methods
> b)  Raw data is the best way to independently validate a published structure 
> and prevent fraud.
>
> The majority of crystallographers already recognize these truths. That is why 
> almost all of them do keep backups of their data even after structures have 
> been published.
>
> To those still against making data public I would ask a simple question:  
> Would you object to providing the raw data from a published structure if such 
> data were available and you did not have to bear an unreasonable 
> inconvenience in the process? My guess is that most crystallographers are 
> reasonable scientists and such a "Poll" will probably result in ~100% "Yes" 
> and ~0% "No". Am I wrong?
>
> The real issue then is how do we make the data available in such a way that 
> the inconvenience (if any) to all the stakeholders is reasonable.  Some 
> great ideas have already been advanced.
>
> In the short term, we could start by using the fact that synchrotron 
> facilities already store raw data for a period. However, a lot of data is 
> collected which is not published. Given the limited disk space, it may be 
> useful to know exactly which datasets result in a publication and should be 
> kept for an extended period. If a unique ID (such as the DOI suggestion) is 
> provided to every dataset and required during deposition/publication, then, 
> after a given "grace" period, synchrotron facilities can preserve only those 
> datasets which have been published. Combined with a central metadata 
> server similar to TARDIS, such a system could be developed in a relatively 
> short period of time, while longer-term central storage ideas are worked out.
>
> Again the best solution is going to be one which requires the least amount of 
> effort from crystallographers. In fact, I can see a system in which the 
> experiment metadata for a PDB entry/dataset comes directly from the 
> synchrotron facility during deposition so that users simply provide a unique 
> dataset ID and the experimental details are pre-filled for them.
>
> Of course the above completely ignores home sources.
>
>
> /Michel
>> -----Original Message-----
>> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of D
>> Bonsor
>> Sent: October-27-11 3:10 PM
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: Re: [ccp4bb] raw data deposition
>>
>> Why should we store images?
>>
>> From most of the posts it seems to aid in software development. If that is
>> the case, there should be a Failed Protein Databank (FPDB) where people
>> could upload datasets which they cannot solve. This would aid software
>> development and allow someone else to have a go at solving the structure.
>>
>> If it is for historical reasons, how can someone decide whether their
>> structure is historical? I would propose that images should be uploaded for a
>> protein or protein-complex that has never been solved before. That way the
>> images are there if that structure does become historical.
>>
>> The question is not whether or not images should be uploaded but who
>> would use the images that were uploaded.
>>
>> For example, people who use crystallography as a tool to aid in
>> characterization of their protein would probably not look at images for
>> 99.5% of other protein datasets, and they probably would not look at images
>> for a protein that is related to their own protein. They are more interested
>> in the final structure. I too would probably not be interested in reprocessing
>> and solving a structure again when I can easily access the final product
>> already.
>



-- 
*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
email: j-kell...@northwestern.edu
*******************************************
