[ccp4bb] database-assisted data archive

2010-08-18 Thread Andreas Förster

Dear all,

going through a previous lab member's data and trying to make sense
of it, I was wondering what kind of solutions exist to simplify the
archiving and retrieval process.


In particular, what I have in mind is a web interface that allows a user 
who has just returned from the synchrotron or the in-house detector to 
fill in a few boxes (user, name of protein, mutant, light source, 
quality of data, number of frames, status of project, etc) and then 
upload his data from the USB stick, portable hard drive or remote storage.


The database application would put the data in a safe place (some file 
server that's periodically backed up) and let users browse through all 
the collected data of the lab with minimal effort later.
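
(To give a rough idea of the scale I have in mind: the core could be as
small as the Python/SQLite sketch below. All names, paths and fields are
invented for illustration; this is not a working implementation.)

    # Sketch only: a metadata table with the boxes mentioned above, plus
    # an upload step that copies the frames onto the backed-up server.
    import shutil, sqlite3
    from pathlib import Path

    ARCHIVE = Path("/data/archive")   # hypothetical backed-up file server

    db = sqlite3.connect(str(ARCHIVE / "datasets.sqlite"))
    db.execute("""CREATE TABLE IF NOT EXISTS datasets (
        id INTEGER PRIMARY KEY,
        user TEXT, protein TEXT, mutant TEXT, light_source TEXT,
        quality TEXT, n_frames INTEGER, project_status TEXT,
        archive_path TEXT)""")

    def register(meta, frames_dir):
        """Copy frames into the archive and record the metadata row."""
        dest = ARCHIVE / meta["user"] / meta["protein"] / Path(frames_dir).name
        shutil.copytree(frames_dir, dest)   # creates intermediate directories
        db.execute("INSERT INTO datasets (user, protein, mutant, light_source,"
                   " quality, n_frames, project_status, archive_path)"
                   " VALUES (:user, :protein, :mutant, :light_source,"
                   " :quality, :n_frames, :project_status, :archive_path)",
                   dict(meta, archive_path=str(dest)))
        db.commit()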


It doesn't seem too hard to implement this, which is why I'm asking whether
anyone has done so already.


Thanks.


Andreas

--
Andreas Förster, Research Associate
Paul Freemont & Xiaodong Zhang Labs
Department of Biochemistry, Imperial College London
http://www.msf.bio.ic.ac.uk


Re: [ccp4bb] database-assisted data archive

2010-08-18 Thread Jürgen Bosch
Do you want the frames to be accessible too?
If not, then a wiki would be an easy solution.
Alternatively, a FileMaker database would do the trick too.

Jürgen 

..
Jürgen Bosch
Johns Hopkins Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Phone: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-3655
http://web.mac.com/bosch_lab/



Re: [ccp4bb] database-assisted data archive

2010-08-18 Thread Paul Paukstelis
I did something like that for plasmids by putting together a web
interface with PHP and MySQL. It was simple, maybe a little ugly, but
worked nicely. The problem was that convincing anyone to actually use it
was virtually impossible.


--paul




--
Paul Paukstelis, Ph.D
Assistant Professor
University of Maryland
Chemistry & Biochemistry Dept.
Center for Biomolecular Structure & Organization
pauks...@umd.edu
301-405-9933


Re: [ccp4bb] database-assisted data archive

2010-08-18 Thread Georgios Pelios
Dear all,

Within CCP4, we are currently developing the new CCP4i, which will include a
database application to store project and job data. The database schema has
already been designed, but its design is not final and can be modified
depending on user feedback. We are now writing the database API. Any
suggestions and ideas regarding data storage and retrieval are welcome.

George Pelios
CCP4





[ccp4bb] Refinement/Model-Building Density Modification

2010-08-18 Thread Michael Thompson
Hello All,

I am currently solving a structure at 2 Å resolution with phases obtained from
molecular replacement. Using the MR solution, I began refinement with Refmac
using NCS restraints. I am currently building the parts of the model that were
left out of the MR search model and have just about completed all three
NCS-related chains. Obviously I will continue to use the most complete model
for refinement, and I plan to release the NCS restraints over the parts of the
molecule that don't quite seem to obey the NCS perfectly.

My question is: once I have connected density for all three chains, will it
still be worthwhile to perform density modification, such as solvent
flipping/flattening or histogram matching (implemented through SOLOMON and/or
DM), to improve phases? I have always been told that density modification is
typically carried out at the beginning of refinement, prior to any model
building; however, my understanding of these types of density modification
(particularly solvent flipping/flattening) leads me to believe that they would
be most effective when more of the molecular envelope can be identified, such
as during later stages of refinement.
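
(Schematically, my understanding of the two solvent operations, with $M$ the
protein mask, $\langle\rho\rangle_s$ the mean solvent density and $k$ a
flipping factor, is the textbook form

$$\rho'(\mathbf{x}) =
\begin{cases}
\rho(\mathbf{x}) & \mathbf{x} \in M \\
\langle\rho\rangle_s & \mathbf{x} \notin M \ \text{(flattening)} \\
\langle\rho\rangle_s - k\,[\rho(\mathbf{x}) - \langle\rho\rangle_s] & \mathbf{x} \notin M \ \text{(flipping)}
\end{cases}$$

so please correct me if I have that wrong.)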

Also, I recently read something that led me to believe that the solvent
flattening procedure may be implicit in the implementation of NCS averaging in
refinement software. I understand that the two processes are fundamentally
different and independent of one another, but the information I read described
something like the following (unless I misinterpreted it): because real-space
NCS averaging requires identification of the molecular envelope in the same
fashion as solvent flattening, during NCS averaging the envelope is identified,
and then the map is solvent-flattened and averaged using the NCS operators. I
am unfamiliar with the inner workings of most crystallographic software, so I
was wondering whether this is how NCS averaging is implemented in Refmac. I
suppose another way to ask the question would be: if I have an NCS-averaged
map from Refmac, is it already solvent-flattened?

Any help would be much appreciated. I am still relatively new to the refinement 
process.

Thanks,

Mike Thompson


--
Michael C. Thompson
Graduate Student
Biochemistry & Molecular Biology Division
Department of Chemistry & Biochemistry
University of California, Los Angeles
mi...@chem.ucla.edu


Re: [ccp4bb] Refinement/Model-Building Density Modification

2010-08-18 Thread Kevin Cowtan

Michael Thompson wrote:

Hello All,

I am currently solving a structure at 2 Å resolution with phases obtained from
molecular replacement. Using the MR solution, I began refinement with Refmac
using NCS restraints. I am currently building the parts of the model that were
left out of the MR search model and have just about completed all three
NCS-related chains. Obviously I will continue to use the most complete model
for refinement, and I plan to release the NCS restraints over the parts of the
molecule that don't quite seem to obey the NCS perfectly.


My question is: once I have connected density for all three chains, will it
still be worthwhile to perform density modification, such as solvent
flipping/flattening or histogram matching (implemented through SOLOMON and/or
DM), to improve phases? I have always been told that density modification is
typically carried out at the beginning of refinement, prior to any model
building; however, my understanding of these types of density modification
(particularly solvent flipping/flattening) leads me to believe that they would
be most effective when more of the molecular envelope can be identified, such
as during later stages of refinement.


My feeling is that density modification is most useful when the model is 
substantially incomplete (or substantially wrong). But I'd use both the 
unmodified and modified map when trying to interpret the difficult bits.


'dm' is obsolete; I'd use 'parrot' by preference.


Also, I recently read something that led me to believe that the solvent
flattening procedure may be implicit in the implementation of NCS averaging in
refinement software. I understand that the two processes are fundamentally
different and independent of one another, but the information I read described
something like the following (unless I misinterpreted it): because real-space
NCS averaging requires identification of the molecular envelope in the same
fashion as solvent flattening, during NCS averaging the envelope is identified,
and then the map is solvent-flattened and averaged using the NCS operators. I
am unfamiliar with the inner workings of most crystallographic software, so I
was wondering whether this is how NCS averaging is implemented in Refmac. I
suppose another way to ask the question would be: if I have an NCS-averaged
map from Refmac, is it already solvent-flattened?


Refinement with NCS restraints essentially imposes the NCS in a way
that is different from, but redundant with, density modification, so I
wouldn't expect further density modification after refinement with NCS
to help much. The same is somewhat true for solvent flattening.


The only thing density modification might get you at this stage is to break
you away from model bias a little. Again, it's so quick that you may as well
try it and look at both maps.


Kevin



Re: [ccp4bb] Refinement/Model-Building Density Modification

2010-08-18 Thread David Schuller

 On 08/18/10 08:25, Michael Thompson wrote:

Hello All,

I am currently solving a structure at 2 Å resolution with phases obtained from
molecular replacement. Using the MR solution, I began refinement with Refmac
using NCS restraints. I am currently building the parts of the model that were
left out of the MR search model and have just about completed all three
NCS-related chains. Obviously I will continue to use the most complete model
for refinement, and I plan to release the NCS restraints over the parts of the
molecule that don't quite seem to obey the NCS perfectly.

My question is: once I have connected density for all three chains, will it
still be worthwhile to perform density modification, such as solvent
flipping/flattening or histogram matching (implemented through SOLOMON and/or
DM), to improve phases? I have always been told that density modification is
typically carried out at the beginning of refinement, prior to any model
building; however, my understanding of these types of density modification
(particularly solvent flipping/flattening) leads me to believe that they would
be most effective when more of the molecular envelope can be identified, such
as during later stages of refinement.
As building & refinement are a cyclical process, your density-modified
phases will keep improving as your improved model provides a
better-fitting mask.  Once you reach the point where your model phases
are better than the density-modified phases, and improvement from the
density-modified maps is no longer apparent, you can quit the density
modification. I tend to continue the density modification longer than some
other people, because it is free of some forms of model bias.



Also, I recently read something that led me to believe that the solvent
flattening procedure may be implicit in the implementation of NCS averaging in
refinement software. I understand that the two processes are fundamentally
different and independent of one another, but the information I read described
something like the following (unless I misinterpreted it): because real-space
NCS averaging requires identification of the molecular envelope in the same
fashion as solvent flattening, during NCS averaging the envelope is identified,
and then the map is solvent-flattened and averaged using the NCS operators. I
am unfamiliar with the inner workings of most crystallographic software, so I
was wondering whether this is how NCS averaging is implemented in Refmac. I
suppose another way to ask the question would be: if I have an NCS-averaged
map from Refmac, is it already solvent-flattened?

Different density-modification software packages deal with the averaging
and solvent-flattening masks in different ways. Some packages
solvent-flatten everything outside the NCS mask. Having dealt with
situations in which the NCS relationship does not cover the entire
molecule, I do not like that, and prefer that the NCS and
solvent-flattening masks be dealt with separately (Schuller (1996), Acta
Crystallogr. D52, 425-434). IIRC, dm allows the option of dealing with the
masks separately, but I cannot remember which behaviour is the default.


REFMAC is not a density modification application, but rather a 
refinement application. NCS restraints may apply to only part of a 
molecule if necessary.

Any help would be much appreciated. I am still relatively new to the refinement 
process.

Thanks,

Mike Thompson





--
===
All Things Serve the Beam
===
   David J. Schuller
   modern man in a post-modern world
   MacCHESS, Cornell University
   schul...@cornell.edu


Re: [ccp4bb] database-assisted data archive

2010-08-18 Thread Enrico Stura
Knowing where all the important files are is really all that is needed.
Sophistication can come later.

I would welcome a CCP4 database-assisted data archive system.

Here is my contribution to the discussion:

I agree with Paul Paukstelis that getting users to use any
database-assisted data archive system
is the biggest obstacle. I have had problems with compliance with my own
system, where all the student
has to do is provide file and directory names each Friday to keep the
database up to date.


It is a simple HTML-based access system in which hyperlinks give access to
the data wherever it is stored. Users need only provide the names of the
directories where the various pieces of data are stored on the accessible
network, and the data manager (any HTML-competent individual) can then set
up the links to the main control platform (a start-up HTML page).
The advantage of such a system is that it is platform independent and needs
only a well-configured browser.

It is backward compatible with any old data.
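
(For illustration only, the start-up page could be regenerated from a plain
list of entries with a few lines of Python; the file names here are
invented:)

    # Sketch: rebuild the start-up HTML page from a tab-separated list of
    # sample, comment and data-directory entries. Purely illustrative.
    import html

    rows = [line.rstrip("\n").split("\t")     # sample, comment, directory
            for line in open("datasets.tsv")]

    with open("index.html", "w") as page:
        page.write("<html><body><table>\n")
        for sample, comment, directory in rows:
            page.write("<tr><td>%s</td><td><a href='file://%s'>%s</a>"
                       "</td></tr>\n" % (html.escape(sample), directory,
                                         html.escape(comment)))
        page.write("</table></body></html>\n")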

George Pelios may want to consider an automated system where Mosflm, Scala
and all subsequent programs contribute to creating and updating a raw-data
retrieval file on the basis of the files they have used. When the project is
finished, a backup program should be able to retrieve all such files and
store them in a consolidated manner for transfer to a long-term storage
server.


A brief description of the system I use for synchrotron data collection:

Prior to the synchrotron trip, each sample taken to the synchrotron is
entered in a table that represents its position in the puck, with hyperlinks
to a file describing its position in the crystallization tray (this file
will have hyperlinks to crystallization and all prior preparation steps).
As data are collected, a short comment is added (resolution and number of
frames are included if data have been collected); once the data have been
transferred to the home lab, a link to the directory where the data are
stored is added as well.
To give an idea of data quality, Mosflm and a GIMP screen capture are used
to create a JPEG of the first data image (with the frame filename added),
which is stored in the same directory as the raw data frames. This image is
accessed by clicking on the comment.
Compliance with the system can be checked by clicking on comments other
than "not tested".


It is all manual, but not very time-consuming once the initial HTML
templates have been set up. Still, I am looking forward to a simple
CCP4-designed system that can do something similar automatically.

I would also recommend looking at ISPyB, implemented at the ESRF, which is
also web-based:

www.esrf.eu/UsersAndScience/Experiments/MX/Software/ispyb

Enrico.

--
Enrico A. Stura D.Phil. (Oxon) ,Tel: 33 (0)1 69 08 4302 Office
Room 19, Bat.152,   Tel: 33 (0)1 69 08 9449Lab
LTMB, SIMOPRO, IBiTec-S, CE Saclay, 91191 Gif-sur-Yvette,   FRANCE
http://www-dsv.cea.fr/en/institutes/institute-of-biology-and-technology-saclay-ibitec-s/unites-de-recherche/department-of-molecular-engineering-of-proteins-simopro/molecular-toxinology-and-biotechnology-laboratory-ltmb/crystallogenesis-e.-stura
http://www.chem.gla.ac.uk/protein/mirror/stura/index2.html
e-mail: est...@cea.fr Fax: 33 (0)1 69 08 90 71


Re: [ccp4bb] database-assisted data archive

2010-08-18 Thread Matthew BOWLER

Dear All,
I would just like to add to Enrico's mention of ISPyB. This LIMS system
automatically logs all the data you collect at the beamline (experimental
parameters, screening images, data sets, edge scans, XRF spectra, crystal
snapshots, etc.), and the records are stored indefinitely. Your colleagues
can also follow data collections in real time by logging on from their home
labs. In addition, you can upload large amounts of information on your
samples (acronym, space group, pin barcode, etc.) to the database; this
information can be recovered at the beamline through MXCuBE and the sample
changer, tying all data collections to it. You can also track your dewars to
and from the ESRF, even receiving an email when a dewar reaches the beamline.
It has recently delved into the world of data analysis, as you can rank
crystals against each other using a number of criteria. Those not in an
exclusive relationship with the ESRF will be glad to hear that it is also
available at Diamond and, I believe, will be at PETRA III.


Cheers, Matt


Some links:

ISPyB: 
http://www.esrf.eu/UsersAndScience/Experiments/MX/How_to_use_our_beamlines/ISPYB


Sample tracking: 
http://www.esrf.eu/UsersAndScience/Experiments/MX/How_to_use_our_beamlines/ISPYB/ispyb-dewar-tracking


Ranking:  
http://www.esrf.eu/UsersAndScience/Experiments/MX/How_to_use_our_beamlines/ISPYB/ispyb-sample-ranking








--
Matthew Bowler
Structural Biology Group
European Synchrotron Radiation Facility
B.P. 220, 6 rue Jules Horowitz
F-38043 GRENOBLE CEDEX
FRANCE
===
Tel: +33 (0) 4.76.88.29.28
Fax: +33 (0) 4.76.88.29.04

http://www.esrf.fr/UsersAndScience/Experiments/MX/
=== 


Re: [ccp4bb] autoBuster--Rfree_flag

2010-08-18 Thread Clemens Vonrhein
Dear Jerry,

On Tue, Aug 17, 2010 at 04:26:14PM -0700, Jerry McCully wrote:
 Dear All:
 
 I am currently using autoBUSTER to refine my structure. I notice that
 autoBUSTER generates a new column in the MTZ data file with the label
 FreeR_flag.

 Because my MTZ file already had the FreeRflag, I am wondering whether
 autoBUSTER generated a new set of free-R flags or just renamed the
 existing FreeRflags.
 
  Can anyone give some comments?

Yes, it created a new column because the existing one has a label
different from the supported ones - see:

  
http://www.globalphasing.com/buster/manual/autobuster/manual/appendix1.html#SetvarParameter_ColumnName_FreeR_flag_allowed

What you could do is to add the following line to a ~/.autoBUSTER
file:

ColumnName_FreeR_flag_allowed= I FreeR_flag| I FREE| I FreeRflag

Or use this on the command-line:

  % refine ColumnName_FreeR_flag_allowed= I FreeRflag \
   -p some.pdb -m other.mtz ...

Both should work.

Alternatively you might want to use the more common (?) labels FREE
or FreeR_flag when you create your MTZ file.

Cheers

Clemens

PS: for BUSTER questions you can also use
buster-deve...@globalphasing.com

-- 

***
* Clemens Vonrhein, Ph.D. vonrhein AT GlobalPhasing DOT com
*
*  Global Phasing Ltd.
*  Sheraton House, Castle Park 
*  Cambridge CB3 0AX, UK
*--
* BUSTER Development Group  (http://www.globalphasing.com)
***


[ccp4bb] Scaling up from an Intelliplate to Linbro Plate

2010-08-18 Thread Mo Wong
Hi all,

I know scaling up from a hit found in a high-throughput screen is an
empirical process, but does anyone have a good rule of thumb as a starting
point when it comes to scaling up from a hit observed in an Intelliplate to
a Linbro plate (i.e., change in volume ratios, amount to add to the
reservoir, etc.)? I've Googled around but haven't seen anything, which
suggests either that I shouldn't be asking this question, that I've not
looked hard enough, or that it really is a case of try and see.

Thanks


Re: [ccp4bb] database-assisted data archive

2010-08-18 Thread James Holton
There is an image archiving system called TARDIS (http://tardis.edu.au/)
that sounds more or less exactly like what you describe.

I agree that it would be nice if you could get your synchrotron to do it
for you, but since every single beamline and home-source setup in the
world has already been providing you with a database that is more
commonly called the image header, I don't think it is too hard to
imagine how accurate the data in your database are going to be.


If I may interject my two cents, I have found that when a user is asked
to fill out a form, compliance is inversely proportional to the number
of fields on the form.  But far more important than that: if you ask
them to answer a question that they simply don't know the answer to,
they will likely skip the whole thing.  An excellent example (I think)
is asking for the space group BEFORE they have even taken their first
snapshot of a brand-new crystal.  This datum is simply not known until
AFTER the structure is solved!  For example, is it P41 or P43?  You
don't really know that until after you see a helix in the map.  What
is the molecular weight?  That depends on whether or not it is a
complex.  (If I had a nickel for every user who was certain they had a
protein-DNA complex with a very low solvent content, I would be quite
rich.)


All that said, I don't think it is unreasonable to expect an image 
header (or any other database) to contain motor positions, detector 
type, wavelength, beam center etc. Clearly this is not always the case, 
and this problem still needs a lot of work, but my point is that we 
should try to write down things that we really know (observations) and 
not try to muddle the database with derived quantities (interpretations).
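
(To illustrate: for SMV/ADSC-style images the header really is just
key=value text, so harvesting those observations takes only a few lines.
A sketch, assuming nothing beyond the generic SMV layout; keys such as
WAVELENGTH or BEAM_CENTER_X vary by detector:)

    # Sketch: read the key=value header of an SMV/ADSC-format image.
    # Assumes only the generic layout: '{', then 'KEY=value;' lines, then '}'.
    def read_smv_header(filename):
        header = {}
        with open(filename, "rb") as img:
            for raw in img.read(4096).decode("latin-1").splitlines():
                line = raw.strip()
                if line.startswith("}"):      # end of header block
                    break
                if "=" in line:
                    key, _, value = line.partition("=")
                    header[key.strip()] = value.strip().rstrip(";")
        return header

    # e.g. read_smv_header("frame_001.img").get("WAVELENGTH")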


When it comes to what you really know about the sample, all you can
realistically hope to be sure of is the list of chemicals that went into
the drop: macromolecule sequence, salts, PEGs, and their respective
concentrations.  Sometimes you don't even know that! (e.g. proteolysis.)
However, the macromolecule sequence is INCREDIBLY useful for deriving
(or at least guessing) a great many other things, such as the molecular
weight, solvent content, and number of heavy-atom sites.  The list of salts
is also absolutely critical for doing radiation damage predictions.
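
(For example, molecular weight and solvent content drop straight out of the
sequence and the cell. A sketch, using the usual rough constants of 110 Da
per residue and 1.23 in the Matthews relation:)

    # Sketch: solvent content from the Matthews coefficient,
    # V_M = V_cell / (MW * Z * n_mol); solvent fraction = 1 - 1.23 / V_M.
    def solvent_content(n_residues, cell_volume_A3, z_symops, n_mol_per_asu):
        mw = 110.0 * n_residues                               # Da, rough average
        vm = cell_volume_A3 / (mw * z_symops * n_mol_per_asu) # A^3 / Da
        return 1.0 - 1.23 / vm

    # e.g. 250 residues, 5.5e5 A^3 cell, 8 symmetry operators, 1 mol/ASU:
    # solvent_content(250, 5.5e5, 8, 1) -> about 0.51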

So, as my rant comes to an end, I would strongly suggest focusing on 
trying to capture the important things that we actually do know, rather 
than confusing our poor users further by asking them to write down a lot 
of things that they don't.


-James Holton
MAD Scientist




Re: [ccp4bb] database-assisted data archive

2010-08-18 Thread Berry, Ian
What about XTrack?
http://xray.bmc.uu.se/xtrack/






Re: [ccp4bb] Scaling up from an Intelliplate to Linbro Plate

2010-08-18 Thread Patrick Shaw Stewart
Hi Mo

 

What you need to remember is that a relatively large amount of protein
is lost from smaller drops.  The ratio of surface area to volume is
greater.  With 100 + 100 nl drops about half of the protein is lost,
either as skin on the drops or on the plastic of the plate.

 

So when you scale up you need to reduce the protein by about half.
(Another approach, suggested by Heather Ringrose, is to put extra
protein into the drops at the screening stage, e.g. 200 nl protein + 100
nl reservoir solution.  The hits found can usually be scaled up by
dispensing 1 + 1 microlitre drops.)

 

This is counterintuitive because people expect the small drops to dry
out more quickly - so they expect, if anything, to get more
precipitation in the small drops.  Instead they get precipitation when
they scale up, assuming they keep the ratio of protein to reservoir
constant.
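
(A quick way to see the surface-to-volume point, treating the drop as a
hemisphere; the numbers are purely illustrative:)

    # Sketch: surface-area-to-volume ratio of a hemispherical drop, to show
    # why a 100+100 nl drop loses proportionally more protein to skin and
    # plastic than a 1+1 ul drop.
    from math import pi

    def sa_to_vol(volume_nl):
        v = volume_nl * 1e-3                        # mm^3 (1 ul = 1 mm^3)
        r = (3.0 * v / (2.0 * pi)) ** (1.0 / 3.0)   # hemisphere radius, mm
        area = 3.0 * pi * r * r                     # curved surface + base
        return area / v                             # mm^-1

    print(sa_to_vol(200))    # ~9.8 mm^-1 for a 100+100 nl drop
    print(sa_to_vol(2000))   # ~4.6 mm^-1 for a 1+1 ul drop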

 

 

It can also help, when you scale up, to increase the salt by 50 to 100%
- this is indicated by data mining, but I'm not sure what the mechanism
is.

 

Hope that's helpful

 

Patrick

 

 

--

For information and discussion about protein crystallization and
automation, please join 

our bulletin board at
http://groups-beta.google.com/group/oryx_group?hl=en

 

 patr...@douglas.co.uk    Douglas Instruments Ltd.

 Douglas House, East Garston, Hungerford, Berkshire, RG17 7HD, UK

 Directors: Peter Baldock, Patrick Shaw Stewart

 http://www.douglas.co.uk/

 Tel: +44 (0) 148-864-9090    US toll-free 1-877-225-2034

 Regd. England 2177994, VAT Reg. GB 480 7371 36

 

From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Mo
Wong
Sent: 18 August 2010 16:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Scaling up from an Intelliplate to Linbro Plate

 

Hi all,

I know scaling up from a hit found from a high throughput screen is an
empirical process, but does anyone have a good rule of thumb as a
starting point when it comes to scaling up from a hit observed in an
Intelliplate to a Linbro plate (i.e., change in volume ratios, amount to
add to reservoir, etc)? I've Googled around but haven't seen anything
which either suggests I shouldn't be asking this question, I've not
looked hard enough, or it really is a case of try and see.

Thanks



[ccp4bb] Job opportunities at the Protein Data Bank in Europe

2010-08-18 Thread Gerard DVD Kleywegt
There are three more vacancies coming up at the Protein Data Bank in Europe 
(PDBe; pdbe.org):


- Head of PDBe Deposition and Annotation
  
http://ig14.i-grasp.com/fe/tpl_embl01.asp?s=MbkMjPUrEcTFkHhTcz&jobid=40182,2388233441

- 50% Oracle DBA/50% Senior Software Engineer
  
http://ig14.i-grasp.com/fe/tpl_embl01.asp?s=AdmOlRWtGeVHmJjVeb&jobid=40179,2133526947

- Scientific Programmer
  
http://ig14.i-grasp.com/fe/tpl_embl01.asp?s=PyAxDIfSqHTyVvHqn&jobid=40183,4715140288

For other job opportunities at the EBI, see 
http://www.embl.de/aboutus/jobs/searchjobs/index.php?searchregion=669


--Gerard

---
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
ger...@ebi.ac.uk . pdbe.org
Secretary: Pauline Haslam  pdbe_ad...@ebi.ac.uk


Re: [ccp4bb] Scaling up from an Intelliplate to Linbro Plate

2010-08-18 Thread Mo Wong
Thank you Patrick for your reply.

As a note to others who might be interested, I found a few comments about
scaling up interwoven in a long thread about which robot to buy that was
posted on this bb a few years ago. The most salient link is probably:

http://www.mail-archive.com/ccp4bb@jiscmail.ac.uk/msg04387.html

Also, I found that Patrick has written a more detailed description of what
should be of primary consideration during scale-up in the following post:

http://groups.google.com/group/oryx_group/browse_thread/thread/b04a2d7736d5974d?pli=1

Regards




[ccp4bb] PDBePISA assembly, summary and XML data files available from the Protein Data Bank in Europe

2010-08-18 Thread Gerard DVD Kleywegt
The Protein Data Bank in Europe (PDBe; http://pdbe.org/) is pleased to 
announce the availability of PISA assembly files, summaries and associated XML 
descriptors for all relevant entries in the Protein Data Bank (PDB) archive 
for download and in-house analysis.


PDBePISA (http://pdbe.org/pisa/) is an advanced interactive tool for the 
prediction of probable quaternary structures (assemblies), analysis of 
macromolecular interfaces and surfaces, database searches for similar 
interfaces and assemblies, and retrieval of results based on various search 
criteria. PDBePISA also allows the upload of your own structure in PDB or 
mmCIF format for interface analysis or quaternary structure prediction.


PDBePISA-predicted stable assembly files in PDB format are now available for 
download from the PDBe FTP area for all structures determined by diffraction 
methods. In addition, interface parameters (interface contacts, symmetry 
operators, hydrogen and disulphide bonds, salt bridges etc.) and assembly 
parameters (overall and buried surface areas, dissociation energies, contact 
interface characteristics) are available for every entry in XML format for 
further analysis. A summary index file is also provided in the FTP area 
containing a one-line summary for every stable assembly predicted by PDBePISA. 
This file provides an at-a-glance summary of the salient features of every 
stable assembly predicted by the program. This area is updated every Wednesday 
to coincide with the public update of the PDB archive.


DATA ACCESS
---

Please point your browser to:

 ftp://ftp.ebi.ac.uk/pub/databases/msd/pisa/

The file ftp://ftp.ebi.ac.uk/pub/databases/msd/pisa/index.txt contains the 
one-line summary for each assembly in column delimited format.


Individual entry data may be found in a path like this:

 ftp://ftp.ebi.ac.uk/pub/databases/msd/pisa/data/xx/1xxx/

where 'xx' are the second and third characters of the PDB id code and 1xxx is 
the actual PDB id code. For example, information for PDB entry 1cbr may be 
found under


 ftp://ftp.ebi.ac.uk/pub/databases/msd/pisa/data/cb/1cbr/

The files in this directory will have names like

- 1xxx.pdb.gz or 1xxx_n.pdb.gz for the assembly files (where 'n' is the 
assembly number when PDBePISA predicts multiple assemblies).


- 1xxx_assembly.xml.gz : The assembly description in XML.

- 1xxx_interface.xml.gz: The interface description in XML.
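
A small helper along the following lines (Python; purely illustrative) can
construct the download paths from a PDB id:

    # Sketch: build the PISA FTP paths for a given PDB id, following the
    # directory layout described above.
    def pisa_urls(pdb_id):
        pdb_id = pdb_id.lower()
        base = ("ftp://ftp.ebi.ac.uk/pub/databases/msd/pisa/data/%s/%s/"
                % (pdb_id[1:3], pdb_id))
        return {"assembly":  base + "%s_assembly.xml.gz" % pdb_id,
                "interface": base + "%s_interface.xml.gz" % pdb_id}

    # pisa_urls("1cbr")["assembly"]
    #   -> ftp://ftp.ebi.ac.uk/pub/databases/msd/pisa/data/cb/1cbr/1cbr_assembly.xml.gz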

As always, we welcome comments and suggestions on new features (preferably 
using the big, fat FEEDBACK button on the PDBe web pages).


--Gerard

---
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
ger...@ebi.ac.uk . pdbe.org
Secretary: Pauline Haslam  pdbe_ad...@ebi.ac.uk


Re: [ccp4bb] database-assisted data archive

2010-08-18 Thread Mark Brooks
Dear Andreas,
If you really want to do this, and want to define what the data are, it's
not _so_ difficult to do it yourself with Ruby on Rails
(http://rubyonrails.org/).

You have to know how to script a bit, and know a bit about
Model/View/Controller frameworks: http://www.youtube.com/watch?v=Gzj723LkRJY

That's not quite what you asked, but if you want to define what data are to
be input, you will end up being unhappy with someone else's implementation.

Mark





-- 
Skype: markabrooks


Re: [ccp4bb] Scaling up from an Intelliplate to Linbro Plate

2010-08-18 Thread Frank von Delft
I assume you're sure you even *need* to scale up? Most of our structures
come from crystals grown in small (150-300 nl) drops; we consider a 100 µm
crystal already huge. And if a smaller crystal doesn't diffract far enough
on a modern beamline, chances are a large one won't either (quite apart
from the trouble you'll have cryo-protecting it).

(And yes, of course there *are* a few cases where larger = better.)

phx


