[ccp4bb] AW: [ccp4bb] query regarding RMSD calculation

2013-06-24 Thread Herman . Schreuder
Dear Ansuman,

It is not entirely clear to me what kind of answer you are expecting. As Tim 
mentioned, from the B-factor formula, one can derive an estimate of the 
deviations of atoms from their average positions. This should give some idea of 
the inherent flexibility of the protein. From my experience, I would not consider 
RMS deviations of 0.2-0.3 Å between protein loops significant. However, a 
movement of 0.2 Å by an atom in the active site of an enzyme (e.g. with a 
transition state analog) could be highly significant, especially when backed up 
by positive and negative difference electron density peaks that appear when the 
atom is forced into its original position.
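
The B-factor relation mentioned above can be made concrete. What follows is only a minimal sketch, assuming the standard isotropic relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement along one direction. The function name is illustrative and not taken from any CCP4 program.

```python
import math

def rms_displacement_from_b(b_iso):
    """Return the rms displacement (Å) along one direction for an isotropic B-factor (Å^2).

    Assumes the standard relation B = 8 * pi^2 * <u^2>.
    """
    if b_iso < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_iso / (8.0 * math.pi ** 2))

# Print the estimate for a few illustrative B-factors.
for b in (10.0, 20.0, 40.0):
    print(f"B = {b:5.1f} Å^2  ->  rms displacement ~ {rms_displacement_from_b(b):.2f} Å")
```

Under this assumption, a B-factor of 20 Å² corresponds to an rms displacement of roughly 0.5 Å, which is larger than the 0.2-0.3 Å loop deviations discussed above.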

My 2 cts,
Herman

 

-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Tim 
Gruene
Sent: Sunday, 23 June 2013 19:54
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] query regarding RMSD calculation

Dear ansuman,

'rmsd' stands for 'root mean square deviation', i.e. in the context of your 
first question you must sum the _squares_ of the atomwise deviations, divide by 
the number of atoms, and then take the square root.
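
As a minimal sketch of this calculation: the per-atom deviations are squared, averaged over the number of atoms, and the square root is taken. The deviations are assumed to have been copied by hand from the superpose log. The function name and the numbers are hypothetical, and no particular log format is assumed.

```python
import math

def rmsd(deviations):
    """Root mean square of per-atom deviations (all in Å)."""
    if not deviations:
        raise ValueError("need at least one deviation")
    # Mean of the squared deviations, then the square root.
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

# Hypothetical per-atom distances (Å) for one matched residue pair.
example = [0.10, 0.25, 0.30, 0.15]
print(f"RMSD = {rmsd(example):.3f} Å")
```

Simply adding up the individual deviations would grow with the number of atoms, whereas the rmsd stays comparable between residues of different size.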

Question two: I disagree. Residues ARE static in this context: a 
crystallographic structure model corresponds to the average of all asymmetric 
units in the crystal, hence the resulting coordinates represent an average 
value and are therefore to be considered static.
The deviations from the average that we assume occur between asymmetric units 
are modelled by the (isotropic / anisotropic) ADP and the occupancy; 
nevertheless, the model that you use for calculating an rmsd is a static one. 
This may be the reason why there is no single answer to your question two, and 
why it may give rise to a longish discussion.

Best,
Tim

On 06/23/2013 02:50 PM, ansuman biswas wrote:
> Dear all, I have 2 queries regarding RMSD calculation and 
> interpretation.
> 
> 1. When 2 residue stretches are superposed using CCP4 superpose, the 
> log file shows the atomwise-deviations between matched residue pairs. 
> How to find the total deviation/RMSD between a specified residue pair 
> - should the RMSDs of the individual atoms be summed over, or is there 
> some other formula?
> 
> 2. Residues are not static; this dynamic nature is evident from 
> variations (for a given residue; by variations, I mean that the 
> residues don't superpose completely) when multiple chains of the same 
> structure (from multiple chains in AU of a PDB, or from different 
> PDBs) are superposed. My question is: is there an RMSD cut-off below 
> which the variation can be considered as thermal fluctuation, and 
> above which they can be said to be a different 'conformation' or 
> 'state'?
> 
> Thanking all, regards, ansuman
> 

- --
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A


[ccp4bb] AW: [ccp4bb] str solving problem

2013-06-24 Thread Herman . Schreuder
Dear Pramod,

To run XDS, you could try running it via the CCP4 procedure XIA2, or via 
autoPROC from Global Phasing. It may also help to retype the line:
NAME_TEMPLATE_OF_DATA_FRAMES=../images/WFTig1_???.mar2300  ! MAR345
I have had cases where there were (I believe) hidden characters in there which 
caused XDS to fail. I would also check that the images XDS complains about are 
really there and not empty. Your SPOT_RANGE line is commented out. I would 
remove the comment "!" and try different spot ranges, including all images. You 
may also want to play with STRONG_PIXEL, to find a value which picks up your major 
diffraction pattern and ignores the minor pattern. Sometimes it helps to 
specify the known(?) space group and cell dimensions; sometimes it is better to 
leave them out.
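
Two of the checks above (hidden characters in XDS.INP, and missing or empty images) can be scripted. This is only a sketch under stated assumptions: it is not part of XDS, both function names are hypothetical, and the single run of '?' in the template is assumed to stand for a zero-padded frame number.

```python
import os
import re

def find_hidden_characters(path):
    """Yield (line_number, repr_of_line) for lines with control or non-ASCII bytes.

    Tabs are tolerated; a carriage return (DOS line ending) is reported.
    """
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            body = raw.rstrip(b"\n")
            if any((byte < 0x20 and byte != 0x09) or byte > 0x7E for byte in body):
                yield number, repr(body)

def problem_frames(template, first, last):
    """Return the frame paths in [first, last] that are missing or empty."""
    match = re.search(r"\?+", template)
    if match is None:
        raise ValueError("template has no '?' placeholders")
    width = len(match.group())
    problems = []
    for n in range(first, last + 1):
        # Substitute the zero-padded frame number for the run of '?'.
        path = template[:match.start()] + str(n).zfill(width) + template[match.end():]
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append(path)
    return problems
```

One would call `find_hidden_characters("XDS.INP")` and then `problem_frames("../images/WFTig1_???.mar2300", first, last)`, with `first` and `last` taken from the data range in one's own XDS.INP. Any path reported is a frame that cannot be read as expected.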

XDS output files are normally not multi-megabyte, so you could also consider 
posting the relevant output files (INIT.LP, COLSPOT.LP) to the bulletin board.

Best,
Herman

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Pramod 
Kumar
Sent: Friday, 21 June 2013 22:59
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] str solving problem

Dear ...

Francis

Last I remember, HKL2000 bases its indexing on the 'strongest' spots on an 
image (though you could manually select spots). It could result in a misindex 
if the strongest spots come from separate lattices..

I have used both HKL2000 and mosflm giving the same results (although I have 
used manual selection of spots as a trial but results are identical).

Try a program that uses all spots for indexing, across all images (XDS for 
example) and you might get the true space group..

I have made several attempts with XDS, but it gives the error "data image of 
particular no. does not exist" (initially it reported the 11th image; after I 
changed the image range it reported the 21st, and so on). Kindly check my data 
collection profile and the XDS.INP file in the attachment.

Or if the crystal is big enough, you could try shooting it in different areas 
and 'searching' for a better spot to collect data.
Or 'grow a better crystal'.

I am growing more crystals, and the struggle is at its peak...


Dear Eugene

Please find attached the scale log file and the scaling table from Mosflm.

When you index spots in Mosflm, do your predictions agree with the spots?

Please see the snapshot of the predicted spots.



Dear Eleanor
Yes, both molecules are visible in the ASU.



Dear Pozharski

Balbes pipeline hitting extremely high marks when fed into Phaser while being 
complete nonsense (it's a 150kDa multi-domain protein and resulting domain 
arrangement made absolutely no sense).  Refinement was stuck with high R-values 
and I sadly gave up on it for now.  I suspected that refmac step included in 
the pipeline artificially shifts the model so that it conforms to Patterson map 
better, which results in high score in Phaser.

My domain arrangement is as expected, two molecules in ASU.


thanks and regards

pramod









On Thu, Jun 20, 2013 at 3:50 PM, Eleanor Dodson <eleanor.dod...@york.ac.uk> wrote:
As others say - the Rfactors look pretty good for MR, mine usually start over 
50% even with a better model and one hopes they then decrease..
But you say you took the Balbes model into phaser? and I think Balbes 
automatically runs cycles of refinement so any comment on R factors may not 
mean much.

Have you found both molecules in the asymmetric unit? You only give LLG for one?
Eleanor




On 19 June 2013 17:44, Eugene Valkov <eugene.val...@gmail.com> wrote:
Yes, I would agree with Francis that diffraction shows contribution from 
several lattices, which could lead to misindexing. However, it should be 
feasible to get a model that refines from this sort of data.

Pramod - could you please post your data processing statistics from your 
scaling program? Better if you have several for different spacegroups.

Also, I have no idea how HKL2000 does this, but could you please provide an 
indexing solution table from Mosflm that shows penalties associated with each 
type of space group? Was there a sharp penalty drop at some point or was it 
more gradual?

When you index spots in Mosflm, do your predictions agree with the spots? Or is 
there a substantial portion that are missed?

I would consider altering thresholds in Mosflm for indexing (see the manual).

Eugene




On 19 June 2013 17:34, Francis E. Reyes <francis.re...@colorado.edu> wrote:
On Jun 17, 2013, at 12:36 PM, Pramod Kumar <pramod...@gmail.com> wrote:

>> I have a crystal data diffracted  around 2.9 A*,
>> during the data reduction HKL2000 not convincingly showed the space group 
>> (indexed in lower symmetry p1), while the mosflm given C-centered 
>> Orthorhombic, and again with little play around HKL2000 given CO
>



> no ice ring is appeared, diffraction pattern looks ok, misindexing in any 
> direction is not conclusive to me (plz see the imj attachment)

The diffraction does not look 

Re: [ccp4bb] AW: [ccp4bb] str solving problem

2013-06-24 Thread Manfred S. Weiss

There is also XDSAPP, a very convenient GUI for XDS with a fairly high
degree of automation. Try it out, you will like it ...

http://www.helmholtz-berlin.de/forschung/funkma/soft-matter/forschung/bessy-mx/xdsapp/index_en.html

Cheers,

Manfred


Re: [ccp4bb] AW: [ccp4bb] query regarding RMSD calculation

2013-06-24 Thread Frank von Delft
Dear Ansuman - I suspect Escet is what you're after: 
http://webapps.embl-hamburg.de/escet/


It factors coordinate accuracy into comparisons. Check out the 
references for an explanation.





[ccp4bb] Postdoctoral Position at Univ. Mass Medical School

2013-06-24 Thread Brian Kelch
POSTDOCTORAL POSITION

KELCH LAB

UNIVERSITY OF MASSACHUSETTS, MEDICAL SCHOOL

Crystallographic studies of DNA replication machines



A postdoctoral position to understand the mechanism and structures of DNA
replication machines is available in the lab of Professor Brian Kelch at
the University of Massachusetts, Medical School in Worcester, MA. The Kelch
Lab merges structural, biochemical, biophysical and computational
approaches to dissect the mechanism of multi-protein assemblies involved in
DNA replication and DNA metabolism. More information is available at the
lab website:

http://labs.umassmed.edu/kelchlab/Lab_Website/Welcome.html



The position offers ideal opportunities for experienced crystallographers
who are interested in continuing structural studies but would like to
complement their expertise with other, diverse tools for understanding
the molecular basis for DNA replication and general ATPase mechanisms. The
fellow will benefit from both the multidisciplinary environment in the lab
and the highly collaborative UMMS community. The lab has extensive
crystallographic resources, including (as part of the UMMS crystallography
group) a new MicroMax-007-HF X-ray generator, a Saturn 944 HG CCD detector
and a Phenix crystallization robot.



Candidates should have (or expect) a Ph.D. or M.D. degree. Candidates with
experience in protein purification, crystallization, and structure
determination are strongly preferred. Applications will be reviewed as they
are received; start date is preferred by Fall 2013, but is negotiable.



Interested individuals should send a single PDF file containing their CV
along with a summary of previous research experience, accomplishments, and
expertise to Prof. Brian Kelch at:

brian.ke...@umassmed.edu



Worcester is located less than 40 miles from Boston/Cambridge, and the area
has multiple career development opportunities for those interested in
careers in academia or industry. Moreover, the Worcester area is family
oriented, with numerous outdoor activities, some of the best public schools
in the nation, and a low cost of living.


Re: [ccp4bb] Alternating positive and negative density

2013-06-24 Thread Tim Gruene
Hello Peter,

have you tried removing a few of the residues involved and (after a round
of refinement) rebuilding the area from scratch (a simple way to remove
model bias)?
How did you decide about the resolution cut-off of your data sets -
could it be that there is noise and that it might be worth cutting a
little more during integration? Although I agree the difference map
seems to have systematic features that do not quite support this
suggestion.

A side note: you are most likely not looking at 2Fo-Fc and Fo-Fc maps,
but at sigma-A weighted maps and sigma-A weighted difference maps. I
think it is worth differentiating between these terms.
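
For reference, a sketch of that distinction, assuming the usual sigma-A formulation for acentric reflections. Here m is the figure of merit, D the scale factor that accounts for errors in the model, and φ_c the phase calculated from the model. The plain and sigma-A weighted coefficients are:

```latex
\begin{align}
\text{plain } 2F_o - F_c: \quad & \left(2|F_o| - |F_c|\right) e^{i\varphi_c}
  & \text{sigma-A weighted: } \quad & \left(2m|F_o| - D|F_c|\right) e^{i\varphi_c} \\
\text{plain } F_o - F_c: \quad & \left(|F_o| - |F_c|\right) e^{i\varphi_c}
  & \text{sigma-A weighted: } \quad & \left(m|F_o| - D|F_c|\right) e^{i\varphi_c}
\end{align}
```

The weights m and D down-weight poorly phased reflections and compensate for model error, which is how these maps reduce model bias.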

Second note: thumbs up for the small size of your attachments. I think
this is a good compromise between something quite impossible to
describe and people who do not like large attachments.

Regards,
Tim

On 06/24/2013 06:57 AM, Peter Randolph wrote:
> Short version: Hi, I'm working on what should be a straightforward
> molecular replacement problem (already solved protein in new space
> group), but my Fo-Fc map contains a peculiar series of alternating
> positive and negative peaks of difference density. I'm wondering if
> anyone has seen this before? Sample images are attached and
> more background is below.
> 
> More background: I had initially solved an *apo* structure of my
> protein (from previous diffraction data in another crystal form),
> and more recently collected diffraction data for crystals of the
> protein co-crystallized with potential binding partners (small
> RNAs). All the datasets I've processed so far have the same
> spacegroup (P2(1)2(1)2(1)) and cell dimensions as the apo 
> structure.
> 
> I have tried two general approaches, both with the same initial
> steps of indexing / integrating / scaling in XDS, converting to MTZ
> format without R-free flags, then importing R-free-flags from the
> (previous) apo structure's MTZ.  I would then run "phenix.refine"
> for initial rigid-body refinement using the apo-model and the new
> mtz to see if there were signs of any new positive density
> corresponding to bound ligands. While the 2Fo-Fc map fits the apo
> protein 3D model perfectly, the Fo-Fc map shows bands of
> alternating positive and negative density running throughout the 
> structure.  What's odd is that these 'bands' appear to be
> systematic rather than random (please see attached image), and
> aren't located anywhere that a binding partner could bind, leading
> me to suspect they may be artefactual (these bands actually run
> through the body of the protein, so one possibility is that the
> b-strands are off-register by a multiple of a peptide unit?). If I
> use the same mtz file and structural model, and instead do
> molecular replacement with phaser, I see the same issue.  I've 
> tried this workflow with a couple of datasets and using P222 as
> well as P2(1)2(1)2(1), and each time I see the same issue of
> spurious(?) bands. Any help or advice would be much appreciated,
> especially if anyone has seen anything like this?
> 
> Thanks a lot, Peter Randolph
> 

- -- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A


Re: [ccp4bb] Refinement against frames

2013-06-24 Thread Tim Gruene
Dear John,

actually I am not a friend of this idea. Processing software does an
excellent job of removing the instrumental part from our data. If we
start to refine against frames, the next structural title might be
something like "Crystal structure of ABC at x Å resolution measured at
beamline xyz with a frame width of f degrees and a total rotation
range of phi degrees...". The point I am trying to make: once we are
refining against frames, one may have to take a lot of issues into
account when interpreting the structure.
And do you think that refining against frames will actually give
greater chemical or biological insight into the sample, or will it
only give a more accurate description of the crystal contents? These
are two different things and the latter is - in my opinion - not what
structures are about.

Best, Tim

P.S.: I changed the subject line, because the thread-based sorting of
my emails is soon going to exceed the width of my screen for the
original one.

On 06/24/2013 08:13 AM, Jrh wrote:
> Dear Tom, I find this suggestion of using the full images an
> excellent and visionary one. So, how to implement it? We are part
> way along the path with James Holton's reverse Mosflm. The computer
> memory challenge could be ameliorated by simple pixel averaging at
> least initially. The diffuse scattering would be the ultimate gold
> at the end of the rainbow. Peter Moore's new book, inter alia,
> carries many splendid insights into the diffuse scattering in our
> diffraction patterns. Fullprof analyses have become a firm trend in
> other fields, admittedly with simpler computing overheads. 
> Greetings, John
> 
> Prof John R Helliwell DSc FInstP
> 
> 
> 
> On 21 Jun 2013, at 23:16, "Terwilliger, Thomas C"
>  wrote:
> 
>> I hope I am not duplicating too much of this fascinating
>> discussion with these comments:  perhaps the main reason there is
>> confusion about what to do is that neither F nor I is really the
>> most suitable thing to use in refinement.  As pointed out several
>> times in different ways, we don't measure F or I, we only measure
>> counts on a detector.  As a convenience, we "process" our
>> diffraction images to estimate I or F and their uncertainties and
>> model these uncertainties as simple functions (e.g., a Gaussian).
>> There is no need in principle to do that, and if we were to
>> refine instead against the raw image data these issues about
>> positivity would disappear and our structures might even be a
>> little better.
>> 
>> Our standard procedure is to estimate F or I from counts on the
>> detector, then to use these estimates of F or I in refinement.
>> This is not so easy to do right because F or I contain many terms
>> coming from many pixels and it is hard to model their statistics
>> in detail.  Further, attempts we make to estimate either F or I
>> as physically plausible values (e.g., using the fact that they
>> are not negative) will generally be biased (the values after
>> correction will generally be systematically low or systematically
>> high, as is true for the French and Wilson correction and as
>> would be true for the truncation of I at zero or above).
>> 
>> Randy's method for intensity refinement is an improvement because
>> the statistics are treated more fully than just using an estimate
>> of F or I and assuming its uncertainty has a simple distribution.
>> So why not avoid all the problems with modeling the statistics of
>> processed data and instead refine against the raw data.  From the
>> structural model you calculate F, from F and a detailed model of
>> the experiment (the same model that is currently used in data
>> processing) you calculate the counts expected on each pixel. Then
>> you calculate the likelihood of the data given your models of the
>> structure and of the experiment.  This would have lots of
>> benefits because it would allow improved descriptions of the
>> experiment (decay, absorption, detector sensitivity, diffuse
>> scattering and other "background" on the images, and so on)
>> that could lead to more accurate structures in the end.  Of
>> course there are some minor issues about putting all this in
>> computer memory for refinement
>> 
>> -Tom T
>> 
>> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Phil [p...@mrc-lmb.cam.ac.uk]
>> Sent: Friday, June 21, 2013 2:50 PM
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: Re: [ccp4bb] ctruncate bug?
>> 
>> However you decide to argue the point, you must consider _all_
>> the observations of a reflection (replicates and symmetry
>> related) together when you infer Itrue or F etc, otherwise you
>> will bias the result even more. Thus you cannot (easily) do it
>> during integration
>> 
>> Phil
>> 
>> Sent from my iPad
>> 
>> On 21 Jun 2013, at 20:30, Douglas Theobald
>>  wrote:
>> 
>>> On Jun 21, 2013, at 2:48 PM, Ed Pozharski
>>>  wrote:
>>> 
 Douglas,
>> Observed intensities are th

Re: [ccp4bb] Refinement against frames

2013-06-24 Thread Boaz Shaanan
Hi Tim,

I agree with you. Another point to remember about this issue of pixel->F's 
(or I's) conversion is that small molecule crystallographers take the same 
route and produce structures with 1-2% R-factors, so this conversion is hardly 
our problem. The main culprit in the issues that have been discussed so lucidly 
on the BB recently has mostly to do with the vast number of weak reflections 
in diffraction patterns of macromolecules (and how to decide on resolution in 
such situations). Digging into the peak/background pixels and signal/noise 
ratio there is just going to open another Pandora's box. 

My 2p thoughts.

 Cheers,

 Boaz 


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710







Re: [ccp4bb] Refinement against frames

2013-06-24 Thread Mark J van Raaij
Hi Tim,
I don't follow your point...frames are just data, and with more information 
than after integration. The data after integration is also to some extent 
dependent on the beamline.
It should indeed give a more accurate description of the crystal contents - 
whether that in turn will translate into greater chemical or biological insight 
(now or some time in the future) will depend on the specific case (and on the 
interpreter).
Mark
Mark J van Raaij
Lab 20B
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij





On 24 Jun 2013, at 13:59, Tim Gruene wrote:

> Dear John,
> 
> actually I am not a friend of this idea. Processing software does an
> excellent job of removing the instrumental part from our data. If we
> start to refine against frames, the next structural title might be
> something like "Crystal structure of ABC at x Å resolution measured at
> beamline xyz with a frame width of f degrees and a total rotation
> range of phi degrees...". The point I am trying to make: once we
> refine against frames, one may have to take a lot of issues into
> account when interpreting the structure.
> And do you think that refining against frames will actually give
> greater chemical or biological insight into the sample, or will it
> only give a more accurate description of the crystal contents? These
> are two different things and the latter is - in my opinion - not what
> structures are about.
> 
> Best, Tim
> 
> P.S.: I changed the subject line, because the thread based sorting of
> my emails is soon going to exceed the width of my screen for the
> original one.
> 
> On 06/24/2013 08:13 AM, Jrh wrote:
>> Dear Tom, I find this suggestion of using the full images an
>> excellent and visionary one. So, how to implement it? We are part
>> way along the path with James Holton's reverse Mosflm. The computer
>> memory challenge could be ameliorated by simple pixel averaging at
>> least initially. The diffuse scattering would be the ultimate gold
>> at the end of the rainbow. Peter Moore's new book, inter alia,
>> carries many splendid insights into the diffuse scattering in our
>> diffraction patterns. Fullprof analyses have become a firm trend in
>> other fields, admittedly with simpler computing overheads. 
>> Greetings, John
>> 
>> Prof John R Helliwell DSc FInstP
>> 
>> 
>> 
>> On 21 Jun 2013, at 23:16, "Terwilliger, Thomas C"
>>  wrote:
>> 
>>> I hope I am not duplicating too much of this fascinating
>>> discussion with these comments:  perhaps the main reason there is
>>> confusion about what to do is that neither F nor I is really the
>>> most suitable thing to use in refinement.  As pointed out several
>>> times in different ways, we don't measure F or I, we only measure
>>> counts on a detector.  As a convenience, we "process" our
>>> diffraction images to estimate I or F and their uncertainties and
>>> model these uncertainties as simple functions (e.g., a Gaussian).
>>> There is no need in principle to do that, and if we were to
>>> refine instead against the raw image data these issues about
>>> positivity would disappear and our structures might even be a
>>> little better.
>>> 
>>> Our standard procedure is to estimate F or I from counts on the
>>> detector, then to use these estimates of F or I in refinement.
>>> This is not so easy to do right because F or I contain many terms
>>> coming from many pixels and it is hard to model their statistics
>>> in detail.  Further, attempts we make to estimate either F or I
>>> as physically plausible values (e.g., using the fact that they
>>> are not negative) will generally be biased (the values after
>>> correction will generally be systematically low or systematically
>>> high, as is true for the French and Wilson correction and as
>>> would be true for the truncation of I at zero or above).
>>> 
>>> Randy's method for intensity refinement is an improvement because
>>> the statistics are treated more fully than just using an estimate
>>> of F or I and assuming its uncertainty has a simple distribution.
>>> So why not avoid all the problems with modeling the statistics of
>>> processed data and instead refine against the raw data.  From the
>>> structural model you calculate F, from F and a detailed model of
>>> the experiment (the same model that is currently used in data
>>> processing) you calculate the counts expected on each pixel. Then
>>> you calculate the likelihood of the data given your models of the
>>> structure and of the experiment.  This would have lots of
>>> benefits because it would allow improved descriptions of the
>>> experiment (decay, absorption, detector sensitivity, diffuse
>>> scattering and other "background" on the images, on and on)
>>> that could lead to more accurate structures in the end.  Of
>>> course there are some minor issues about putting all this 

[ccp4bb] High Rwork/Rfree values

2013-06-24 Thread Haiying
Hi, all

I have encountered high Rwork/Rfree values. Here is the story:

The glycosylated native protein structure was solved in P212121 with unit 
cell parameters of 72.6, 78.0, and 112.5 Å, and all the solution statistics are 
perfectly fine. I am trying to crystallize and solve the structure of its 
deglycosylated version. The deglycosylated protein crystallizes in the same 
morphology as the glycosylated one, so I expected both to share the same space 
group with relatively similar unit cell parameters. Surprisingly, the 
deglycosylated one has unit cell parameters of 66.5, 70.5, and 137.0 Å (P21212 
space group).

These deglycosylated crystals diffract weakly, but to 2.2 Å with a longer 
exposure time. At certain angles the diffraction spots are streaky, although at 
most angles they are OK. I have processed the data in HKL2000, iMosflm, 
and XDS, which all suggested the P21212 space group (66.5, 70.5, and 137.0 Å). 
Phaser suggests a solution in P22121, so REINDEX was used to transform 
P21212 to P22121. However, after the first round of Refmac refinement, the 
Rwork/Rfree values are high (0.35/0.41) and can't be reduced further. The 
fit to the electron density map looks OK. I suspect that the crystal 
quality and the resulting processing statistics are the reason for the observed 
high Rwork/Rfree values. Are there any suggestions?

I have noticed that the unit cell volumes of the glycosylated (637065 Å³) and 
deglycosylated (642290 Å³) crystals are very similar. Is there a way I can 
transform one to the other? Thank you.
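
As a quick check of the volumes quoted above: for an orthorhombic cell all angles are 90 degrees, so V = a * b * c. The minimal Python sketch below uses only the cell edges given in this message; the function and variable names are illustrative. Note that near-equal volumes alone do not show that the two cells are related.

```python
# Orthorhombic cell: all angles are 90 degrees, so V = a * b * c.
def orthorhombic_volume(a, b, c):
    """Return the unit cell volume in cubic angstroms."""
    return a * b * c

glyco = (72.6, 78.0, 112.5)     # P212121 cell from the message
deglyco = (66.5, 70.5, 137.0)   # P21212 cell from the message

v_glyco = orthorhombic_volume(*glyco)
v_deglyco = orthorhombic_volume(*deglyco)
ratio = v_deglyco / v_glyco

print(f"glycosylated:   {v_glyco:.0f} A^3")
print(f"deglycosylated: {v_deglyco:.0f} A^3")
print(f"ratio: {ratio:.4f}")
```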

Best, 
Haiying



[ccp4bb] Gnuplot: how to plot with resolution values as labels on x-axis?

2013-06-24 Thread Kay Diederichs

Dear Gnuplot users,

you all know the crystallographic tables which have a column of 
resolution values, and columns of crystallographic indicators (R, 
I/sigma, ... whatever).
Assuming that I want to plot the indicator in column 2 as a function of 
resolution, I can simply say

> plot 'table.dat' us 2
but the problem is now that I would like to have the resolution values 
as labels, so instead of 0 1 2 3 4 5 ... I would like to have 30.6 5.72 
3.90 3.17 2.64 ... or so.
Furthermore these labels might be fairly wide, so I would like to rotate 
them, by (say) 30° or even 90°.
In the past, I seem to remember that I have manually positioned the 
labels, as individual text strings. This can be done for a single plot 
... but then again, we live in the 3rd millennium and there must be a 
better way.
Can Gnuplot take the labels from the file and put them into the right 
place? Could anyone please share the Gnuplot magic for doing so?


thanks,

Kay
--
Kay Diederichs    http://strucbio.biologie.uni-konstanz.de
email: kay.diederi...@uni-konstanz.de    Tel +49 7531 88 4049 Fax 3183
Fachbereich Biologie, Universität Konstanz, Box M647, D-78457 Konstanz

This e-mail is digitally signed. If your e-mail client does not have the
necessary capabilities, just ignore the attached signature "smime.p7s".





[ccp4bb] High Rwork/Rfree values

2013-06-24 Thread Haiying Bie
Hi, all

I have encountered high Rwork/Rfree values. Here is the story:

The glycosylated native protein structure was solved in P212121
with unit cell parameters of 72.6, 78.0, and 112.5 Å, and all the
solution statistics are perfectly fine. I am trying to crystallize and
solve the structure of its deglycosylated version. The deglycosylated
protein crystallizes in the same morphology as the glycosylated one, so
I expected both to share the same space group with relatively similar
unit cell parameters. Surprisingly, the deglycosylated one has
unit cell parameters of 66.5, 70.5, and 137.0 Å (P21212 space group).

These deglycosylated crystals diffract weakly, but to 2.2 Å with a
longer exposure time. At certain angles the diffraction spots are
streaky, although at most angles they are OK. I have processed
the data in HKL2000, iMosflm, and XDS, which all suggested the P21212
space group (66.5, 70.5, and 137.0 Å). Phaser suggests a solution
in P22121, so REINDEX was used to transform P21212 to P22121.
However, after the first round of Refmac refinement, the Rwork/Rfree
values are high (0.35/0.41) and can't be reduced further. The fit
to the electron density map looks OK. I suspect that the crystal
quality and the resulting processing statistics are the reason for the
observed high Rwork/Rfree values. Are there any suggestions?

I have noticed that the unit cell volumes of the glycosylated (637065
Å³) and deglycosylated (642290 Å³) crystals are very similar. Is there
a way I can transform one to the other? Thank you.
Best,
Haiying


Re: [ccp4bb] Gnuplot: how to plot with resolution values as labels on x-axis?

2013-06-24 Thread Kumar, Abhinav
Hi Kay,
This may be helpful, assuming your data file has resolution in column 3.

set xtics rotate by -30
plot 'file' u 1:2:xticlabels(3)

Make sure you have a recent version of gnuplot.
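
A minimal sketch of how the two commands above fit Kay's case, assuming (this is not stated in the thread) that the resolution is in column 1 and the indicator in column 2 of table.dat. The Python helper below only assembles the gnuplot commands as text; its name and arguments are hypothetical. Using pseudo-column 0 (the row number) as x keeps the points evenly spaced, as with plot 'table.dat' us 2.

```python
def gnuplot_script(datafile, value_col, label_col, angle=-30):
    """Build gnuplot commands that plot one column against row order,
    taking the x tick labels from another column of the same file."""
    return "\n".join([
        # Rotate the tick labels so that wide resolution values do not overlap.
        f"set xtics rotate by {angle}",
        # Pseudo-column 0 is the row number; xticlabels() reads the label text.
        f"plot '{datafile}' using 0:{value_col}:xticlabels({label_col}) with linespoints",
    ])

# Assumed layout: resolution in column 1, indicator in column 2.
script = gnuplot_script("table.dat", value_col=2, label_col=1)
print(script)
```

The printed lines can be pasted into a gnuplot session or saved to a script file.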

Thanks,
Abhinav

JCSG@SSRL, SLAC
(650) 926-2992





On Jun 24, 2013, at 7:54 AM, Kay Diederichs wrote:

Dear Gnuplot users,

you all know the crystallographic tables which have a column of resolution 
values, and columns of crystallographic indicators (R, I/sigma, ... whatever).
Assuming that I want to plot the indicator in column 2 as a function of 
resolution, I can simply say
> plot 'table.dat' us 2
but the problem is now that I would like to have the resolution values as 
labels, so instead of 0 1 2 3 4 5 ... I would like to have 30.6 5.72 3.90 3.17 
2.64 ... or so.
Furthermore these labels might be fairly wide, so I would like to rotate them, 
by (say) 30° or even 90°.
In the past, I seem to remember that I have manually positioned the labels, as 
individual text strings. This can be done for a single plot ... but then again, 
we live in the 3rd millennium and there must be a better way.
Can Gnuplot take the labels from the file and put them into the right place? 
Could anyone please share the Gnuplot magic for doing so?

thanks,

Kay
--
Kay Diederichs    http://strucbio.biologie.uni-konstanz.de
email: kay.diederi...@uni-konstanz.de
Tel +49 7531 88 4049 Fax 3183
Fachbereich Biologie, Universität Konstanz, Box M647, D-78457 Konstanz

This e-mail is digitally signed. If your e-mail client does not have the
necessary capabilities, just ignore the attached signature "smime.p7s".


Re: [ccp4bb] High Rwork/Rfree values

2013-06-24 Thread Phil Jeffrey

Haiying,

As far as I can tell you've got a successful solution in molecular 
replacement via Phaser and then gone and refined it in the wrong space 
group.


Based on what you've told us:  you took your initial data in primitive 
orthorhombic and solved for the structure in Phaser while sampling all 
possible space groups.  Phaser is telling you that your *original* data 
indexing is truly space group P22(1)2(1) and if you take that m.r. 
solution/data combination and simply *assign* the space group it should 
work in Refmac.  In fact Phaser should have written the correct space 
group in the PDB file header.


If you refine your original MTZ native data file with the PDB file 
Phaser wrote, what do you get?


You seem to have reindexed the data but not rotated the model (or re-run 
molecular replacement).  That makes the model and data out-of-sync. 
Phaser does not reindex the data internally, and that's why it tries 
eight space groups in primitive orthorhombic rather than just the 
minimal set P222, P222(1), P2(1)2(1)2, P2(1)2(1)2(1).  The others that 
it tries are alternative settings of these space groups (where appropriate).


If you want to refine in P2(1)2(1)2 then reindex the data (h,k,l) -> 
(k,l,h) and re-run molecular replacement with the reindexed MTZ file.
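
To make the reindexing above concrete, here is a small Python sketch of what (h,k,l) -> (k,l,h) does to the cell quoted in the original message. It assumes the operator is read as "new h = old k, new k = old l, new l = old h" and that the cell edges permute the same way; the function names are illustrative and not part of any CCP4 program.

```python
def reindex_klh(hkl):
    """Apply (h,k,l) -> (k,l,h): the new h is the old k, and so on."""
    h, k, l = hkl
    return (k, l, h)

def permute_cell_klh(axes):
    """Permute per-axis values the same way, so each axis keeps its index."""
    a, b, c = axes
    return (b, c, a)

# Cell from the original message, in the setting Phaser reported as P22(1)2(1):
# the pure twofold axis lies along a, the screw axes along b and c.
cell_p22121 = (66.5, 70.5, 137.0)
screw_p22121 = (False, True, True)

cell_p21212 = permute_cell_klh(cell_p22121)
screw_p21212 = permute_cell_klh(screw_p22121)

print(cell_p21212)            # cell edges after reindexing
print(screw_p21212)           # which axes are screw axes after reindexing
print(reindex_klh((1, 2, 3)))
```

With the pure twofold axis moved from a to c, the screw axes lie along a and b, which is the standard P2(1)2(1)2 setting.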



If the above is a misinterpretation of what you wrote, my alternative 
advice on this is:


1.  throw the thing at Arp/wArp and look hard at the maps you get out. 
The structure might have changed more than you thought.
2.  rescale the data in P1 and put it into Pointless and/or Xtriage to 
check for twinning and point group assignment
3.  I'm fairly sure that the (72.6, 78.0, 112.5) and (66.5, 70.5, 137.0) 
cells are unrelated but #2 will show that.
4.  If all else fails solve it in P1 and find the space group "by 
inspection" afterwards


Phil Jeffrey
Princeton


Re: [ccp4bb] ctruncate bug?

2013-06-24 Thread Terwilliger, Thomas C
Implementing refinement against images will be pretty challenging.  As far as I 
know the problem isn't in saying what has to happen, but rather in the enormous 
amount of bookkeeping necessary to relate a model of a structure and a model of 
the entire experiment (including such details as parameters defining spot 
shape, absorption etc) to a very long list of counts on pixels...and to 
calculate derivatives so as to optimize likelihood.   As you suggest, there 
could be payoff in modeling diffuse scattering.  Also I imagine that the 
structure factors could be estimated more accurately by refining against the 
raw images.  
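
The bookkeeping described above can be illustrated on a toy scale. The Python sketch below assumes independent Poisson statistics for the pixel counts and a single spot whose expected counts are scale * I * profile + background; both are simplifying assumptions, and all names and numbers are made up for illustration. A real implementation would compute I from the structural model and sum over every pixel of every image.

```python
import math

def expected_counts(intensity, profile, background, scale=1.0):
    """Expected counts per pixel: scaled spot profile plus background."""
    return [scale * intensity * p + b for p, b in zip(profile, background)]

def poisson_log_likelihood(observed, expected):
    """Sum of log P(n | mu) over pixels for independent Poisson counts."""
    return sum(n * math.log(mu) - mu - math.lgamma(n + 1)
               for n, mu in zip(observed, expected))

def d_loglik_d_intensity(observed, intensity, profile, background, scale=1.0):
    """Derivative of the log-likelihood with respect to the intensity."""
    mu = expected_counts(intensity, profile, background, scale)
    return sum((n / m - 1.0) * scale * p
               for n, m, p in zip(observed, mu, profile))

# Toy data: a five-pixel spot on a flat background (all numbers illustrative).
profile = [0.05, 0.25, 0.40, 0.25, 0.05]   # fraction of the spot on each pixel
background = [10.0] * 5
observed = [11, 15, 18, 14, 10]

for trial in (0.0, 10.0, 20.0, 40.0):
    mu = expected_counts(trial, profile, background)
    ll = poisson_log_likelihood(observed, mu)
    grad = d_loglik_d_intensity(observed, trial, profile, background)
    print(f"I = {trial:5.1f}  logL = {ll:8.3f}  dlogL/dI = {grad:+.4f}")
```

The sign change of the derivative between the trial values brackets the most likely intensity; an optimizer would follow this derivative (and those for the other model parameters) to the maximum.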

One question will be whether all this would make a lot of difference with 
today's models. My guess is it won't make a substantial difference in most 
cases because our biggest problem is the inadequacy of these models and not 
deficiencies in our analysis of the data. However there might be some cases 
where it could help.  The bigger question is whether it will make a difference 
in the future when we have more advanced models that have the potential to 
explain the data better. I think that yes, at that point all the effort will be 
worth it.

Tom T

From: Jrh [jrhelliw...@gmail.com]
Sent: Monday, June 24, 2013 12:13 AM
To: Terwilliger, Thomas C
Cc: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] ctruncate bug?

Dear Tom,
I find this suggestion of using the full images an excellent and visionary one.
So, how to implement it?
We are part way along the path with James Holton's reverse Mosflm.
The computer memory challenge could be ameliorated by simple pixel averaging at 
least initially.
The diffuse scattering would be the ultimate gold at the end of the rainbow. 
Peter Moore's new book, inter alia, carries many splendid insights into the 
diffuse scattering in our diffraction patterns.
Fullprof analyses have become a firm trend in other fields, admittedly with 
simpler computing overheads.
Greetings,
John

Prof John R Helliwell DSc FInstP



On 21 Jun 2013, at 23:16, "Terwilliger, Thomas C"  wrote:

> I hope I am not duplicating too much of this fascinating discussion with 
> these comments:  perhaps the main reason there is confusion about what to do 
> is that neither F nor I is really the most suitable thing to use in 
> refinement.  As pointed out several times in different ways, we don't measure 
> F or I, we only measure counts on a detector.  As a convenience, we "process" 
> our diffraction images to estimate I or F and their uncertainties and model 
> these uncertainties as simple functions (e.g., a Gaussian).  There is no need 
> in principle to do that, and if we were to refine instead against the raw 
> image data these issues about positivity would disappear and our structures 
> might even be a little better.
>
> Our standard procedure is to estimate F or I from counts on the detector, 
> then to use these estimates of F or I in refinement.  This is not so easy to 
> do right because F or I contain many terms coming from many pixels and it is 
> hard to model their statistics in detail.  Further, attempts we make to 
> estimate either F or I as physically plausible values (e.g., using the fact 
> that they are not negative) will generally be biased (the values after 
> correction will generally be systematically low or systematically high, as is 
> true for the French and Wilson correction and as would be true for the 
> truncation of I at zero or above).
>
> Randy's method for intensity refinement is an improvement because the 
> statistics are treated more fully than just using an estimate of F or I and 
> assuming its uncertainty has a simple distribution.  So why not avoid all the 
> problems with modeling the statistics of processed data and instead refine 
> against the raw data.  From the structural model you calculate F, from F and 
> a detailed model of the experiment (the same model that is currently used in 
> data processing) you calculate the counts expected on each pixel. Then you 
> calculate the likelihood of the data given your models of the structure and 
> of the experiment.  This would have lots of benefits because it would allow 
> improved descriptions of the experiment (decay, absorption, detector 
> sensitivity, diffuse scattering and other "background" on the images, on 
> and on) that could lead to more accurate structures in the end.  Of course 
> there are some minor issues about putting all this in computer memory for 
> refinement
>
> -Tom T
> 
> From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Phil 
> [p...@mrc-lmb.cam.ac.uk]
> Sent: Friday, June 21, 2013 2:50 PM
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] ctruncate bug?
>
> However you decide to argue the point, you must consider _all_ the 
> observations of a reflection (replicates and symmetry related) together when 
> you infer Itrue or F etc, otherwise you will bias the result e

Re: [ccp4bb] Gnuplot: how to plot with resolution values as labels on x-axis?

2013-06-24 Thread Tim Gruene
Hello Kay,

the rotation can be achieved with 'set xtics rotate by 30'

I did not know about xticlabels, suggested by Abhinav - very useful!

Best,
Tim

On 06/24/2013 04:54 PM, Kay Diederichs wrote:
> Dear Gnuplot users,
> 
> you all know the crystallographic tables which have a column of 
> resolution values, and columns of crystallographic indicators (R, 
> I/sigma, ... whatever). Assuming that I want to plot the indicator
> in column 2 as a function of resolution, I can simply say
>> plot 'table.dat' us 2
> but the problem is now that I would like to have the resolution
> values as labels, so instead of 0 1 2 3 4 5 ... I would like to
> have 30.6 5.72 3.90 3.17 2.64 ... or so. Furthermore these labels
> might be fairly wide, so I would like to rotate them, by (say) 30°
> or even 90°. In the past, I seem to remember that I have manually
> positioned the labels, as individual text strings. This can be done
> for a single plot ... but then again, we live in the 3rd millennium
> and there must be a better way. Can Gnuplot take the labels from
> the file and put them into the right place? Could anyone please
> share the Gnuplot magic for doing so?
> 
> thanks,
> 
> Kay

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A



Re: [ccp4bb] Refinement against frames

2013-06-24 Thread Jrh
Dear Tim,
With a full interpretation of the diffuse scattering as well, how about papers 
being entitled:
"The structure and dynamics of enzyme X"
As you intimate, some diffuse scattering is crystal dependent, i.e. phonon 
derived. Other aspects, however, are not correlated over multiple unit cells and 
are thereby largely related to the dynamics of our macromolecules ("largely" 
means we need to allow for static disorder and/or chemical variants). 

Re instrument aspects:
The days of detector response that depended on the instrument setting (e.g. due 
to varying magnetic fields as we changed xtod, the crystal-to-detector distance) 
are indeed fortunately behind us. That said, we might learn something about the 
instrument by working in the detector plane. Another aspect, for example, is 
detailed prediction of spot shape, which is perhaps for the purists amongst us 
(e.g. see Greenhough, Helliwell and Rule 1983 JAC) but may add insights into 
I - I_bg, i.e. the spot shape prior can be known. This can be done processing 
'forwards or backwards'. 

Greetings,
John

Prof John R Helliwell DSc 
 
 

On 24 Jun 2013, at 12:59, Tim Gruene  wrote:

> Dear John,
> 
> actually I am not a friend of this idea. Processing software does an
> excellent job of removing the instrumental part from our data. If we
> start to refine against frames, the next structural title might be
> something like "Crystal structure of ABC at x Å resolution measured at
> beamline xyz with a frame width of f degrees and a total rotation
> range of phi degrees...". The point I am trying to make: once we
> refine against frames, one may have to take a lot of issues into
> account when interpreting the structure.
> And do you think that refining against frames will actually give
> greater chemical or biological insight into the sample, or will it
> only give a more accurate description of the crystal contents? These
> are two different things and the latter is - in my opinion - not what
> structures are about.
> 
> Best, Tim
> 
> P.S.: I changed the subject line, because the thread based sorting of
> my emails is soon going to exceed the width of my screen for the
> original one.
> 
> On 06/24/2013 08:13 AM, Jrh wrote:
>> Dear Tom, I find this suggestion of using the full images an
>> excellent and visionary one. So, how to implement it? We are part
>> way along the path with James Holton's reverse Mosflm. The computer
>> memory challenge could be ameliorated by simple pixel averaging at
>> least initially. The diffuse scattering would be the ultimate gold
>> at the end of the rainbow. Peter Moore's new book, inter alia,
>> carries many splendid insights into the diffuse scattering in our
>> diffraction patterns. Fullprof analyses have become a firm trend in
>> other fields, admittedly with simpler computing overheads. 
>> Greetings, John
>> 
>> Prof John R Helliwell DSc FInstP
>> 
>> 
>> 
>> On 21 Jun 2013, at 23:16, "Terwilliger, Thomas C"
>>  wrote:
>> 
>>> I hope I am not duplicating too much of this fascinating
>>> discussion with these comments:  perhaps the main reason there is
>>> confusion about what to do is that neither F nor I is really the
>>> most suitable thing to use in refinement.  As pointed out several
>>> times in different ways, we don't measure F or I, we only measure
>>> counts on a detector.  As a convenience, we "process" our
>>> diffraction images to estimate I or F and their uncertainties and
>>> model these uncertainties as simple functions (e.g., a Gaussian).
>>> There is no need in principle to do that, and if we were to
>>> refine instead against the raw image data these issues about
>>> positivity would disappear and our structures might even be a
>>> little better.
>>> 
>>> Our standard procedure is to estimate F or I from counts on the
>>> detector, then to use these estimates of F or I in refinement.
>>> This is not so easy to do right because F or I contain many terms
>>> coming from many pixels and it is hard to model their statistics
>>> in detail.  Further, attempts we make to estimate either F or I
>>> as physically plausible values (e.g., using the fact that they
>>> are not negative) will generally be biased (the values after
>>> correction will generally be systematically low or systematically
>>> high, as is true for the French and Wilson correction and as
>>> would be true for the truncation of I at zero or above).
>>> 
>>> Randy's method for intensity refinement is an improvement because
>>> the statistics are treated more fully than just using an estimate
>>> of F or I and assuming its uncertainty has a simple distribution.
>>> So why not avoid all the problems with modeling the statistics of
>>> processed data and instead refine against the raw data.  From the
>>> structural model you calculate F, from F and a detailed model of
>>> the experiment (the same model that is currently used in data
>>> processing) you calculate the counts expected on each pixel. Then
>>> you calculate

Re: [ccp4bb] ctruncate bug?

2013-06-24 Thread Pavel Afonine
Refinement against images is a nice old idea.
From a technical refinement point of view it's going to be challenging.
Refining just the two flat bulk solvent model parameters, ksol and Bsol,
simultaneously may be tricky, as may occupancy + individual B-factor + TLS.
Or ask the multipolar refinement folk about the whole slew of magic they use
to refine different multipolar parameters at different stages of the
refinement process, in different order and applied to different atom types
(H vs non-H), etc. Now if you convolute all this with the whole set of
diffraction experiment parameters by using images in refinement, that will
be big fun, I'm sure.
Pavel



On Sun, Jun 23, 2013 at 11:13 PM, Jrh  wrote:

> Dear Tom,
> I find this suggestion of using the full images an excellent and visionary
> one.
> So, how to implement it?
> We are part way along the path with James Holton's reverse Mosflm.
> The computer memory challenge could be ameliorated by simple pixel
> averaging at least initially.
> The diffuse scattering would be the ultimate gold at the end of the
> rainbow. Peter Moore's new book, inter alia, carries many splendid insights
> into the diffuse scattering in our diffraction patterns.
> Fullprof analyses have become a firm trend in other fields, admittedly
> with simpler computing overheads.
> Greetings,
> John
>
> Prof John R Helliwell DSc FInstP
>
>
>
> On 21 Jun 2013, at 23:16, "Terwilliger, Thomas C" 
> wrote:
>
> > I hope I am not duplicating too much of this fascinating discussion with
> these comments:  perhaps the main reason there is confusion about what to
> do is that neither F nor I is really the most suitable thing to use in
> refinement.  As pointed out several times in different ways, we don't
> measure F or I, we only measure counts on a detector.  As a convenience, we
> "process" our diffraction images to estimate I or F and their uncertainties
> and model these uncertainties as simple functions (e.g., a Gaussian).
>  There is no need in principle to do that, and if we were to refine instead
> against the raw image data these issues about positivity would disappear
> and our structures might even be a little better.
> >
> > Our standard procedure is to estimate F or I from counts on the
> detector, then to use these estimates of F or I in refinement.  This is not
> so easy to do right because F or I contain many terms coming from many
> pixels and it is hard to model their statistics in detail.  Further,
> attempts we make to estimate either F or I as physically plausible values
> (e.g., using the fact that they are not negative) will generally be biased
> (the values after correction will generally be systematically low or
> systematically high, as is true for the French and Wilson correction and as
> would be true for the truncation of I at zero or above).
> >
> > Randy's method for intensity refinement is an improvement because the
> statistics are treated more fully than just using an estimate of F or I and
> assuming its uncertainty has a simple distribution.  So why not avoid all
> the problems with modeling the statistics of processed data and instead
> refine against the raw data.  From the structural model you calculate F,
> from F and a detailed model of the experiment (the same model that is
> currently used in data processing) you calculate the counts expected on
> each pixel. Then you calculate the likelihood of the data given your models
> of the structure and of the experiment.  This would have lots of benefits
> because it would allow improved descriptions of the experiment (decay,
> absorption, detector sensitivity, diffuse scattering and other "background"
> on the images, on and on) that could lead to more accurate structures in
> the end.  Of course there are some minor issues about putting all this in
> computer memory for refinement
> >
> > -Tom T
> > 
> > From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Phil [
> p...@mrc-lmb.cam.ac.uk]
> > Sent: Friday, June 21, 2013 2:50 PM
> > To: CCP4BB@JISCMAIL.AC.UK
> > Subject: Re: [ccp4bb] ctruncate bug?
> >
> > However you decide to argue the point, you must consider _all_ the
> observations of a reflection (replicates and symmetry related) together
> when you infer Itrue or F etc, otherwise you will bias the result even
> more. Thus you cannot (easily) do it during integration
> >
> > Phil
> >
> > Sent from my iPad
> >
> > On 21 Jun 2013, at 20:30, Douglas Theobald 
> wrote:
> >
> >> On Jun 21, 2013, at 2:48 PM, Ed Pozharski 
> wrote:
> >>
> >>> Douglas,
> > Observed intensities are the best estimates that we can come up with
> in an experiment.
>  I also agree with this, and this is the clincher.  You are arguing
> that Ispot-Iback=Iobs is the best estimate we can come up with.  I claim
> that is absurd.  How are you quantifying "best"?  Usually we have some sort
> of discrepancy measure between true and estimate, like RMSD, mean absolute
> distance, log distance, or somesuch.  

Re: [ccp4bb] str solving problem

2013-06-24 Thread Eugene Valkov
Hi Pramod,

Can you post your merging statistics in different space groups, not just
log files from scaling? These are summarised nicely by Scala or Aimless.

Also, have you tried indexing from different subsets of images? Perhaps
there is a substantial contribution from a 'satellite' crystal in one
orientation, or the crystal appears less split in another? I've had cases
where I could not index properly if I used just 0 and 90, but it worked when
I tried different subsets of images. This is very easy to do in iMosflm.

Andrew Leslie or other Mosflm developers, if they are reading this, might
well be interested in looking at your images, as they are currently
working on these kinds of problems with multiple lattices (see Acta
Cryst. (2013). D69, 1195-1203).

Eugene


On 21 June 2013 21:58, Pramod Kumar  wrote:

> Dear ...
>
> Francis
>
> Last I remember, HKL2000 bases its indexing on the 'strongest' spots on an
> image (though you could manually select spots). It could result in a
> misindex if the strongest spots come from separate lattices..
>
> I have used both HKL2000 and mosflm giving the same results (although I
> have used manual selection of spots as a trial but results are identical).
>
> Try a program that uses all spots for indexing, across all images (XDS for
> example) and you might get the true space group..
>
> I have tried XDS several times, but it gives the error "data image of
> particular no. does not exist" (initially it reported the 11th image; when
> I changed the image range it reported the 21st, and so on). Kindly check my
> data collection profile and XDS.INP file in the attachment.
>
>
> Or if the crystal is big enough, you could try shooting it in different
> areas and 'searching' for a better spot to collect data.
> Or 'grow a better crystal'.
>
> I am growing new crystals, and the struggle is at its peak...
>
>
> Dear Eugene
>
> Please find attached the scaling log file and the scaling table from Mosflm.
>
>
> When you index spots in Mosflm, do your predictions agree with the spots?
>
> Please see the snapshot of the predicted spots.
>
>
>
> Dear Eleanor
> Yes, both molecules are visible in the ASU.
>
>
>
> Dear Pozharski
>
> Balbes pipeline hitting extremely high marks when fed into Phaser while
> being complete nonsense (it's a 150kDa multi-domain protein and resulting
> domain arrangement made absolutely no sense).  Refinement was stuck with
> high R-values and I sadly gave up on it for now.  I suspected that refmac
> step included in the pipeline artificially shifts the model so that it
> conforms to Patterson map better, which results in high score in Phaser.
>
> My domain arrangement is as expected, two molecules in ASU.
>
>
> thanks and regards
>
> pramod
>
>
>
>
>
>
>
>
>
> On Thu, Jun 20, 2013 at 3:50 PM, Eleanor Dodson wrote:
>
>> As others say - the Rfactors look pretty good for MR, mine usually start
>> over 50% even with a better model and one hopes they then decrease..
>> But you say you took the Balbes model into phaser? and I think Balbes
>> automatically runs cycles of refinement so any comment on R factors may not
>> mean much.
>>
>> Have you found both molecules in the asymmetric unit? You only give LLG
>> for one?
>> Eleanor
>>
>>
>>
>>
>> On 19 June 2013 17:44, Eugene Valkov wrote:
>>
>>> Yes, I would agree with Francis that diffraction shows contribution from
>>> several lattices, which could lead to misindexing. However, it should be
>>> feasible to get a model that refines from this sort of data.
>>>
>>> Pramod - could you please post your data processing statistics from your
>>> scaling program? Better if you have several for different spacegroups.
>>>
>>> Also, I have no idea how HKL2000 does this, but could you please provide
>>> an indexing solution table from Mosflm that shows penalties associated with
>>> each type of space group? Was there a sharp penalty drop at some point or
>>> was it more gradual?
>>>
>>> When you index spots in Mosflm, do your predictions agree with the
>>> spots? Or is there a substantial portion that are missed?
>>>
>>> I would consider altering thresholds in Mosflm for indexing (see the
>>> manual).
>>>
>>> Eugene
>>>
>>>
>>>
>>>
>>> On 19 June 2013 17:34, Francis E. Reyes wrote:
>>>
 On Jun 17, 2013, at 12:36 PM, Pramod Kumar wrote:

 >> I have a crystal data diffracted  around 2.9 A*,
 >> during the data reduction HKL2000 not convincingly showed the space
 group (indexed in lower symmetry p1), while the mosflm given C-centered
 Orthorhombic, and again with little play around HKL2000 given CO
 >



 > no ice ring is appeared, diffraction pattern looks ok, misindexing in
 any direction is not conclusive to me (plz see the imj attachment)

 The diffraction does not look ok... there's hints of multiple
 lattices... which is not a problem if the two lattice orientations do not
 perfectly overlap (i.e. their spots are separable).

 Last I remember, HKL2000 bases its indexing on the 'strongest' sp

Re: [ccp4bb] Alternating positive and negative density

2013-06-24 Thread Dale Tronrud

   Based on eye-balling your map it looks to me that your grid spacing
is about 0.5 A.  The wavelength of your ripple is 4 grid spacings, and
the ripple is right along the x axis.  My guess is that you have a rogue
reflection with index h00, where h corresponds to about 2 A resolution.

   How you are getting this in multiple data sets is a mystery to me,
but I would concentrate on finding that reflection and figuring out
why it is anomalously large.  Start with the Fourier coefficients that
went into calculating this map to find the exact value of h causing the
problem and then track that reflection back through your Fcalc's and
Fobs's.

Dale Tronrud
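
Dale's estimate can be turned into a quick calculation. The following is only a minimal sketch: the thread gives no cell dimensions, so the cell edge of 60 A is a made-up value, and the function name is hypothetical. For an orthorhombic cell the h00 reflection has d-spacing a/h, so a ripple of period d along x points to h of about a/d.

```python
def ripple_index(cell_edge, grid_spacing, spacings_per_period):
    """Return (ripple period in Angstrom, nearest integer h) for a ripple along x."""
    # The ripple wavelength equals the d-spacing of the suspect reflection.
    period = grid_spacing * spacings_per_period
    # For an orthorhombic cell, d(h00) = a / h, hence h = a / d.
    h = round(cell_edge / period)
    return period, h


# cell_edge=60.0 is an illustrative assumption, not a value from the thread.
period, h = ripple_index(cell_edge=60.0, grid_spacing=0.5, spacings_per_period=4)
print(f"ripple period {period} A, suspect reflection ({h}, 0, 0)")
```

With the real cell edge in place of the assumed one, this narrows down which h00 coefficients to inspect first in the map's Fourier coefficients.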

On 06/23/2013 09:57 PM, Peter Randolph wrote:

Short version:
Hi, I'm working on what should be a straightforward molecular
replacement problem (already solved protein in new space group), but my
Fo-Fc map contains a peculiar series of alternating positive and
negative peaks of difference density. I'm wondering if anyone has seen
this before? Sample images are attached and more background is below.

More background:
I had initially solved an /apo/ structure of my protein (from previous
diffraction data in another crystal form), and more recently collected
diffraction data for crystals of the protein co-crystallized with
potential binding partners (small RNAs). All the datasets I've processed
so far have the same spacegroup (P2(1)2(1)2(1)) and cell dimensions as
the apo structure.

I have tried two general approaches, both with the same initial steps of
indexing / integrating / scaling in XDS, converting to MTZ format
without R-free flags, then importing R-free-flags from the (previous)
apo structure's MTZ.  I would then run "phenix.refine" for initial
rigid-body refinement using the apo-model and the new mtz to see if
there were signs of any new positive density corresponding to bound
ligands. While the 2Fo-Fc map fits the apo protein 3D model perfectly,
the Fo-Fc map shows bands of alternating positive and negative density
running throughout the structure.  What's odd is that these 'bands'
appear to be systematic rather than random (please see attached image),
and aren't located anywhere that a binding partner could bind, leading
me to suspect they may be artefactual (these bands actually run through
the body of the protein, so one possibility is that the b-strands are
off-register by a multiple of a peptide unit?). If I use the same mtz
file and structural model, and instead do molecular replacement with
phaser, I see the same issue.  I've tried this workflow with a couple of
datasets and using P222 as well as P2(1)2(1)2(1), and each time I see
the same issue of spurious(?) bands. Any help or advice would be much
appreciated, especially if anyone has seen anything like this?

Thanks a lot,
Peter Randolph

--
Peter Randolph
PhD Candidate
Mura Laboratory
Department of Chemistry
University of Virginia
(434)924.7979


Re: [ccp4bb] Alternating positive and negative density

2013-06-24 Thread Pavel Afonine
Hi Tim,


> A side note: you are most likely not looking at 2Fo-Fc and Fo-Fc maps,
> but at sigma-A weighted maps and sigma-A weighted difference maps. I
> think it is worth differentiating between these terms.



Fully agree. I guess just typing them as 2mFo-DFc and mFo-DFc will solve
this particular confusion. Further on this:

- showing a map without specifying a contouring cutoff level (which also
provides information about how this map was scaled: by standard deviation,
volume or something else) isn't very informative;

- was an estimate of F000 included?

- which phases were used (model, Pc, or combined model+experimental, Pcomb)?

- how were the phases used? Example: {mcomb*Fo, Pcomb} - {DFc, Pc} vs {mFo-DFc,
Pc} and other flavors of this;

- are the Fc structure factors calculated from the atomic model only, or are
they actually the total model structure factors, Fmodel = overall_scale *
(Fcalc_atoms + Fbulk_solvent)?

All the best,
Pavel
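
The last point in Pavel's list can be illustrated for a single reflection. This is a minimal sketch, not the code of any refinement program: it assumes the common flat bulk-solvent model, Fbulk = ksol * exp(-Bsol * s^2 / 4) * Fmask with s = 1/d, and the default ksol and Bsol values and all function names are illustrative assumptions.

```python
import math


def f_bulk_solvent(f_mask, d_spacing, k_sol=0.35, b_sol=46.0):
    """Flat bulk-solvent contribution for one reflection (assumed model)."""
    s_sq = 1.0 / d_spacing ** 2  # s = 1/d
    return k_sol * math.exp(-b_sol * s_sq / 4.0) * f_mask


def f_model(f_calc_atoms, f_bulk, overall_scale=1.0):
    """Total model structure factor: overall_scale * (Fcalc_atoms + Fbulk_solvent)."""
    return overall_scale * (f_calc_atoms + f_bulk)


# Made-up complex structure factors for one low- and one high-resolution reflection.
for d in (20.0, 2.0):
    f_atoms = complex(100.0, 20.0)
    f_mask = complex(-80.0, -10.0)
    total = f_model(f_atoms, f_bulk_solvent(f_mask, d), overall_scale=1.2)
    print(d, abs(f_atoms), abs(total))
```

The exponential damping makes the solvent term matter mainly at low resolution, which is why a map computed from Fcalc_atoms alone and one computed from Fmodel can differ most in the low-resolution terms.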


Re: [ccp4bb] ctruncate bug?

2013-06-24 Thread Jrh
Dear Pavel,
Diffuse scattering is probably the most difficult topic I have worked on.
Reading Peter Moore's new book and his insights give me renewed hope we could 
make much more of it, as I mentioned to Tim re 'structure and dynamics'. 
You describe more aspects below obviously.
Greetings,
John
Prof John R Helliwell DSc 
 
 

On 24 Jun 2013, at 17:12, Pavel Afonine wrote:

> Refinement against images is a nice old idea. 
> From refinement technical point of view it's going to be challenging. 
> Refining just two flat bulk solvent model ksol&Bsol simultaneously may be 
> tricky, or occupancy + individual B-factor + TLS, or ask multipolar 
> refinement folk about whole slew of magic they use to refine different 
> multipolar parameters at different stages of refinement process and in 
> different order and applied to different atom types (H vs non-H) 
> ...etc...etc. Now if you convolute all this with the whole diffraction 
> experiment parameters through using images in refinement that will be big 
> fun, I'm sure.
> Pavel
> 
> 
> 
> On Sun, Jun 23, 2013 at 11:13 PM, Jrh wrote:
> Dear Tom,
> I find this suggestion of using the full images an excellent and visionary 
> one.
> So, how to implement it?
> We are part way along the path with James Holton's reverse Mosflm.
> The computer memory challenge could be ameliorated by simple pixel averaging 
> at least initially.
> The diffuse scattering would be the ultimate gold at the end of the rainbow. 
> Peter Moore's new book, inter alia, carries many splendid insights into the 
> diffuse scattering in our diffraction patterns.
> Fullprof analyses have become a firm trend in other fields, admittedly with 
> simpler computing overheads.
> Greetings,
> John
> 
> Prof John R Helliwell DSc FInstP
> 
> 
> 
> On 21 Jun 2013, at 23:16, "Terwilliger, Thomas C" wrote:
> 
> > I hope I am not duplicating too much of this fascinating discussion with 
> > these comments:  perhaps the main reason there is confusion about what to 
> > do is that neither F nor I is really the most suitable thing to use in 
> > refinement.  As pointed out several times in different ways, we don't 
> > measure F or I, we only measure counts on a detector.  As a convenience, we 
> > "process" our diffraction images to estimate I or F and their uncertainties 
> > and model these uncertainties as simple functions (e.g., a Gaussian).  
> > There is no need in principle to do that, and if we were to refine instead 
> > against the raw image data these issues about positivity would disappear 
> > and our structures might even be a little better.
> >
> > Our standard procedure is to estimate F or I from counts on the detector, 
> > then to use these estimates of F or I in refinement.  This is not so easy 
> > to do right because F or I contain many terms coming from many pixels and 
> > it is hard to model their statistics in detail.  Further, attempts we make 
> > to estimate either F or I as physically plausible values (e.g., using the 
> > fact that they are not negative) will generally be biased (the values after 
> > correction will generally be systematically low or systematically high, as 
> > is true for the French and Wilson correction and as would be true for the 
> > truncation of I at zero or above).
> >
> > Randy's method for intensity refinement is an improvement because the 
> > statistics are treated more fully than just using an estimate of F or I and 
> > assuming its uncertainty has a simple distribution.  So why not avoid all 
> > the problems with modeling the statistics of processed data and instead 
> > refine against the raw data.  From the structural model you calculate F, 
> > from F and a detailed model of the experiment (the same model that is 
> > currently used in data processing) you calculate the counts expected on 
> > each pixel. Then you calculate the likelihood of the data given your models 
> > of the structure and of the experiment.  This would have lots of benefits 
> > because it would allow improved descriptions of the experiment (decay, 
> > absorption, detector sensitivity, diffuse scattering and other "background" 
> > on the images, on and on) that could lead to more accurate structures in 
> > the end.  Of course there are some minor issues about putting all this in 
> > computer memory for refinement.
> >
> > -Tom T
> > 
> > From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Phil 
> > [p...@mrc-lmb.cam.ac.uk]
> > Sent: Friday, June 21, 2013 2:50 PM
> > To: CCP4BB@JISCMAIL.AC.UK
> > Subject: Re: [ccp4bb] ctruncate bug?
> >
> > However you decide to argue the point, you must consider _all_ the 
> > observations of a reflection (replicates and symmetry related) together 
> > when you infer Itrue or F etc, otherwise you will bias the result even 
> > more. Thus you cannot (easily) do it during integration
> >
> > Phil
> >
> > Sent from my iPad
> >
> > On 21 Jun 2013, at 20:30, Douglas Theobald  wro
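
Tom Terwilliger's proposal quoted above (compute expected counts per pixel from the model, then evaluate the likelihood of the raw counts) can be sketched for a single spot. This is a toy illustration under strong assumptions: the "model of the experiment" is reduced to a fixed spot profile and a flat background, pixels are treated as independent Poisson variables, and all names and numbers are made up.

```python
import math


def expected_counts(intensity, profile, background):
    """Expected counts per pixel: background plus intensity spread over the spot profile."""
    return [b + intensity * p for p, b in zip(profile, background)]


def poisson_log_likelihood(observed, expected):
    """log P(observed | expected) for independent Poisson-distributed pixel counts."""
    return sum(n * math.log(mu) - mu - math.lgamma(n + 1)
               for n, mu in zip(observed, expected))


# Illustrative data: a 5-pixel spot on a flat background of 5 counts per pixel.
profile = [0.1, 0.2, 0.4, 0.2, 0.1]      # fraction of the intensity per pixel
background = [5.0] * 5
observed = [6, 9, 14, 8, 5]

# Scan non-negative model intensities and keep the most likely one.
candidates = range(0, 41)
best = max(candidates,
           key=lambda i: poisson_log_likelihood(
               observed, expected_counts(i, profile, background)))
print("most likely intensity on the grid:", best)
```

In this formulation the model intensity is never negative, because only the counts are treated as data; that is the sense in which the positivity problem does not arise when the comparison is made at the pixel level.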

Re: [ccp4bb] Gnuplot: how to plot with resolution values as labels on x-axis?

2013-06-24 Thread Francois Berenger

On 06/25/2013 12:36 AM, Tim Gruene wrote:


Hello Kay,

the rotation can be achieved with 'set xtics rotate by 30'

I did not know about xticlabels, suggested by Abhinav - very useful!


My preferred gnuplot reference:

http://security.riit.tsinghua.edu.cn/~bhyang/ref/gnuplot/index-e.html

The original site seems down unfortunately
(http://t16web.lanl.gov/Kawano/gnuplot/).


Best,
Tim

On 06/24/2013 04:54 PM, Kay Diederichs wrote:

Dear Gnuplot users,

you all know the crystallographic tables which have a column of
resolution values, and columns of crystallographic indicators (R,
I/sigma, ... whatever). Assuming that I want to plot the indicator
in column 2 as a function of resolution, I can simply say

plot 'table.dat' us 2

but the problem is now that I would like to have the resolution
values as labels, so instead of 0 1 2 3 4 5 ... I would like to
have 30.6 5.72 3.90 3.17 2.64 ... or so. Furthermore these labels
might be fairly wide, so I would like to rotate them, by (say) 30°
or even 90°. In the past, I seem to remember that I have manually
positioned the labels, as individual text strings. This can be done
for a single plot ... but then again, we live in the 3rd millennium
and there must be a better way. Can Gnuplot take the labels from
the file and put them into the right place? Could anyone please
share the Gnuplot magic for doing so?

thanks,

Kay


--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A
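
The two hints in this thread ('set xtics rotate by 30' and xticlabels) can be combined into one short gnuplot script. The sketch below generates that script from Python; the function name is hypothetical, and it assumes that column 1 of table.dat holds the resolution values and that the row number (gnuplot's pseudo-column 0) is an acceptable x position.

```python
def gnuplot_script(datafile, column=2, angle=30):
    """Build gnuplot commands that label the x axis with the strings in column 1."""
    return "\n".join([
        # Rotate the tick labels so that wide resolution values do not overlap.
        f"set xtics rotate by {angle}",
        # Column 0 is the row index; xticlabels(1) takes the label text from column 1.
        f"plot '{datafile}' using 0:{column}:xticlabels(1) with linespoints notitle",
    ])


print(gnuplot_script("table.dat"))
```

The printed lines can be pasted into an interactive gnuplot session or saved to a file. Depending on the terminal, rotated labels may need an alignment or offset adjustment.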




[ccp4bb] iMosflm bug?

2013-06-24 Thread Thomas Cleveland
Has anyone else encountered this?  When I go to "processing options" in
iMosflm 1.0.7, many of the parameters on the right hand side of the window
are cut off, and there is no way to scroll over so that I can enter them.
I've attached a link to a picture of what it looks like.

https://www.dropbox.com/s/muwblcgohhxu94c/iMosflm-cut-off.png

Thanks,
Thomas Cleveland