Re: [ccp4bb] sftools and batch mode

2015-01-07 Thread Bart Hazes
I agree with Eleanor. Sftools was designed for a different purpose. It does
have a batch mode, but I am not sure it will solve your problems completely,
and the little notice at the top already indicates that this was never
considered an ideal solution. Here is the MODE help page. You can
give it a try, but you may have to go back to good old CAD.

Bart

selected: MODE



  Notice: This option will become obsolete
  in the next release. Try to avoid using it in
  scripts.

  OPTION: MODE BATCH | INTERACTIVE
  

  Bring the program in or out of Batch mode
  Use MODE BATCH when running SFTOOLS in batch
  This suppresses some questions by SFTOOLS, e.g.
  to hit return to see the next page in LIST REF
  Use MODE INTERACTIVE to return to interactive
  mode if that would ever be useful

  EXAMPLES:

  MODE BATCH
  MODE INTERACTIVE
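
For what it's worth, if it does come to CAD, renaming and re-typing columns
is a short script per file. A minimal sketch along those lines (the input
labels F_native, SIGF_native and FreeRflag and the output labels are just
placeholders for whatever a given file actually contains; CTYPOUT is only
needed if the column types have to change as well):

#!/bin/bash
cad hklin1 messy.mtz hklout clean.mtz << eof > cad.log
LABIN  FILE 1 E1=F_native E2=SIGF_native E3=FreeRflag
LABOUT FILE 1 E1=FP       E2=SIGFP       E3=FREE
CTYPOUT       E1=F        E2=Q           E3=I
END
eof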


On Wed, Jan 7, 2015 at 2:57 AM, Eleanor Dodson wrote:

> Easier to use CAD for this!
> SFTOOLS is too clever..
> Eleanor
>
> On 7 January 2015 at 03:45, Seth Harris  wrote:
>
>> Hi all,
>>
>> I have a heterogeneous collection of mtz files I'm trying to whip into
>> some kind of standardized vocabulary, namely setting column names and
>> types so that subsequent scripts can sensibly make maps and so forth. I
>> have set up the ever-useful sftools to do most of this, but of course
>> sftools scripts rely on providing a series of answers to the questions you
>> think it is going to ask, and by its own admission it was designed to be
>> used interactively and includes various "protections" which, also by its
>> own admission, make it harder to use in batch mode. It finds files with
>> interesting columns (e.g. only 1's and 0's) that prompt unexpected
>> questions (e.g. "is this an X-plor Rfree column?", despite the
>> "Rfree_flag" title and the 5% population of 1's), for which my prescribed
>> answers no longer apply, and my log files end up with inane
>> computer-v-computer dialogues like "You must answer Y or N! You must
>> answer Y or N! You must answer..." etc.
>>
>> So, presumably the number of exceptional cases is finite (though tedious)
>> and I can just carry on dealing with them one after the other and learn to
>> be a better coder, but...
>>
>> My question: is there some way to turn off these protections (i.e. please
>> just read in the file without question!), or some version of SFTOOLS that
>> is more batch-friendly about which I'm not yet aware? It would be nice to
>> have something that can more programmatically interrogate mtz column
>> headers and respond sensibly rather than this kind of 20 questions you do
>> when you have to read the header and then parse the names and then ask a
>> series of "is it Rfree?" "is it CV?" "is it bigger than a breadbox?" type
>> stuff.
>>
>> Again, I know the sftools documentation is clear that the design goal was
>> for interactive use and humans have little trouble with such questions, but
>> when there might be several thousand of them...
>>
>> Thanks for any pointers or alternatives!
>>
>> Seth
>>
>>
>>
>


-- 

Bart Hazes
Associate Professor
Dept. of Medical Microbiology & Immunology
University of Alberta


Re: advice

2007-01-22 Thread Bart Hazes
I'd like to add that the value of a molecular replacement solution tends 
to be inversely correlated with the effort needed to find the solution. 
In other words, the harder you have to work to find the MR solution, the 
less informative the phase information you tend to get. When you have 
very high resolution and/or NCS you may still be able to solve the 
structure. However, in cases where the search model is only distantly 
related to the protein of interest and Phaser can't find the solution, 
the solution may not be worth finding and you're better off focusing on 
getting experimental phases.


Bart

Randy J. Read wrote:


On Jan 22 2007, Eaton Lattman wrote:

Will someone knowledgeable tell me what the present state of full
6-dimensional searches in molecular replacement is?



Presumably you're referring to systematic 6D searches, not stochastic 
ones like in EPMR or QoS. Do you mean "can it be done on current 
hardware" or "is it worth doing"? If the former, then it's doable, 
though slow. In Phaser, for instance, you can generate a complete list 
of rotations (using the fast rotation function with keywords to 
prevent clustering and to save all solutions), then feed that big list 
of rotations to the fast translation search. In a typical problem that 
would probably run on a single processor in significantly less time 
than the average PhD, and could be made reasonably quick with a cluster.


If the latter, our feeling is that it isn't worth it. We've tried the 
full search option on a couple of monoclinic problems (where it's only 
a 5D search), and nothing came up with the full list of orientations 
that didn't come up with the first hundred or so orientations.


We conclude that, even in the most recalcitrant cases, the rotation 
search gives a better than random indication of whether an orientation 
is correct, so it's not necessary to search through all possible 
orientations. However, we do feel that it can be worthwhile to try a 
reasonably large number of orientations in difficult cases.


Best regards,

Randy Read

P.S. When we generate our list of orientations, we use "Lattman" 
angles to get reasonably even sampling of rotations.




Re: advice

2007-01-22 Thread Bart Hazes

Hi Filip,

You're right, and the same applies if the MR is difficult because of 
differing relative domain orientations in otherwise closely related 
proteins. As mentioned, my remark was aimed at distantly related search 
models.


Bart

Filip Van Petegem wrote:


But that isn't necessarily the case if the search is hard because your
search models individually constitute only a small part of the
asymmetric unit.  Say that 80% of the AU consists of multiple
different proteins with known structure; the phase information would
be very high if you find the solutions.

Filip

On 1/22/07, Bart Hazes <[EMAIL PROTECTED]> wrote:


I'd like to add that the value of a molecular replacement solution tends
to be inversely correlated with the effort needed to find the solution.
In other words, the harder you have to work to find the MR solution, the
less informative the phase information you tend to get. When you have
very high resolution and/or NCS you may still be able to solve the
structure. However, in cases where the search model is only distantly
related to the protein of interest and Phaser can't find the solution,
the solution may not be worth finding and you're better off focusing on
getting experimental phases.

Bart

Randy J. Read wrote:

> On Jan 22 2007, Eaton Lattman wrote:
>
> >> Will someone knowledgeable tell me what the present state of full
> >> 6-dimensional searches in molecular replacement is?
>
>
> Presumably you're referring to systematic 6D searches, not stochastic
> ones like in EPMR or QoS. Do you mean "can it be done on current
> hardware" or "is it worth doing"? If the former, then it's doable,
> though slow. In Phaser, for instance, you can generate a complete list
> of rotations (using the fast rotation function with keywords to
> prevent clustering and to save all solutions), then feed that big list
> of rotations to the fast translation search. In a typical problem that
> would probably run on a single processor in significantly less time
> than the average PhD, and could be made reasonably quick with a 
cluster.

>
> If the latter, our feeling is that it isn't worth it. We've tried the
> full search option on a couple of monoclinic problems (where it's only
> a 5D search), and nothing came up with the full list of orientations
> that didn't come up with the first hundred or so orientations.
>
> We conclude that, even in the most recalcitrant cases, the rotation
> search gives a better than random indication of whether an orientation
> is correct, so it's not necessary to search through all possible
> orientations. However, we do feel that it can be worthwhile to try a
> reasonably large number of orientations in difficult cases.
>
> Best regards,
>
> Randy Read
>
> P.S. When we generate our list of orientations, we use "Lattman"
> angles to get reasonably even sampling of rotations.
>






Re: [ccp4bb] relation between wavelength and inter-atomic distances

2007-01-24 Thread Bart Hazes

Carlos Frazao wrote:

Hi,
I have once heard and recently read that "the diffraction event results 
from the fact that both the X-ray wavelength and the atomic distances 
are of the same magnitude". Although such a relation seems appealing, I 
am unsure whether this is more than a mere coincidence. Could someone 
clarify or point me to relevant reading?

Cheers,
Carlos



The diffraction event does not <<< result from >>> the fact that both the 
wavelength and the atomic distances are of the same magnitude. But as the many 
interestingly different yet related answers have indicated, you need such 
a wavelength to resolve the atomic details you are interested in. 
Another way to think about it: the phase difference between the 
scattering of two atoms at distance d depends on d, the wavelength, and 
the angle of diffraction. If the wavelength is long relative to d then 
the phase difference becomes too small and you can thus not resolve the 
small details (I believe in microscopy the smallest visible detail is 
the wavelength divided by 2, or by the square root of 2, depending on the 
method of illumination).
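
(To put a number on that limit: Bragg's law gives lambda = 2 d sin(theta), and 
since sin(theta) cannot exceed 1, the smallest d-spacing that can still 
diffract is d_min = lambda/2.)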


At the other end of the spectrum, making the wavelength much smaller 
than the atomic distances does not create a similar problem. I believe 
that in electron diffraction on 2-dimensional crystals the wavelength of 
the electron beam is actually many orders of magnitude smaller than the atomic 
distances. The problem is that the scattering power is proportional to 
the cube of the wavelength, so reducing the wavelength by a factor of 10 
will reduce the diffraction intensity by a factor of 1000. Very hard 
X-rays are also harder to generate and detect. So it is attractive to 
use wavelengths that are short enough to resolve the details of interest 
but not much shorter than that. Hence the predominant use of wavelengths 
in the 1 to 1.5 Angstrom range.


Bart


==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] problem with anisotropic refinement using refmac - Sacrosanct R-free

2007-01-30 Thread Bart Hazes
er ASU
> cell axis: 76.615  76.615  209.787   90.00  90.00  120.00
> unique reflections: 83156 (6220)
> Completeness: 98.6% (87.9%)
> I/Sigma: 21.4 (3.5)
> Rmrgd-F: 5.9% (35.0%)
>
> Maybe the refmac-script will be of some help (some other BFAC restraints
> and SPHE/RBON parameter tested, the following example takes care of
> reasonable distribution of anisotropy):
>
> #!/bin/bash
> refmac5 hklin  ../gz_ccp4.mtz \
>   hklout gz_aniso_01f.mtz \
>   xyzin  ./gz_iso.pdb \
>   xyzout gz_aniso_01f.pdb \
>   libin  ../llp_citrat_fitted.cif \
>   << end_ip > refmac.log
> LABI FP=F_cit_01 SIGFP=SIGF_cit_01 FREE=FreeR_flag
> LABO FC=FC FWT=FWT PHIC=PHIC PHWT=PHWT DELFWT=DELFWT PHDELWT=PHDELWT FOM=FOM
> NCYC 20
> REFI TYPE RESTRAINED
> REFI RESI MLKF
> REFI METH CGMAT
> REFI RESO 25 1.33
> REFI BREF ANISOTROPIC
> SCAL TYPE BULK
> SCAL LSSC ANISO NCYCLES 10
> SCAL MLSC NCYCLES 10
> WEIG MATRIX 1.25
> SPHE 30.0
> RBON 30.0
> BFAC 0.5 2.0 4.0 4.0 6.0
> MAKE CHECK ALL
> MAKE HYDROGEN ALL
> MAKE HOUT NO
> MAKE PEPTIDE NO
> MAKE CISPEPTIDE NO
> MAKE SSBRIDGE NO
> MAKE CHAIN YES
> MAKE SYMMETRY YES
> MONI MANY TORS 10 DIST 10 ANGL 10 VAND 10 PLANE 10 CHIR 10 BFAC 10 BSPH 10 RBOND 10
> BINS 20
> PNAM gz
> DNAM gz
> USEC
> END
> end_ip
>
> The final refinement statistic:
>
> Resolution limits= 25.000  1.330
> Number of used reflections   = 81889
> Percentage observed  = 98.6122
> Percentage of free reflections   = 1.5000
> Overall R factor = 0.1409
> Free R factor= 0.1681
> Overall weighted R factor= 0.1348
> Free weighted R factor   = 0.1641
> Overall correlation coefficient  = 0.9763
> Free correlation coefficient = 0.9688
> Overall figure of merit  = 0.9183
> ML based su of positional parameters = 0.0274
> ML based su of thermal parameters= 1.5420
> rmsBOND  = 0.014
> rmsANGLE = 1.569
>
> Thanks in advance,
> georg zocher
>
>





--
---
David Briggs, PhD.
Father & Crystallographer
www.dbriggs.talktalk.net
iChat AIM ID: DBassophile 









--

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] practical limits of MR?

2007-03-05 Thread Bart Hazes

Nat Echols wrote:
I had a debate with a coworker about using MR in desperation and I'm 
curious what the most extreme case is where a very different model was 
used to solve a structure.  This could be highest RMSD, lowest % 
identity, or most incomplete model.  I'm also curious whether homology 
modelling has ever been useful for this.  (I'm pretty sure I've come 
across papers discussing this last concept.)


thanks,
Nat




Hi Nat,

I once had a situation of a search model with about 20-25% sequence 
identity, but the model had not been deposited. A stereo image of a 
C-alpha trace in a Nature paper was the only data I had. I picked the 
coordinates off the left- and right-eye images and used some program to 
reconstruct the 3D C-alpha trace (playing with image angle and distance 
settings to get proper helices). I used that for MR and got what I 
remember was a reasonably convincing solution. However, as is often the 
case in these desperate situations, the model-derived phases were too 
poor to bootstrap the refinement. After solving the structure I never 
went back to check whether the original MR solution was correct, and I am 
not sure the files are still on a disk somewhere.


Anyway, with the improvements in software we may have reached a stage 
where the limitation of the search model is not whether or not you can 
find a MR solution, but whether or not that solution is going to help 
you determine the structure. What you can and can't get away with 
depends on the resolution of your native dataset and the power of 
density modification, in particular the presence/absence of NCS.


It's always worth a try but if finding a MR solution is a challenge you 
should consider how useful a solution, if found, is going to be.


Bart

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] A bit of history: John W Backus obit

2007-03-20 Thread Bart Hazes

The story also made it to the CNN business page, where they add...

The Fortran programming language, which was a huge leap forward in 
easing the creation of computer software, was released in 1957, said the 
report.


Backus launched his research project at IBM four years earlier, 
assembling a diverse team of 10, including a chess wizard, a 
crystallographer and a cryptographer, said the Times.



Full story @: 
http://money.cnn.com/2007/03/20/news/newsmakers/backus/index.htm?postversion=2007032008


Bart


P.Artymiuk wrote:

A bit of history: NY Times obituary for John W. Backus, 82, developer of
Fortran, without which CCP4 and much else would not have been possible. 


http://www.nytimes.com/2007/03/19/obituaries/20cnd-backus.html?ex=1332043200&en=adde3ee5a1875330&ei=5124&partner=permalink&exprod=permalink

Pete A






--

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] Generate Random phase set.

2007-03-20 Thread Bart Hazes
SFTOOLS has a random number generator that produces a uniform 
distribution between 0 and 1, which you can then use in different ways.


You could then select all non-centric reflections and assign their phase 
as RanU*360.
For the centric reflections you can first set them to the P-variable in 
the CALC command, which substitutes the restricted phase value for 
centric reflections (0 or 90 degrees). Next you can use the random 
number to add 180 degrees to half of the centric reflections.


READ in.mtz
CALC col RanU = RAN_U
SELECT NOT CENTRO
CALC P col Phase = col RanU 360 *
SELECT INVERT
CALC col Phase = P
SELECT col RanU > 0.5
CALC col Phase = 180 +
SELECT ALL
WRITE out.mtz

This is clearly advanced use of the CALC/SELECT commands and even I had 
to check the help pages, but I thought it was a nice example of the 
flexibility.
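
If you want to run this non-interactively you can feed the same commands to
sftools from a here-document. A minimal sketch (file names are placeholders;
MODE BATCH suppresses the interactive paging prompts, and the exit keyword at
the end, QUIT here, may be spelled differently in your version):

#!/bin/sh
sftools << eof > sftools_random_phases.log
MODE BATCH
READ in.mtz
CALC col RanU = RAN_U
SELECT NOT CENTRO
CALC P col Phase = col RanU 360 *
SELECT INVERT
CALC col Phase = P
SELECT col RanU > 0.5
CALC col Phase = 180 +
SELECT ALL
WRITE out.mtz
QUIT
eof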


Bart

Eleanor Dodson wrote:
You can do that to some extent by scattering random atoms about and 
calculating phases from them..

Then the centrics will be sensible.
That is how some direct methods programs get their "random starting set" 
of phases..

Depends on how flat you want the starting map to be
 Eleanor


David Briggs wrote:


Hi y'all.

Excuse that rather "noddy" question, but I've been googling this for 
hours now and I've finally lost patience...


How can I generate a random phase set for either a .mtz or .hkl (cns 
format) reflection file (if possible with sensible values for centric 
reflections)?


That's it.

Thanks in advance,

Dave

--
---
David Briggs, PhD.
Father & Crystallographer
www.dbriggs.talktalk.net <http://www.dbriggs.talktalk.net>
iChat AIM ID: DBassophile
---
Anyone who is capable of getting themselves made President should on 
no account be allowed to do the job. - Douglas Adams 







--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


[ccp4bb] [Fwd: Re: [ccp4bb] question about redundancy]

2007-03-20 Thread Bart Hazes

Hi Li, there is nothing wrong with your reasoning: the expected redundancy
is indeed 4 because, as you say, each reflection intersects the Ewald
sphere twice, once through the top and once through the bottom, and the
same holds for its Friedel mate (the redundancy will actually be a bit
less due to the missing cusp region).

Bart


yang li wrote:

Hi:
  I have been confused about something for a long time, and it may be an easy
question. That is, for a crystal in space group P1, what should the redundancy
be after a 360-degree data collection? I was told that it is 2, the 2 being
F(h,k,l) and F(-h,-k,-l), but I think that if the reciprocal lattice is rotated
360 degrees it must intersect the Ewald sphere 2 times--for example,
going into the sphere and coming out--so F(h,k,l) should be collected twice.
I do not know what is wrong with my reasoning. Can anyone tell me?
Thanks!

Li Yang




--

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==




Re: [ccp4bb] Highest shell standards

2007-03-21 Thread Bart Hazes

Shane Atwell wrote:
Could someone point me to some standards for data quality, especially 
for publishing structures? I'm wondering in particular about highest 
shell completeness, multiplicity, sigma and Rmerge.


A co-worker pointed me to a '97 article by Kleywegt and Jones:

_http://xray.bmc.uu.se/gerard/gmrp/gmrp.html_

"To decide at which shell to cut off the resolution, we nowadays tend to 
use the following criteria for the highest shell: completeness > 80 %, 
multiplicity > 2, more than 60 % of the reflections with I > 3 sigma(I), 
and Rmerge < 40 %. In our opinion, it is better to have a good 1.8 Å 
structure, than a poor 1.637 Å structure."


Are these recommendations still valid with maximum likelihood methods? 
We tend to use more data, especially in terms of the Rmerge and sigma 
cutoff.


Thanks in advance,

*Shane Atwell*



Hi Shane,

I definitely no longer support the conclusions from that 1997 paper, and 
I think Gerard has probably adjusted his thoughts on this matter as 
well. Leaving out the data beyond 1.8 A (in the example above) only makes 
sense if there is no information in those data. Completeness and 
multiplicity are not direct measures of data quality, and the 60% 
I>3sigma and Rmerge <40% criteria are too strict for my liking. I prefer 
to look mostly at I/SigI, and as a reviewer I have no problem with 
highest-resolution-shell stats with I/SigI anywhere in the 1.5-2.5 
range. I won't complain about higher I/SigI values if done for the right 
reasons (phasing data sets being the most common), but I will say 
something if authors state that their crystals diffract to 2.5 A when the 
I/SigI in the highest resolution shell is, let's say, 5. Their crystals 
don't diffract to 2.5 A; they just didn't let the crystals diffract to 
their full potential. You can't really reject papers for that reason, but 
there appears to be a conservative epidemic when it comes to restricting 
the resolution of the data set.


Bart

--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] Highest shell standards

2007-03-23 Thread Bart Hazes

Santarsiero, Bernard D. wrote:



As for the 2Fo-Fc (2mFo-DFc, or something like that) electron density map,
it again assumes that the phases are in good shape, and you essentially
lose any new information you could gain from the addition of new Fobs
terms, but the map isn't distorted since the terms are zero.


Fc is certainly not zero. D will go towards zero as the correlation 
between Fobs and Fcalc decreases but may still be substantial for the 
higher resolution shells. I expect that data sets that still have a 
relatively high I/SigI in the outer shell will also have quite high D 
values in the outer shell whereas datasets that push the limits of the 
true resolution will start seeing D decrease towards zero (haven't 
actually checked this though).
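
(For reference: D is the Luzzati-type error term, D = <cos(2 pi h . delta_x)>, 
estimated per resolution shell as part of the sigma-A calculation; it tends to 
1 for an accurate model and towards 0 where model and data become 
uncorrelated, which is why its behaviour in the outer shells matters here.)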


The most significant problem is with data sets collected on square detectors 
with relatively strong reflections still visible in the corners. In 
these cases high-resolution completeness is low and many missing 
reflections will be replaced by DFc, with both D and Fc not being close 
to zero.


I must admit that I've never understood the rationale for including DFc 
for missing terms, although it has been discussed "lively" on a few 
occasions. Yes, DFc is a better estimate of the true structure factor 
than leaving out the term (equivalent to setting the amplitude to zero). 
But DFc does not provide any information on how reliable the model is or 
where the model may need to be changed. Since the latter is, in my 
opinion, the main function of electron density maps, I'm not convinced 
that substituting DFc is a good idea at any time.


Bart




On Fri, March 23, 2007 4:02 am, Eleanor Dodson wrote:


This is a good point - I had thought that D would be very low for an
incomplete shell, but that doesn't seem to be true..

Garib - what do you think?
Eleanor


Petrus H Zwart wrote:


I typically process my data to a maximum I/sig near 1, and
completeness in
the highest resolution shell to 50% or greater. It



What about maps computed of very incomplete datasets at high resolution?
Don't you get a false sense of details when the missing reflections are
filled in with DFc when computing a 2MFo-DFc map?

P











--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


[ccp4bb] Chi-by-Eye [Highest shell standards]

2007-03-24 Thread Bart Hazes

James Holton wrote:

I generally cut off integration at the shell where I/sigI < 0.5 and 
then cut off merged data where MnI/sd(I) ~ 1.5.  It is always easier 
to cut off data later than to re-integrate it.  I never look at 
the Rmerge, Rsym, Rpim or Rwhatever in the highest resolution shell.  
This is because R-statistics are inappropriate for weak data.



I wonder how many people decide their maximum resolution cut-off (either 
during processing or by setting the detector too far away) by eyeballing 
a few images. I wouldn't be surprised if the number is substantial. Has 
anyone compared the actual resolution of a data set after processing 
with the first-guess resolution reported by people who just looked at the 
first few images? With the generally positive trend to increase 
redundancy at the cost of lower exposure per image, the resolution 
visible to the human eye may be an even poorer estimate of the true resolution.


Bart


Re: [ccp4bb] Strange behavior in R32

2007-04-04 Thread Bart Hazes

Hi Dan,

In hexagonal R32 setting there is translational crystallographic 
symmetry leading to systematic absences. If you think Xtal 5 is 
basically the same but with slightly different packing that break the 
symmetry then you will have pseudo-translational symmetry leading to one 
set of very strong intensities (those that obey the hexagonal R32 
lattice rules) and a set with very weak intensities (the once that are 
systematically absent in R32). That abundance of very strong and very 
weak intensities would give you a bimodal intensity distribution that 
would lead to the observed very large second moment in truncate. It is 
not immediately clear though what is going on with the other xtals. It 
may be of interest to look at the actual intensity distributions for 
different resolution slices to see if it is bimodal or not.


Bart

[EMAIL PROTECTED] wrote:

Dear Friends,

I have some crystals of a small RNA in sg R32 that exhibit some bizzare
behaviors and fail to give phasing solutions. I am hoping someone out there
might be able to lend some insight. All crystals come out of the same condition
and look the same morphologically but, for reasons unknown and at this point
uncontrollable, have very different cell dimensions:

Xtal 1: R32 77 x 77 x 80 Ang
Xtal 2: R32 77 x 77 x 86 Ang
Xtal 3: R32 78 x 78 x 366 Ang
Xtal 4: R32 78 x 78 x 460 Ang
Xtal 5: P3(1)21 77 x 77 x 85 Ang

I've collected >95% complete datasets for each crystal form to resolutions between
2.2 and 3.0 Ang. Denzo/scalepack outputs look fine in each case. Overall R-sym's
are okay (range is 8-13%). The only apparent red flag is the 2nd moment of I
calculated in truncate. These values are much larger than those expected for
either twinned (1.5) or untwinned (2.0) data:

Xtal 1: 3.2
Xtal 2: 2.5
Xtal 3: 4.9
Xtal 4: 3.7
Xtal 5: 4.2

Also, use of the Yeates server to look for partial twinning tells me there are
no twin laws for R32, while the P3(1)21 data does not seem to be twinned. 


I'm guessing crystals 1 & 2 are very similar but still somewhat non-isomorphous,
and crystal 5 is also very similar with a breakdown in space group symmetry
that would put 3 mols/ASU instead of 1 mol/ASU as in R32. But I do not
understand what is going on with crystals 3 & 4. Has anyone out there
experienced similar non-integer multiples of a cell dimension for a given
crystal? I have covalently attached iodine in some of these crystal forms. But
no luck finding any sites by SAD, and I can't help but wonder if this funny
c-dimension behavior and/or the high <I**2>/<I>**2 values are indicative of some
greater crystal pathology.

Thanks in advance,
Dan





--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


[ccp4bb] website back up

2007-05-22 Thread Bart Hazes

Hello all!

Our good old 100MHz pentium web server died a (too long) while ago and 
while it was down several people have asked when it would be back up. I 
reinstalled it over the weekend and as far as I can see it all works 
again. If you had been using it you can do so again at the same address 
(http://eagle.mmid.med.ualberta.ca/).


Cheers, Bart

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] B-factor & Space gr questions!

2007-06-05 Thread Bart Hazes
I think the relevant point in this discussion is that the original paper 
discussed the apo and substrate complexes of the protein. For the 
structure with lower resolution data you may indeed get a better model 
by taking the high resolution model and just applying rigid body 
refinement to it. After that step you would like to find and model the 
differences between the two structures. This includes the bound 
substrate (or the lack thereof) and any significant structural changes 
that accompany substrate binding. Significant meaning those changes that 
can be reliably determined at the lower resolution. For most of the 
structure that may mean you are best off by simply taking the 
rigid-body-refined coordinates of the higher resolution structure 
without further refinement. I see no problem in doing so: as long as 
interesting differences between the structures can be clearly defined 
and the procedure is explicitly described in publications, this should be 
perfectly reasonable.


Bart

Edward A Berry wrote:

You have a good point there and I would be interested in hearing
some other opinions, so I take the liberty of reposting-

My instinctive preference is that each structure should be
supported solely by the data that is deposited with it -
(one dataset one structure) but in terms of good science
we want to produce the best model we can, and that might be
the rigid-body-located structure from another dataset.
In particular the density for the ligand might be clearer
before overfitting with the low resolution data.

Even if the free-R set is not preserved for the new crystal,
R and R-free tend to diverge rapidly once any kind of
fitting with a low data/param is performed, so I think
the new structure must not have been refined much beyond
rigid body (and over-all B which is included in any kind
of refinement).  And that choice may be well justified.
Ed

cdekker wrote:


Hi,

Your reply to the ccp4bb has confused me a bit. I am currently 
refining a low res structure and realise that I don't know what to 
expect for final R and Rfree - it is definitely not what most people 
would publish. So the absolute values of R and Rfree are not telling 
me much, the only gauge I have is that as long as both R and Rfree are 
decreasing I am improving the model (and yes, at the moment that is 
only rigid body refinement).
In your email reply you suggest that a refinement to convergence, even 
though it will lead to an increased Rfree (and lower R? - a classic case 
of overfitting!), would give a better model than the 
rigid-body-refined-only model. This is what confuses me.
I can see your reasoning that starting with an atomic model to solve 
low-res data can lead to this behaviour, but then should the solution 
not be a modification of the starting model (maybe high B-factors?) to 
compensate for the difference in resolution of model and data?


Carien

On 4 Jun 2007, at 19:38, Edward A Berry wrote:


Ibrahim M. Moustafa wrote:

The last question: In the same paper, for the complex structure R 
and Rfree are equal (30%) is that an indication for improper 
refinement in these published structure? I'd love to hear your 
comments on that too.


Several times I solved low resolution structures using high resolution
models, and noticed that R-free increased during atomic positional
refinement.  This could be expected from the assertion that after
refinement to convergence, the final values should not depend on
the starting point: If I had started with a crude model and refined
against low resolution data, Rfree would not have gone as low as the
high-resolution model, so if I start with the high resolution model
and refine, Rfree should worsen to the same value as the structure
converges to the same point.

Thinking about the main purpose of the Rfree statistic, in a very
real way this tells me that the model was better before this step
of refinement, and it would be better to omit the minimization step.
Perhaps this is what the authors did.

   On the other hand it does not seem quite right to submit a model that
has simply been rigid-body-refined against the data- I would prefer to
refine to convergence and submit the best model that can be supported
by the data alone, rather than a better model which is really the model
from a better dataset repositioned in the new crystal.

Ed











--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology

Re: [ccp4bb] DANO from PDB

2007-06-13 Thread Bart Hazes

Eleanor Dodson wrote:
Well - the old way to estimate sigma was Sqrt(I**2 + 
constant_representing background) and then get

Sigma_F as ~ sqrt(SigI)/(2*F ) .
sftools would calculate that for you and append it to the output file..
Eleanor


I suggested something similar to Pete but sufficiently different that 
I'd like to post it. I expect that in the example above I**2 should be 
F**2 otherwise SigI is proportional to I when I >> background rather 
than proportional to Sqrt(I) as expected for pure counting statistics.


Bart

For any purist there is no good way. If you are looking for something that you 
can explain concisely in a methods section and that has at least some 
logic to it, you could convert your Fcalc values to intensities (F**2). Multiply 
these by a conversion factor C, with C being four divided by the average 
intensity at the highest resolution of the data set (C = 4/<I_highres>). Take the 
square root of this as your SigmaI.
The idea here is to convert the calculated intensities to the photon counts 
that would have been recorded in an experiment. A "typical" data set has an 
I/SigI of about 2 at the high-resolution limit. Since SigI is the square root 
of I if it is solely dependent on counting statistics, setting C to give an 
average I of 4 in the highest resolution shell should give an I/SigI of about 
two after you set SigI to the square root of I.


I don't actually expect that this will closely mimic an experimental 
I/SigI versus resolution pattern but it should be easy to calculate with 
sftools so you can go ahead and give it a try.
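
For what it's worth, a minimal sftools sketch of the recipe above (file names,
column labels and the numerical constants are placeholders; note that since
sqrt(C*Fc**2) = sqrt(C)*Fc you only need multiplications, so the CALC lines
stay in the same RPN style as the random-phase example posted earlier).
Suppose the average Fc**2 in the highest resolution shell came out at 325, so
C = 4/325 = 0.0123 and sqrt(C) = 0.111:

READ fcalc.mtz
CALC J col IFAKE = col FC col FC * 0.0123 *
CALC Q col SIGI = col FC 0.111 *
WRITE fcalc_sig.mtz

If your sftools build balks at the chained expression in the first CALC, split
it into two steps.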


Bart



Eleanor Peter Adrian Meyer wrote:


I add a fake "sigma" column for each "data" column because so many



programs require one.

This is slightly tangential, but does anyone know of a good way to
generate semi-realistic sigma values for calculated/simulated data?

The best I've been able to do is borrow from an experimental dataset of
the same protein (after scaling), but that doesn't work unless you've got
an experimental dataset corresponding to your simulated one.  I also 
tried

a least-squares fit (following a reference I don't have in front of
me...this was a while ago), which didn't result in a good fit for our
data.

Pete

Pete Meyer
Fu Lab
BMCB grad student
Cornell University


  







--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] Survey on computer usage in crystallography

2007-06-20 Thread Bart Hazes
I don't expect the "I'd be willing to assemble one from parts for under 
USD $2000" crowd will be large, but you don't have to do the assembling 
to get all you need well under the $2000 mark. The days when you needed 
the fastest computer money could buy, and still spent lots of time in the 
library reading while your rigid body refinement was chewing on the next 
cycle, are long gone.


Bart

David J. Schuller wrote:

On Tue, 2007-06-19 at 21:09 -0700, P Hubbard wrote:


Hi all,

I am doing a survey on computer usage in crystallography. The questionnaire 
can be found on the following web page:


http://www.bioscienceforum.com/survey.html


...

I don't care for the way several questions are posed. Examples:

"4.What would you consider is a reasonable price to pay for a
computer graphics workstation designed with crystallography in mind?

A) USD $2000-3000
B) USD $3000-5000
C) USD $5000+
"

What, no option for "I'd be willing to assemble one from parts for under
USD $2000"?

...
"If there were a choice, would you prefer stereo graphics displayed
using LCD-shuttered glasses or a head mounted display (often referred to
as a Virtual Reality headset)?"

That does not cover all the choices currently available, let alone what
I wish were available.





--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] Survey on computer usage in crystallography

2007-06-20 Thread Bart Hazes

Hi Paul,

I didn't intend to criticise the survey, I actually filled it out 
immediately, and am interested in the outcome. Perhaps more so for the 
issue of easing crystallographic software installation and updating than 
the stereographics part.


Wrt stereographics; I bought three sets of NuVision stereographics units 
with glasses when I started up my lab. I personally prefer to use 
side-by-side stereo and cross my eyes and most of my trainees don't 
appear to be interested in using the hardware stereo option. So the 
glasses mostly get used to impress high-school students and other visitors.


In hindsight I would have gotten one stereo-ready setup with 2 or 3 
sets of glasses. Similarly, I think it is effective to have one (or a 
few, depending on the size of the lab) higher-end number cruncher with large 
memory and disk storage, and pretty basic PCs for individual users.


Bart

P Hubbard wrote:

Hi,

If you build a system without stereo graphics, and use a standard 
monitor, I agree. In fact it can be done for under ~$1000. However, 
stereo systems are rather expensive (LCD stereo systems are VERY 
expensive).


I was just curious to see how popular stereo graphics is among 
crystallographers. I personally think it's a wonderful teaching tool 
which is currently under-utilized.


Paul


From: Bart Hazes <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Survey on computer usage in crystallography
Date: Wed, 20 Jun 2007 09:28:25 -0600

I don't expect the "I'd be willing to assemble one from parts for 
under USD $2000" crowd will be large but you don't have to do the 
assembling to get all you need well under the $2000 mark. The times 
that you needed the fastest computer money could buy and still spent 
lots of time in the library reading while your rigid body refinement 
was chewing on the next cycle is long gone.


Bart

David J. Schuller wrote:


On Tue, 2007-06-19 at 21:09 -0700, P Hubbard wrote:


Hi all,

I am doing a survey on computer usage in crystallography. The 
questionnaire can be found on the following web page:


http://www.bioscienceforum.com/survey.html



...

I don't care for the way several questions are posed. Examples:

"4.What would you consider is a reasonable price to pay for a
computer graphics workstation designed with crystallography in mind?

A) USD $2000-3000
B) USD $3000-5000
C) USD $5000+
"

What, no option for "I'd be willing to assemble one from parts for under
USD $2000"?

...
"If there were a choice, would you prefer stereo graphics displayed
using LCD-shuttered glasses or a head mounted display (often referred to
as a Virtual Reality headset)?"

That does not cover all the choices currently available, let alone what
I wish were available.


==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] difference density ripples around Hg atoms

2007-08-01 Thread Bart Hazes

Hi Klemens,

As friends of the Fourier transform we hate to see it truncated. 
Although others don't think this is your problem I personally think it 
very well may be. To get a truncation effect you must first have 
truncated your data.


- Is the I/SigI of your highest resolution data in the 1-2 region or 
more like 3 or higher?


- Second, truncation ripples are just that: oscillating negative and 
positive shells of density around the central atom density. The first 
negative ripple will be the strongest, but if you contour lower 
you may be able to see a second, positive one at a little greater 
distance (you do say "ripple layers" so you may already have spotted it).


The bad news is that, as far as I know, there is no remedy. The ripples 
are not due to your model, so no refinement trick can help you out (even 
with perfect experimental phases you would still see the ripples).
You can apply a de-sharpening B-factor to the data to weaken the high 
resolution terms. That would dampen the ripples but also harm the rest 
of your data.
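
(Concretely, such blurring multiplies every structure factor amplitude by 
exp(-B_blur*(sin(theta)/lambda)**2) = exp(-B_blur/(4*d**2)); it is simply the 
inverse of the usual map sharpening.)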


The good news is that the ripples don't really affect your model or the 
biological conclusions you derive from it. In the paper you will just 
have to confess that you didn't do your data collection properly and 
then get on with the show. Unfortunately, there are far too many papers 
whose native data sets were not collected to the diffraction limit. 
I think we need a "Save the Native Structure Factor" action group to 
protect the endangered high resolution native reflections. This is 
ALWAYS bad (the exception is for experimental phasing data sets) but 
only when you have a heavy atom do you see the ripples (I have had it 
myself with an ion as light as copper).


W.r.t. Kay's reply, I think the argument does not hold since it depends 
on how badly the data are truncated. E.g. truncation near the limit of 
diffraction will give few ripples, whereas a data set truncated at an 
I/SigI of 5 will show much more serious effects.


Bart

Kay Diederichs wrote:

Klemens Wild schrieb:


Dear friends of the Fourier transform,

I am refining a structure with 2 adjacent Hg atoms bound to cysteines 
of different monomers in the crystal contacts, which means I need to 
refine them as well. While the structure refines nicely (2.2 A data), 
I do not get rid of negative density ripple layers next to them (-10 
sigmas). My question: is this likely due to anisotropy of the soft 
mercury atoms (anisotropic B refinement decreases the ripples) or is 
this likely a summation truncation effect prominent for heavy atoms? 
Can I just anisotropically refine the mercuries while I keep the rest 
isotropic? I never saw this in a PDB entry. Suggestions are very welcome.


Greetings

Klemens Wild



Dear Klemens,

the height of a Fourier ripple should not exceed about 12% of the peak 
itself (just look at the maxima of sin(x)/x which is the Fourier 
transform of a truncation function). In reality it should even be lower 
due to the average temperature factor being >0.
Thus, only if your Hg peaks are on the order of 80 sigmas (which I 
doubt) does it appear justified to consider the 10 sigma peaks as ripples.


It is more likely that aniso refinement should be able to get rid of the 
"ripples".


best,
Kay



--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] difference density ripples around Hg atoms

2007-08-01 Thread Bart Hazes

Sorry Kay,

I completely agree with you and should have read your message more 
carefully before jumping to conclusions. I thought you suggested the 
ripples were not strong enough ... I'd better have my coffee now :)


Anyway, I don't think I wasted your time because your expanded 
explanation of the convolution theorem on this particular case is very 
useful as a reminder of this important concept.


Bart



Kay Diederichs wrote:

Bart Hazes schrieb:
...

W.r.t. Kay's reply I think the argument does not hold since it depends 
on how badly the data is truncated. E.g. truncated near the limit of 
diffraction will give few ripples whereas a data set truncated at 
I/SigI of 5 will have much more servious effects.


Bart



Bart,

if you truncate at the limit of diffraction (i.e. where there is no more 
signal) you will not get any ripple at all !


Of course, if you truncate at a resolution where there is significant 
signal (and I do agree with you in that respect: many people truncate 
their datasets at too low a resolution) there _will_ be Fourier ripples. 
However, a ripple is never as high as the peak itself.


To get a quantitative picture of the worst-case scenario, consider the 
following: truncation means multiplication of the data with a Heaviside 
function (that is 1 up to the chosen resolution limit, and 0 beyond). In 
real space, this translates into a series of ripples, arising by 
convolution of the true electron density with the Fourier transform of 
the Heaviside function. The Fourier transform of a one-dimensional 
Heaviside function is the function sin(x)/x . Convolution with sin(x)/x 
has the effect of
a) broadening (or "smearing") the true electron density, resulting in a 
low-resolution electron density map instead of the true one
b) adding ripples at certain distances (which can be calculated from the 
resolution) around each peak. The first negative ripple has an absolute 
value of less than 1/4 of the peak height, and the first positive ripple 
about 1/8 of the peak height.


So in the worst case (one-dimensional truncation of data) my estimate of 
12% was wrong - I estimated the height of the first positive ripple 
whereas Klemens reported the first negative ripple!


On the other hand, if I remember correctly, the Fourier transform of the 
3-dimensional Heaviside function (a filled sphere) is a Bessel function 
that has ripples which (I think) are lower than those of the 
one-dimensional Heaviside function. Surely somebody knows the function, 
and its peak heights?


best,

Kay



--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] centrosymm structure

2007-08-24 Thread Bart Hazes

Hi Bernhard,

Type "centrosymmetric rubredoxin" into Google and you'll get a few papers, 
including what I think was the original work by Zawadzke & Berg.


Bart

Zawadzke LE, Berg JM. The structure of a centrosymmetric protein crystal.
Proteins. 1993 Jul;16(3):301-5.

Crystals of racemic rubredoxin, prepared by independent chemical 
synthesis of the two enantiomers, have been grown and characterized. The 
unit cell contains two molecules, one of each enantiomer. Examination of 
the intensity distribution in the diffraction pattern revealed that the 
crystals are centrosymmetric. This was confirmed by solution of the 
structure to 2 A resolution via molecular replacement methods. The 
electron density maps are of very high quality due to the fact that the 
phase of each reflection must be exactly 0 degrees or exactly 180 
degrees. These results demonstrate the feasibility of using synthetic 
racemic proteins to yield centrosymmetric protein crystals with electron 
density maps that have very low phase error and model bias.


Bernhard Rupp wrote:

Dear All,

there was a paper (quite) a while ago where someone made
for the first time a racemic protein mixture, obtained a 
centrosymmetric structure and solved it (not the 2003

PNAS paper by the Eisenberg grp).

Hints appreciated.

Thx, br
-
Bernhard Rupp
001 (925) 209-7429
+43 (676) 571-0536
[EMAIL PROTECTED]
[EMAIL PROTECTED] 
http://www.ruppweb.org/ 
-

People can be divided in three classes:
The few who make things happen
The many who watch things happen
And the overwhelming majority 
who have no idea what is happening.

-





--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] Strange diffraction images

2007-08-27 Thread Bart Hazes
I believe Wayne Hendrickson's lab has had such a case with a 10-fold 
symmetric mollusc hemocyanin crystal. This must have been in the early 
90's and to my knowledge they were never able to solve the structure 
even though it diffracted beyond 2 Angstrom.


I'm not sure if this work has been published but you can check the paper 
describing a single domain of this protein complex or contact one of its 
authors.


Bart

J Mol Biol. 1998 May 15;278(4):855-70.

Crystal structure of a functional unit from Octopus hemocyanin.
Cuff ME, Miller KI, van Holde KE, Hendrickson WA.

Jacob Keller wrote:
I am still eagerly awaiting a biomacromolecular quasicrystal with a five-fold symmetric diffraction 
pattern. It seems that this is entirely possible, if one gets roughly Penrose-tile shaped oligomers 
somehow. But wow, how would you solve that thing? I guess one would have to modify software from

the small molecule or matsci folks.

Jacob


==Original message text===
On Mon, 27 Aug 2007 11:19:15 am CDT "George M. Sheldrick" wrote:


Some small molecule crystallographers have specialized in solving and 
refining structures that, exactly as you describe it, consist of two (or 
more) interpenetrating, non-commensurable lattices. The usual approach is 
to describe the crystal in up to six-dimensional space. The programs SAINT 
and EVALCCD are able to integrate such diffraction patterns and
SADABS is able to scale them. However the case in point is probably 
commensurate.


George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry, 
University of Goettingen,

Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-2582


On Mon, 27 Aug 2007, Jacob Keller wrote:



What a beautiful and interesting diffraction pattern!

To me, it seems that there is a blurred set of spots with different cell 
dimensions, although
nearly the same, underlying the ordered diffraction pattern. A possible 
interpretation occurred to
me, that the ordered part of the crystal is supported by a less-ordered lattice 
of slightly
different dimensions, which, because the crystal is a like a layer-cake of 2-d 
crystals, need not
be commensurable in the short range with the ordered lattice. The nicely-ordered "cake" part of the 
crystal you solved, but the "frosting" between is of a different, less ordered nature, giving rise

to the diffuse pattern which has slightly different lattice spacing. I would 
have to see more
images to know whether this apparent lattice-spacing phenomenon is consistent, 
but it at least
seems that way to me from the images you put on the web. I would shudder to 
think of indexing it,
however.

All the best,

Jacob Keller

ps I wonder whether a crystal was ever solved which had two interpenetrating, 
non-commensurable
lattices in it. That would be pretty fantastic.




Jacob,

Some small molecule crystallographers have specialized in solving and
refining structures that, exactly as you describe it, consist of two
interpenetrating, non-commensurate lattices. The usual approach is
to index the diffraction pattern in multiple dimensional space 
('superspace'). The programs SAINT and EVALCCD are able to integrate 
diffraction patterns in up to six dimensions, SADABS is able to scale 
them and the refinement is almost always performed with Petricek's 
program JANA2000: 

http://www-xray.fzu.cz/jana/Jana2000/jana.html 
However the case in point is probably commensurate.


George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-2582
===End of original message text===



***
Jacob Keller
Northwestern University
6541 N. Francisco #3
Chicago IL 60645
(847)467-4049
[EMAIL PROTECTED]
***





--

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] alternating strong/weak intensities in reciprocal planes - P622

2007-08-27 Thread Bart Hazes

Hi Jorge,

The strong h, k, l=2n and weak h, k, l=2n+1 pattern suggests pseudo body 
centering. Does the off-origin Patterson peak lie at/near 0.5 0.5 0.5?


You could get pseudo body centering if an NCS 2-fold lies parallel to a 
crystallographic 2(1) or 6(3) screw axis, with the NCS 2-fold a quarter 
(not half) of a unit cell distant from the crystallographic axis.


The fact that you get good merging statistics in P622 even at the high 
resolution limit suggests to me that you either have that space group or 
a lower symmetry subgroup with a nearly 0.5 twin fraction.


Even if you figure out completely what your pathological crystal 
conditions are it may be hard to refine the structure properly. In some 
cases crystals can snap from a pseudo- to a proper crystal by adding the 
right additive. This may be worth trying while you break your head on 
this case.


One problem is that whenever you make a model that obeys the pseudo body 
centering you are going to get a significant R-factor and correlation 
coefficient, even if the actual model is wrong. If you get a clear 
rotation function solution, which is not affected by the pseudo 
translation, it may still work but otherwise it could be hard to know if 
you got the right solution or not. Trying a whole bunch of rotation 
function solutions and see which one will refine to a significantly 
lower R-free is one thing to try.


Bart

Jorge Iulek wrote:

Dear all,

Please, maybe you could give some suggestions to the problem below.

1) Images show smeared spots, but xds did a good job integrating them. 
The cell is 229, 229, 72, trigonal, and we see alternating strong and 
weak rows of spots in the images (spots near each other, but rows more 
separated, must be by c*). They were scaled with xscale, P622 (no 
systematic abscences), R_symm = 5.3 (15.1), I/sigI = 34 (14) and 
redundancy = 7.3 (6.8), resolution 2.8 A. Reciprocal space show strong 
spots at h, k, l=2n and weak spots at h, k, l=2n+1 (I mean, l=2n 
intensities are practically all higher than l=2n+1 intensities, as 
expected from visual inspection of the images). Within planes h, k, 
l=2n+1, the average intensity is clearly and "much" *higher at high 
resolution than at low resolution*. Also, within planes h, k, l=2n, a 
subjective observation is that average intensity apparently does not 
decay much from low to high resolution. The data were trucated with 
truncate, which calculated Wilson B factor to be 35 A**2.


2) Xtriage points a high (66 % of the origin) off-origin Patterson peak. 
Also, ML estimate of overall B value of F,SIGF = 25.26 A**2.


3) I suspect to have a 2-fold NCS parallel to a (or b), halfway the c 
parameter, which is "almost" crystallographic.


4) I submitted the data to the Balbes server which, using 
pseudo-translational symmetry, suggested some solutions, one with good 
contrast to the others, with a 222 tetramer built from a structure with 
40% identity and 58% positives, of a well conserved fold.


5) I cannot refine below 49 % with either refmac5, phenix.refine or CNS. 
Maps are messy, except for rather few residues and short stretches near 
the active site, making it almost impossible to rebuild from there. 
Strange, to me, is that all programs "freeze" all B-factors, taking them 
to the program minimum (CNS lowers to almost its minimum). Might this be 
due to what I observed in the reciprocal space as related in "1"? If so, 
might my (intensity) scaling procedure have messed up the intensities 
due to their intrinsic "property" of being stronger in alternating 
planes? How to overcome this?


6) I tried some different scaling strategies *in the refinement step*, 
no success at all.


7) A Patterson of the solution from Balbes also shows an off-origin 
Patterson peak at the same position as in the native data, although a 
little lower.


8) Processed in P6, P312 and P321, all of course suggest twinning.

I would appreciate suggestions, pointers to similar cases, etc... In 
fact, I currently wonder why refinement programs take the B-factors to 
such low values.


Many thanks,

Jorge





--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] R-sleep

2007-10-01 Thread Bart Hazes
I haven't read the paper, so perhaps shouldn't say anything yet, but 
here goes.


For me Rfree is primarily a tool to help choose the refinement protocol, 
set the relative weight for geometry restraints versus crystallographic 
data, B-value restraints etc. Trying different parameter settings and 
picking the one that reduces Rfree the most is what, in my mind, Rfree 
was designed for. Sure, there is statistical noise in Rfree, and by 
picking the lowest Rfree you may be selecting for "favourable noise" 
rather than the best model, but it is still your best indicator of model 
quality, and the quality differences between models with very similar 
Rfree values are probably not worth losing (R)sleep over.


The big difference, I think, is that in refinement the big enemy is a 
too low observation/parameter ratio with Rfree acting as the indicator 
to reduce overfitting. In selecting appropriate settings for a few 
global parameters there just isn't the same risk of overfitting. Using 
multi-start torsion-angle refinement and picking the solution with the 
lowest Rfree is not that different. Are you really biasing Rfree by 
picking the run with the lowest value or are you truly picking the best 
solution? Even if the solution you picked was not the very best due to 
statistical noise in Rfree, in continuing refinement the statistical 
benefits are probably not going to carry over into the rest of the 
refinement.


I'm sure there is going to be a lot of different opinions on this one...

Bart

Mark J. van Raaij wrote:

Dear All,

the short paper by Gerard Kleywegt (ActaD 63, 939-940) treats an 
interesting subject (at least I think so...). I agree that what we are 
now doing in many cases is effectively refining against Rfree. For 
example, the standard CNS torsion angle refinement does n refinement 
trials with randomised starting points. If you then take the one with 
lowest Rfree (or let a script do this for you), you are biasing Rfree!
Therefore, his proposal to put an extra set of reflections in a dormant 
"vault" (R-sleep) sounds like a good idea to me. However, how would the 
"vault" be implemented to be effective? If left to the experimenter, it 
would be very tempting to check R-sleep once in a while (or often) 
during refinement, rendering it useless as an unbiased validator. 


or am I being paranoid and too pessimistic?

Mark J. van Raaij
Unidad de Bioquímica Estructural
Dpto de Bioquímica, Facultad de Farmacia
and
Unidad de Rayos X, Edificio CACTUS
Universidad de Santiago
15782 Santiago de Compostela
Spain
http://web.usc.es/~vanraaij/





--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] carving up maps (was re: pymol help)

2007-10-29 Thread Bart Hazes

Anastassis Perrakis wrote:

Dear Andrew,

Thank you for that posting; I would like to simply agree with the 
Bobscript manual and your suggested practice.


I think the 'carve' commands should not be there; if you wonder why, 
take a ligand, put it wherever you want in space,
set the map sigma to -0.5, display a map with carve=1.2 and think if 
this picture is informative, especially in the context
of your favorite competitor publishing it in Nature. 


A.



The fact that a tool can be misused does not necessarily mean that there 
is something wrong with the tool, just with some users. I agree that 
with current ray-traced images there is less need for this tool than in 
the old black&white line diagrams where a lack of 3D perception easily 
led to cluttered images. However, if showing the density of interest 
really benefits from this trick then it's fine with me as long as you 
indicate in the legend what map and carve settings were used.


Bart

--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] an over refined structure

2008-02-08 Thread Bart Hazes

Dale Tronrud wrote:

[EMAIL PROTECTED] wrote:
 > Rotational near-crystallographic ncs is easy to handle this way, but
 > what about translational pseudo-symmetry (or should that be
 > pseudo-translational symmetry)? In such cases one whole set of spots is
 > systematically weaker than the other set.  Then what is the
 > "theoretically correct" way to calculate Rfree?  Write one's own code to
 > sort the spots into two piles?
 > Phoebe
 >

Dear Phoebe,

   I've always been a fan of splitting the test set in these situations.
The weak set of reflections provide information about the differences
between the ncs mates (and the deviation of the ncs operator from a
true crystallography operator) while the strong reflections provide
information about the average of the ncs mates.  If you mix the two
sets in your Rfree calculation the strong set will tend to dominate
and will obscure the consequences of allowing your ncs mates too much
freedom to differ.


I haven't had to deal with this situation but my first impression is to 
use the strong reflections for Rfree. For the strong reflections, and 
any normal data, Rwork & Rfree are dominated by model errors and not 
measurement errors. For the weak reflections measurement errors become 
more significant if not dominant. In that case Rwork & Rfree will not be 
a sensitive measure to judge model improvement and refinement strategy.


A second and possibly more important issue arises with determination of 
Sigmaa values for maximum likelihood refinement. Sigmaa values are 
related to the correlation between Fc and Fo amplitudes. When half of 
your observed data is systematically weakened then this correlation is 
going to be very high, even if the model is poor or completely wrong, as 
long as it obeys the same pseudo-translation. If you only use the strong 
reflections for Rfree I expect that should get around some of the issue.
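
As a quick numerical illustration (a pure-Python toy; all numbers are
made up): two completely unrelated sets of amplitudes acquire a high
correlation as soon as the same strong/weak modulation is imposed on
both of them.

import random, statistics

def corr(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.pstdev(x), statistics.pstdev(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) * sx * sy)

random.seed(0)
n = 10000
fo = [random.lognormvariate(0, 0.3) for _ in range(n)]  # "observed" amplitudes
fc = [random.lognormvariate(0, 0.3) for _ in range(n)]  # unrelated "model"
print(round(corr(fo, fc), 2))                           # essentially zero

# impose the same strong/weak modulation on both sets
mod = [1.0 if i % 2 == 0 else 0.1 for i in range(n)]
print(round(corr([f * m for f, m in zip(fo, mod)],
                 [f * m for f, m in zip(fc, mod)]), 2)) # now large, roughly 0.8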


Of course it can be valuable to also monitor the weak reflections to 
optimize NCS restraints but probably not to drive maximum likelihood 
refinement or to make general refinement strategy choices.


Bart


======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] an over refined structure

2008-02-11 Thread Bart Hazes

Dale Tronrud wrote:

Bart Hazes wrote:


Dale Tronrud wrote:


[EMAIL PROTECTED] wrote:
 > Rotational near-crystallographic ncs is easy to handle this way, but
 > what about translational pseudo-symmetry (or should that be
 > pseudo-translational symmetry)? In such cases one whole set of 
spots is

 > systematically weaker than the other set.  Then what is the
 > "theoretically correct" way to calculate Rfree?  Write one's own 
code to

 > sort the spots into two piles?
 > Phoebe
 >

Dear Phoebe,

   I've always been a fan of splitting the test set in these situations.
The weak set of reflections provide information about the differences
between the ncs mates (and the deviation of the ncs operator from a
true crystallography operator) while the strong reflections provide
information about the average of the ncs mates.  If you mix the two
sets in your Rfree calculation the strong set will tend to dominate
and will obscure the consequences of allowing your ncs mates too much 
freedom to differ.



I haven't had to deal with this situation but my first impression is 
to use the strong reflections for Rfree. For the strong reflections, 
and any normal data, Rwork & Rfree are dominated by model errors and 
not measurement errors. For the weak reflections measurement errors 
become more significant if not dominant. In that case Rwork & Rfree 
will not be a sensitive measure to judge model improvement and 
refinement strategy.


A second and possibly more important issue arises with determination 
of Sigmaa values for maximum likelihood refinement. Sigmaa values are 
related to the correlation between Fc and Fo amplitudes. When half of 
your observed data is systematically weakened then this correlation is 
going to be very high, even if the model is poor or completely wrong, 
as long as it obeys the same pseudo-translation. If you only use the 
strong reflections for Rfree I expect that should get around some of 
the issue.


Of course it can be valuable to also monitor the weak reflections to 
optimize NCS restraints but probably not to drive maximum likelihood 
refinement or to make general refinement strategy choices.


Bart


Dear Bart,

   I agree that the way one uses the test set depends critically on the
question you are asking.  In my letter I was focusing on that aspect
of the pseudo centered crystal problem where the strong/weak divide can
be used to particular advantage.

   I have not thought as much about the matter of using the test set
to estimate the level of uncertainty in the parameters of a given model.
My gut response is that the strong/weak distinction is still significant.
Since the weak reflections contain information about the differences
between the two, ncs related, copies I suspect that a great many systematic
"errors" are subtracted out.

   For example, if your model contains isotropic B's when, of course,
the atoms move anisotropically, your maps will contain difference features
due to these unmodeled motions.  Since the anisotropic motions are
probably common to the two molecules, these features will be present in
the average structure described by the strong reflections but will be
subtracted out in the "difference" structure described by the weak
reflections.  This argument implies to me that the strong reflections
need to be judged by the Sigma A derived from the strong test set and
the weak reflections judged by the weak test set.

Dale Tronrud


Hi Dale, I agree with the above but think there is yet another way to 
think about this which may also suggest a more general solution.


In a case of pseudo-translational NCS you can separate three effects.

1) An overall modulation of reflection intensity that depends just on the 
pseudo translation vector (x,y,z) and the reflection indices (h,k,l).
2) Deviations from pure translation due to the NCS axis not being 
perfectly parallel to the crystallographic axis.
3) Deviations from pure translation due to local structural deviations 
between NCS-related molecules.


The problems I was referring to are due to the first effect, which 
messes up the expected structure factor intensity distribution. Your 
comments relate to the second and especially third effects where the 
weak reflections inform about actual differences between the NCS-related 
molecules.


Sigmaa calculations already correct for expected intensity effects due 
to crystallographic symmetry but not for pseudo-translation effects. 
Given the translation component of the pseudotranslational symmetry it 
is possible to estimate the expected intensity for a reflection and I 
would think that could be used as a correction factor just like we do 
with normal crystallographic symmetry. This correction would basically 
transform the bimodal intensity distribution due to strong and weak 
subsets of reflections back to a normal unimodal distribution which, in 
an ideal world, only differ

Re: [ccp4bb] counting constraints?

2008-02-14 Thread Bart Hazes

Hi Pete,

In your example it would count as 4 restraints, not constraints, and 
certainly not 4 observations or 4 parameters. It is not clear to me how 
to quantify the information content in restraints; it probably depends 
on the type of restraint and surely on the weight. Maybe information 
theory has some ideas if you are really interested.
For real constraints, which fix parameters of the model one way or 
another, it may be easier. For instance, imposing exact NCS 2-fold 
symmetry reduces the parameters by a factor of 2.
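
A back-of-the-envelope sketch of that factor of 2 (the atom and
reflection counts below are invented purely for illustration):

n_atoms = 2 * 1600          # two NCS copies of a ~200-residue protein
params_per_atom = 4         # x, y, z and an isotropic B
n_reflections = 25000       # unique reflections

params = n_atoms * params_per_atom
print(n_reflections / params)        # about 2 observations per parameter

# exact (constrained) NCS: only one copy is refined, so parameters halve
print(n_reflections / (params / 2))  # the ratio doubles to about 3.9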


Bart

Meyer, Peter wrote:

Hi,

The recent discussion on Rwork/Rfree ratio reminded me of something I was 
wondering about (*).  When counting constraints as observations for determining 
the observation to parameter ratio, is each unique constraint counted, or each 
time a given constraint is used.  For example, if there are 4 carbon oxygen 
bonds (assuming the same parameters, let's say serine beta-carbon to serine 
gamma-oxygen), would this count as 4 constraints as observations, or 1?

Intuitively, it seems to me like it should be counting unique constraints (although as near as I can tell these aren't listed in refmac5 logfiles).  But I don't have a clear explanation for why, and of course I could be wrong on this.  



Thanks,


Pete

* Rough translation - I'm about to ask another stupid question.  Not like it's 
the first time.





--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] counting constraints?

2008-02-14 Thread Bart Hazes

Thanks Ian,

I saw experimental observations and restraints (empirical observations) 
as completely different but I now see they are just two different 
sources of information that restrain the model parameters. So when you 
are counting they can all get "one vote" as long as the observations are 
independent. However, from an information content point of view the 
weight, or the standard deviation for Fobs, should still matter when 
multiple observations/restraints affect a model parameter.


For instance, torsion angle restraints tend to have broad distributions 
leading to low weights and I would expect adding them as observations to 
the refinement will not contribute greatly to the "available 
information" to define the model, unless there is not much information 
from other sources to start with. In the recent posting of using 
secondary structure conformational restraints at 3.6A this may start to 
make a difference.


Bart

Ian Tickle wrote:

Peter, Bart

Actually the restraint weight doesn't affect the restraint count one
iota and as far as counting is concerned each restraint has exactly one
'vote' in the count.  However there is an important proviso: the
restraints must be completely independent to contribute fully to the
count.  Suppose you have a torsion restraint, say on a methoxyphenyl
group (an example close to my heart since we have endless debates about
it!), and suppose the weight on the restraint is absolutely miniscule,
but still non-zero (we'd better say it's > than the machine precision to
avoid rounding problems).  Provided no other restraint or observation
(restraints and observations are of course essentially the same thing)
affects that torsion angle it will have its full effect, in fact the
effect won't depend on the weight.  Of course as soon as you have other
restraints which affect that same torsion angle they will compete with
each other depending on their relative weights, and you can't count them
as independent any more.

To answer Peter's original question each *active* restraint is counted.
The question of inactive restraints becomes relevant when considering
e.g. VDW restraints which normally only become active when the distance
becomes less than a threshold.

-- Ian



-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Bart Hazes

Sent: 14 February 2008 15:53
To: Meyer, Peter
Cc: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] counting constraints?

Hi Pete,

In your example it would count as 4 restraints, not constraints, and 
certainly not 4 observations or 4 parameters. It is not clear 
to me how 
to quantify the information content in restraints, it 
probably depends 
on the type of restraint and surely on the weight. Maybe information 
theory has some ideas if you are really interested.
For real constraints, which fix parameters of the model one way or 
another, it may be easier. For instance imposing exact NCS 2-fold 
symmetry reduces the parameters by a factor of 2.


Bart

Meyer, Peter wrote:


Hi,

The recent discussion on Rwork/Rfree ratio reminded me of 


something I was wondering about (*).  When counting 
constraints as observations for determining the observation 
to parameter ratio, is each unique constraint counted, or 
each time a given constraint is used.  For example, if there 
are 4 carbon oxygen bonds (assuming the same parameters, 
let's say serine beta-carbon to serine gamma-oxygen), would 
this count as 4 constraints as observations, or 1?


Intuitively, it seems to me like it should be counting 


unique constraints (although as near as I can tell these 
aren't listed in refmac5 logfiles).  But I don't have a clear 
explanation for why, and of course I could be wrong on this.  



Thanks,


Pete

* Rough translation - I'm about to ask another stupid 


question.  Not like it's the first time.





--

======


Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==









Re: [ccp4bb] question about processing data

2008-03-17 Thread Bart Hazes

Melody Lin wrote:

Hi all,

I have always been wondering... for a data set diffracting to say 2.15 
Angstrom but in the highest resolution shell (2.25-2.15) the 
completeness is 74%, should I merge all the data and call it a 2.15 
A dataset, or should I cut the data set to say 2.25 A where the highest 
resolution shell has better completeness (>85%)? What is an acceptable 
completeness value for the highest resolution shell?


Thank you.

Best,
Melody


Hi Melody,

This reply is not aimed at you directly as this situation seems to have 
become systemic in the field. So thanks for bringing it up!



We can have a long, and mostly aimless, discussion on what resolution 
you should claim for your data set but DON'T throw away good data to 
make the statistics look better. At high resolution the statistics are 
supposed to get worse! What matters is if the data still contain useful 
information. The fact that 26% of the data is missing does not normally 
mean that anything is wrong with the 74% that you did measure. Perhaps 
you used a square detector and didn't place it close enough to capture 
the full resolution, or perhaps your diffraction pattern is anisotropic.


The only reason to throw out data is if they are too inaccurate for your 
purpose. When your data is used for phasing, especially anomalous 
phasing, there is reason to focus on data quality, but I see far too 
many native data sets that make poor use of the diffraction potential of 
the crystal. I thought this was due to people not properly collecting 
the data, but now it seems that people are simply throwing away good 
data because they don't like the statistics.


So my advice; if your high resolution shell data has poor completeness 
then check why this happened. If you did not collect the data properly 
then let it be a lesson for the next data collection trip. If it 
resulted from some issue of the crystal then decide if the measured data 
is messed up as well. If not then use all the data you trust, which 
means there is useful signal (I/SigI >1.5 or >2.0 depending on who you 
talk to) and no problems leading to systematic errors or outliers.


Bart

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] writing scripts-off topic

2012-01-23 Thread Bart Hazes

On 12-01-23 09:59 PM, Ethan Merritt wrote:

On Monday, 23 January 2012, Yuri Pompeu wrote:

Hello Everyone,
I want to play around with some coding/programming. Just simple calculations 
from an
input PDB file, B factors averages, occupancies, molecular weight, so forth...
What should I use python,C++, visual basic?

What you describe is primarily a task of processing the text in a PDB file.
I would recommend perl, with python as a more trendy alternative.

If this is to be a springboard for a larger project, then you might choose
instead to use a standard library like cctbx to do the fiddly stuff and
call it from a higher level language (C or C++).

Ethan


I have used a number of languages and have found only one I really 
disliked, that being perl. It is hard for me to imagine that this 
language was developed by a linguist, yet in my eyes it is the least 
natural language from a human comprehension point of view. In contrast, 
python is much more intuitive and should be very suitable for the tasks 
you describe.


my 2p

Bart


Re: [ccp4bb] Crystallization robot and trypsin

2012-01-24 Thread Bart Hazes

On 12-01-24 08:39 AM, Regina Kettering wrote:

We have a Honeybee system but do not usually use proteases.  The biggest
problem we have found is that if anything precipitates in the tips they
have to be washed very well, usually with water or ethanol.  The ceramic
tip can be washed using low concentrations of HCl (0.1M), which I believe
would also alleviate the protease problem.

Regina

At some point we aspirated an expression plasmid, let it sit in the
tip for a while and then went through different washing protocols,
water, high-salt, 0.1M NaOH, and then dispensed a small volume of
water into a PCR reaction. We found just a water wash got rid of
most of the DNA, as long as the outside of the tip was also washed.
After the high salt we could no longer detect the DNA by PCR.
Although protein is not the same as DNA I would have no concern
about protease contamination if you implement a good wash/strip
protocol for the protein dispense tip.

Bart
  



Re: [ccp4bb] writing scripts-off topic

2012-01-24 Thread Bart Hazes

On 12-01-24 09:36 AM, Ian Tickle wrote:

On 24 January 2012 14:19, David Schuller  wrote:

On 01/24/12 00:41, Bart Hazes wrote:
www.cs.siue.edu/~astefik/papers/StefikPlateau2011.pdf

An Empirical Comparison of the Accuracy Rates of Novices using the Quorum,
Perl, and Randomo Programming Languages
A. Stefik, S. Siebert, M. Stefik, K. Slattery

Abstract: "... Perl users were unable to write programs more accurately than
those using a language designed by chance."

... and from the same paper: "Students tell us that the syntax they
are learning (in C++ at our school), makes no sense and some have
difficulty writing even basic computer programs.".

Maybe the difficulty the students in the test group experienced had a
lot to do with the fact they were complete novices and had no
experience whatsoever of any computer language; also the language
syntax was not explained to them prior to the test, they had to deduce
it during the test time itself (21 mins) from code samples.

Note that Python wasn't tested, so the fact that they found Perl
syntax difficult to deduce from the samples doesn't necessarily imply
that they wouldn't also have had the same difficulty with Python.  One
of the difficulties reported with Perl was the use of the 'for'
keyword to introduce a loop (novices apparently report that 'repeat'
is 7 times more intuitive than 'for').  But Python and Perl (and C/C++
of course) both use the 'for' keyword in a loop construct, and maybe
all this paper proves is that 'repeat' is easier to comprehend than
'for' (or any other loop syntax)!  I remember when I first learned
Fortran the 'DO' loop was the hardest concept to grasp (not so much
the syntax, but the concept of looping itself, with variables
potentially having different values on each pass through the loop):
this was followed closely by the FORMAT statement!  I think every
programming novice finds the concept of looping difficult at first in
whatever language they are using: you can often recognise novice code
because it studiously avoids the use of loops!

Personally I think there's plenty of opportunity to write bad (and
good) code in any language; for me the choice of language is a
personal one and not one I lose any sleep over.  Far more important to
me than writing beautiful code is getting the algorithm right,
particularly as it affects numerical precision.  Debugging the syntax
is the easy bit, debugging the algorithm is always the hard part.

Cheers

-- Ian

I agree with all of the above but it does not help someone who is asking 
about what language(s) to look into for a first foray into programming. 
From the original description Yuri is one of the many people who want 
to learn computer programming in general and apply it, probably on a 
rather infrequent basis, to relatively straightforward tasks that don't 
rely on speed, integration with the internet, special libraries etc. In 
such cases I think python should be at, or near, the top of the list of 
programming languages to look at and perl near the bottom. That doesn't 
mean python is not suitable for much more demanding tasks but if you are 
going to program on a daily basis or start on a project with special 
needs you really should study the strengths and weaknesses of each 
language in more detail.


Bart


PS: I think even novices should not have too much difficulty 
understanding how the example program below produces the results at the 
bottom.


languageList = ["python", "java", "...", "perl"]

print "Ranking of some computer languages based on ease of use"
i = 1
for language in languageList:
  print i, language
  i = i + 1


result

Ranking of some computer languages based on ease of use
1 python
2 java
3 ...
4 perl


Re: [ccp4bb] quasispecies

2012-01-24 Thread Bart Hazes

On 12-01-24 11:20 AM, Jacob Keller wrote:

Inspired by the recent post about "quasispecies:"

I have been bothered recently by the following problem: why do species
of genetic uniformity exist at all (or do they?)? This first came up
when I saw a Nature paper describing live bacteria extracted from a
supposedly 250-million-year-old salt crystal whose 16S RNA was 99%
identical to marismortui bacteria (ref below). What? Are the bacteria
the same now as 250 million years ago? But there is a further
question: given the assumptions of evolution, why should there be any
bacterium whose genome is the same as any other, assuming that
equivalent codons are really equivalent (or at least roughly so), and
that even at the protein level, there is such a thing as "neutral
drift?" After all, we even see in our lab cultures that they (at least
e coli) mutate fairly frequently, so why is there such a thing as "e
coli" at all, at least at the nucleotide level? I don't think we
usually say that each bacterial species is totally optimized in all
its features, do we? Even assuming that every single protein must be
just so, shouldn't there be as many species of e coli as there are
possible genomes encoding the same protein set, i.e. some extremely
large number? Why is there any uniformity at all? Or IS there--maybe
the bacteria too are only quasispecies...? And maybe also...

JPK



To my knowledge there is no universally accepted definition of a species 
but it certainly does not involve genetic identity. You use "genetic 
uniformity" but I am not sure how you define uniform. Even in a single 
generation, the chromosomes of a child have half a dozen or so mutations 
relative to the source chromosomes from its parents.
Many definitions focus on genetic isolation but that implies that in 
past centuries protestants and catholics were distinct species as 
inter-denominational marriage was a definite no-no. It also gets really 
messy for all life forms with non-sexual reproduction. For instance, if 
you consider a bacterium to reproduce clonally, then each individual is 
genetically isolated from every other, and thus there is one species per 
individual. In one form or another it all boils down to genetic affinity 
to a shared common ancestor, but that does not give you a neat criterion 
for what a species is, and some people are going to bring up horizontal 
gene transfer to further muddy the waters.


To get the official word on the concept of a quasi species you have to 
asked an evolutionary biologist, but since you were asking on the CCP4 
here is my interpretation:


For some RNA viruses the rate of mutation is so high that they basically 
sample a flat region of the fitness landscape. If you could take two 
individual viruses out of this sample to establish two independent 
infections then over time each will start to re-sample the same flat 
landscape. In other words, there is not a single unique, or predominant, 
sequence that represents the species but a pool of "near-equal" fitness 
variants.


To some extent I feel that this is always the case, but for "normal" 
organisms the sampling rate of fitness space is slow and genetic 
differences between individuals are dominated by mutations passed down 
by vertical descent. In contrast, if you sequence two viruses from the 
above two infections their genetic distance will be similar to the 
genetic distances between individuals within a single infection.


Bart


Re: [ccp4bb] sftools expand

2012-02-06 Thread Bart Hazes

On 12-02-06 08:37 AM, wtempel wrote:

Hello,
here is a question about the EXPAND command in SFTOOLS, specifically
its effect on a free reflection flag. Do the flag values get copied to
newly generated reflections based on symmetry, for example in the case
of a P622 ->  P6 expansion?
many thanks,
Wolfram Tempel

During expansion all data columns get copied as is, so Rfree flags will 
be too (the only exceptions are phase columns, which are adjusted as 
necessary when there are translational components to the 
crystallographic symmetry).
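
The adjustment in question is the standard phase shift for a symmetry
operation x' = Rx + t: the generated reflection h' = hR gets phase
phi(h') = phi(h) - 360*(h.t). A minimal sketch, with a made-up example
operator (a 2(1) screw along c):

def expand_phase(hkl, phi_deg, R, t):
    """Index and phase of the reflection generated from (hkl, phi) by the
    symmetry operation x' = Rx + t, with t a fractional translation."""
    h, k, l = hkl
    # h' = h R  (row-vector convention)
    h_new = tuple(h * R[0][j] + k * R[1][j] + l * R[2][j] for j in range(3))
    phi_new = (phi_deg - 360.0 * (h * t[0] + k * t[1] + l * t[2])) % 360.0
    return h_new, phi_new

R = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]   # (x, y, z) -> (-x, -y, z + 1/2)
t = (0.0, 0.0, 0.5)
print(expand_phase((2, 3, 5), 10.0, R, t))  # ((-2, -3, 5), 190.0)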


Bart


Re: [ccp4bb] choice of wavelength

2012-02-15 Thread Bart Hazes
Diffracted intensity goes up by the cube of the wavelength, but so does 
absorption, and I don't know exactly about radiation damage. One 
interesting point is that on image plate and CCD detectors the signal is 
also proportional to photon energy, so doubling the wavelength gives 8 
times the diffracted intensity, but only 4 times the signal on 
integrating detectors (assuming the full photon energy is captured). So 
it would be interesting to see how the equation works out on the new 
counting detectors, where the signal does not depend on photon energy. 
Another point to take into account is that beamlines can have different 
optimal wavelength ranges. Typically, your beamline guy/gal should be 
the one to ask. Maybe James Holton will chime in on this.
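
The arithmetic above, as a few lines of Python (idealized: absorption,
detector thickness and beamline-specific factors are all ignored):

def relative_signal(wavelength_ratio, counting_detector):
    """Recorded signal relative to a reference wavelength, assuming
    diffracted intensity scales with wavelength cubed and that
    energy-integrating detectors also scale with photon energy
    (i.e. with 1/wavelength)."""
    diffracted = wavelength_ratio ** 3
    return diffracted if counting_detector else diffracted / wavelength_ratio

print(relative_signal(2.0, counting_detector=False))  # 4.0x on CCD/image plate
print(relative_signal(2.0, counting_detector=True))   # 8.0x on a photon counter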


Bart

On 12-02-15 04:21 PM, Jacob Keller wrote:

Well, but there is more scattering with lower energy as well. The
salient parameter should probably be scattering per damage. I remember
reading some systematic studies a while back in which wavelength
choice ended up being insignificant, but perhaps there is more info
now, or perhaps I am remembering wrong?

Jacob

On Wed, Feb 15, 2012 at 5:14 PM, Bosch, Juergen  wrote:

No impact ? Longer wavelength more absorption more damage. But between the 
choices given no problem.
Spread of spots might be better with 1.0 versus 0.9 but that depends on your 
cell and also how big your detector is. Given your current resolution none of 
the mentioned issues are deal breakers.

Jürgen

..
Jürgen Bosch
Johns Hopkins Bloomberg School of Public Health
Department of Biochemistry&  Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Phone: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-3655
http://web.mac.com/bosch_lab/

On Feb 15, 2012, at 18:08, "Jacob Keller"  
wrote:


I would say the better practice would be to collect higher
multiplicity/completeness, which should have a great impact on maps.
Just watch out for radiation damage though. I think the wavelength
will have no impact whatsoever.

JPK

On Wed, Feb 15, 2012 at 4:23 PM, Seungil Han  wrote:

All,
I am curious to hear what our CCP4 community thoughts are
I have a marginally diffracting protein crystal (3-3.5 Angstrom resolution)
and would like to squeeze in a few tenth of angstrom.
Given that I am working on crystal quality improvement, would different
wavelengths make any difference in resolution, for example 0.9 vs. 1.0
Angstrom at synchrotron?
Thanks.
Seungil



Seungil Han, Ph.D.

Pfizer Inc.

Eastern Point Road, MS8118W-228

Groton, CT 06340

Tel: 860-686-1788,  Fax: 860-686-2095

Email: seungil@pfizer.com





--
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
email: j-kell...@northwestern.edu
***





Re: [ccp4bb] choice of wavelength

2012-02-16 Thread Bart Hazes

Hi Andrew,

I completely agree and it is what I meant by "(assuming the full photon 
energy is captured)". If the fraction of photons counted goes up at 
longer wavelengths then the relative benefit of using longer wavelengths 
is even more pronounced on the Pilatus. So for native data sets the 
wavelength sweet spot with a Pilatus detector may be a bit longer than 
what used to be optimal for a given beamline on a previous-generation 
detector.


Bart

On 12-02-16 02:09 AM, A Leslie wrote:


On 15 Feb 2012, at 23:55, Bart Hazes wrote:

Diffracted intensity goes up by the  cube of the wavelength, but so 
does absorption and I don't know exactly about radiation damage. One 
interesting point is that on image plate and CCD detectors the signal 
is also proportional to photon energy, so doubling the wavelength 
gives 8 times diffraction intensity, but only 4 times the signal on 
integrating detectors (assuming the full photon energy is captured). 
So it would be interesting to see how the equation works out on the 
new counting detectors where the signal does not depend on photon 
energy.



You make a good point about the variation in efficiency of the 
detectors, but I don't think your comment about the "new counting 
detectors" (assuming this refers to hybrid pixel detectors) is 
correct. The efficiency of the Pilatus detector, for example, falls 
off significantly at higher energies simply because the photons are 
not absorbed by the silicon (320  microns thick). The DQE for the 
Pilatus is quoted as 80% at 12KeV but only 50% at 16KeV and I think 
this variation is entirely (or at least mainly) due to the efficiency 
of absorption by the silicon.


Andrew



Another point to take into account is that beamlines can have 
different optimal wavelength ranges. Typically, your beamline guy/gal 
should be the one to ask. Maybe James Holton will chime in on this.


Bart

On 12-02-15 04:21 PM, Jacob Keller wrote:

Well, but there is more scattering with lower energy as well. The
salient parameter should probably be scattering per damage. I remember
reading some systematic studies a while back in which wavelength
choice ended up being insignificant, but perhaps there is more info
now, or perhaps I am remembering wrong?

Jacob

On Wed, Feb 15, 2012 at 5:14 PM, Bosch, Juergen  
wrote:
No impact ? Longer wavelength more absorption more damage. But 
between the choices given no problem.
Spread of spots might be better with 1.0 versus 0.9 but that 
depends on your cell and also how big your detector is. Given your 
current resolution none of the mentioned issues are deal breakers.


Jürgen

..
Jürgen Bosch
Johns Hopkins Bloomberg School of Public Health
Department of Biochemistry&  Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Phone: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-3655
http://web.mac.com/bosch_lab/

On Feb 15, 2012, at 18:08, "Jacob 
Keller"  wrote:



I would say the better practice would be to collect higher
multiplicity/completeness, which should have a great impact on maps.
Just watch out for radiation damage though. I think the wavelength
will have no impact whatsoever.

JPK

On Wed, Feb 15, 2012 at 4:23 PM, Seungil Han  
wrote:

All,
I am curious to hear what our CCP4 community thoughts are
I have a marginally diffracting protein crystal (3-3.5 Angstrom 
resolution)

and would like to squeeze in a few tenth of angstrom.
Given that I am working on crystal quality improvement, would 
different
wavelengths make any difference in resolution, for example 0.9 
vs. 1.0

Angstrom at synchrotron?
Thanks.
Seungil



Seungil Han, Ph.D.

Pfizer Inc.

Eastern Point Road, MS8118W-228

Groton, CT 06340

Tel: 860-686-1788,  Fax: 860-686-2095

Email: seungil@pfizer.com





--
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
email: j-kell...@northwestern.edu
***







Re: [ccp4bb] Modified Cys by TCEP?

2012-02-21 Thread Bart Hazes

On 12-02-21 10:11 AM, Björn Kauppi wrote:

Hi all,

I recently encountered a modified surface cysteine residue in one of my 
structures. The protein is expressed in E. coli and my data is to 1.7Å so 
I am positive of the number of extra atoms. It really looks like one of 
the tails (the propanoic acid) of a TCEP molecule. TCEP was added during 
purification at 2 mM. I tried to look in the PDB (or google) but I could 
not find any reference that TCEP could do this. The modified cysteine is 
involved in a nice crystal contact (salt bridge) with a lysine, thereby 
stabilizing it further. Looking back at older structures of the same 
protein, I see the same modification of this Cys, but not as pronounced 
since they were done at lower resolution.

Does anyone recognize this behavior of TCEP? Or, any other ideas?


Björn Kauppi
Structure and Design
Karo Bio AB





Did you use cacodylate as the buffer? TCEP may have reduced some of the 
dimethylarsenate, which is really what cacodylate is, to 
dimethylarsenite, which can react with cysteine residues. It works best 
if the cysteine has a depressed pKa, as it is reactive in the thiolate 
form, and the proximity to the lysine you mention might do just that.


There are several structures with arsenylated cysteine, the one we came 
across was a poxviral glutaredoxin (2HZE, Bacik & Hazes)


Bart


Re: [ccp4bb] Disulfide bonds

2012-03-04 Thread Bart Hazes

Hi Fred,

The SSBOND server has indeed been moved as we have relocated to a new 
building. SSBOND is still available at the new server address: 
http://hazeslab.med.ualberta.ca/forms/ssbond.html


This was my first ever program,with help from Bauke Dijkstra, and I was 
pleasantly surprised how many messages I got after the server 
relocation. SSBOND will soon get some competition for most used service 
as I am about to release some bioinformatics services.


Bart

On 12-03-04 02:36 AM, Frederic VELLIEUX wrote:

I'd google for "Bart Hazes" and "SSBOND" myself. There is (or was) a 
server, and the publication is Prot. Eng. 1988, 2, 119-125 (PMID 3244694). 
The server seems to be down or has moved.

HTH, Fred.


Message du 04/03/12 05:07
De : "Naveed A Nadvi"
A : CCP4BB@JISCMAIL.AC.UK
Copie à :
Objet : [ccp4bb] Disulfide bonds

Hello everyone,

I was wondering if there is any information available regarding the range of 
Ca-to-Ca distances between two cysteine residues forming disulfide bonds. Is 
there any software available for analysing the PDB for this kind of 
information? Some old textbooks suggest a distance of 4.4-6.8 A. I would very 
much appreciate any comments or suggestions you may have.

Regards,

Naveed

Faculty of Pharmacy,
The University of Sydney



Re: [ccp4bb] Expanding p4212 coordinates to p1

2012-04-12 Thread Bart Hazes

I am confused by the discussion in this message. Although it says plane
group I assume it really is a normal 3D tetragonal space group, P42(1)2.

So Eleanor's suggestion should work, and the sftools expand command will
do the job as well.

Bart

On 12-04-12 05:03 AM, Eleanor Dodson wrote:

Apologies - I didn't notice the plane group.. Although I think if the
sym ops are correctly listed in the P4121 file they will just be
applied as given..
Eleanor

On 12 April 2012 11:58, Ian Tickle wrote:

I didn't realise the CCP4 suite could handle the plane groups: where
are they listed (they're not in symop.lib or syminfo.lib)? Or are they
doing some clever projections of the space groups?

Cheers

-- Ian

On 12 April 2012 11:48, Eleanor Dodson wrote:
> Cad will do this correctly
> Reflection utilities - merge mtz (rather confusing job title - sorry..)
> mtz in - the P4121.mtz data
> Output - P41212-ext.mtz
>
> Define mtz output
>
> Select define limit for reflection by Laue code, select P1
>
> then you will get a list of all P1 reflections with phases correctly
> modified.
>
> If you really want to work in P1 you will need to change the space
> group in P41212-ext.mtz
> Easiest way is
> mtzutils hklin1 P4121-ext.mtz hklout P41212-ext-symP1.mtz
> SYMM P1
> end
>
> DON'T change the SG of P41212.mtz before doing the extension - the
> program needs the P41212 sym ops to change the phases correctly..
>
> If you want the coordinates extended too, use pdbset to get a P1 set
> of cds.. (Coordinate utilities - Edit pdb - select pdbset and
> "generate chains by symmetry operators")
> Eleanor
>
> On 12 April 2012 06:51, Henning Stahlberg wrote:
>>
>> Hi Everybody,
>>
>> I am working in a P4212 plane group, which is a p4 symmetric
>> structure with a screw axis in addition. My map has in the unit cell
>> two 4-fold symmetric structures, one of which is upside-down with
>> respect to the other one. If using expand in sftools, then the
>> resulting map is p4 symmetrized in the center between two
>> p4-symmetric structures, which is wrong. This is probably the phase
>> origin problem referred to here:
>> http://www.ccp4.ac.uk/dist/html/sftools.html#expand
>>
>> Would anybody have a suggestion how I can expand the p4212 MTZ file
>> to full p1, but with properly respecting the symmetry phase origin?
>>
>> My illiterate guess would be to either
>> 1) shift the phase origin in the p4212 MTZ file by (180.0;0.0); then
>> use sftools with "expand 1" to create the full reflection sets in p1;
>> then shift back by (180.0;0.0);   or
>> 2) use sftools with the command "expand 1", and then modify (invert?)
>> the phases of the reflections in the quadrants that have wrong phases.
>>
>> In option 1) I am not sure if it is possible to shift the phase
>> origin while staying in p4212 symmetry.  How could I do that?
>> In option 2) I would probably get into deep water the next time I
>> work with another symmetry, like p6212, or p2121, where I would have
>> to deal not with quadrants but with triangles of 60deg angle?
>>
>> Any suggestions would be highly appreciated.
>>
>> All the best,
>>
>> Henning.

Re: [ccp4bb] Is there any easy to convert a colume in mtz file (say fom) into a fixed value?

2010-06-21 Thread Bart Hazes
There are probably multiple ways to do this. sftools makes this very 
easy. From the command line type


sftools
read input.mtz
calc col 6 = 0.9
write output.mtz
stop

Just change the file names, column number and the fixed value to 
whatever you need.


Bart

On 10-06-21 03:23 PM, Hailiang Zhang wrote:

Hi there:

Is there any easy way to convert a column in an mtz file (say fom) into a
fixed value? I tried to convert to ascii first, but mtz2various only takes
a single FP column (unfortunately I have 2). Thanks!

Hailiang

   


--



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] Disulfide Designer Program

2010-08-23 Thread Bart Hazes

Hi Jacob,

Give SSBOND a try at http://129.128.24.248/forms/ssbond.html

Just upload your pdb and it will spit out pairs of residues that, if 
mutated to cysteine, are able to form disulfides. For multi-domain 
proteins I recommend changing residue numbers to 1-999 for chain A, 
1001-1999 for chain B, etc., since the chain labels are not printed out.
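
As a crude stand-in (this is not SSBOND's algorithm, just a naive
pre-filter; the file name is hypothetical and the 4.4-6.8 A CA-CA window
is the textbook range quoted in the disulfide-bond thread above):

from itertools import combinations
from math import dist   # Python 3.8+

def ca_atoms(pdb_path):
    """Very naive PDB reader: (chain, resseq, resname) -> CA coordinates."""
    cas = {}
    with open(pdb_path) as fh:
        for line in fh:
            if line.startswith("ATOM") and line[12:16].strip() == "CA":
                key = (line[21], int(line[22:26]), line[17:20].strip())
                cas[key] = (float(line[30:38]), float(line[38:46]),
                            float(line[46:54]))
    return cas

def disulfide_candidates(pdb_path, lo=4.4, hi=6.8):
    """Residue pairs whose CA-CA distance falls in the given window."""
    cas = ca_atoms(pdb_path)
    return [(a, b, round(dist(xa, xb), 2))
            for (a, xa), (b, xb) in combinations(cas.items(), 2)
            if lo <= dist(xa, xb) <= hi]

# for pair in disulfide_candidates("model.pdb"): print(pair)

SSBOND itself does a proper geometric search for buildable disulfides
rather than relying on a single distance cutoff.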


Bart

On 10-08-23 04:41 PM, Jacob Keller wrote:

Dear Crystallographers,

I remember having heard of a program which takes a given oligomeric 
assembly, and suggests optimum disulfides to stablize the complex. Can 
someone refresh my memory which program that is, and where it is 
available?


Best Regards,

Jacob Keller

***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-kell...@northwestern.edu
***



--



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] Sftools can not handle non-standard settings?

2010-09-01 Thread Bart Hazes

Hi Herman,

As a former biomol user you might have guessed why. SFTOOLS found its 
origin as a transitional program helping the Groningen group move to the 
CCP4 mtz format. Since the Groningen MDF and CCP4 mtz had different 
ideas about space group symmetry and, especially, asymmetric unit 
definitions, SFTOOLS needed to handle both. Since the biomol space group 
routine was basically a very large spaghetti of nested if-then-elses to 
accommodate all the peculiar choices, I chose to reimplement it using a 
simple set of symmetry generators and a matrix to define the asymmetric 
unit.


Since there is no longer a need to support MDF, sftools could switch to 
the ccp4 library, but my code is used for many other things: determining 
whether a reflection is (a)centric, lies on a symmetry axis, or should 
be systematically absent, its expected intensity, conversion to the 
standard asymmetric unit, etc. So this will be a major undertaking. 
Alternatively, you can create a list of symmetry generators and add 
space groups as Claus has apparently already done.


Bart

On 10-09-01 05:39 AM, herman.schreu...@sanofi-aventis.com wrote:

Dear Claus,

Thank you very much for this patch. We will install it, and I hope CCP4
will install it quickly as well ;-). Still I do not understand why
sftools has all symmetry operations hardcoded, while most other programs
use the CCP4 libraries. In that way, sftools would always be up to date
and would not need to be patched.

Best,
Herman

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of
Claus Flensburg
Sent: Wednesday, September 01, 2010 1:18 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Sftools can not handle non-standard settings?

Dear Herman,

please find attached a patch for sftools that will add support for the
following non-standard space group settings:

A2, C21, I21, P2122, P2212, P21221, P22121

Note: the number for I21 follows the upcoming change to
syminfo.lib: 3004 ->  5005.

diffstat -p0<
CCP4-20091104-src-sftools_-sftools.f-Add-some-non-standard-spgrps-v1.pat
ch
  src/sftools_/sftools.f |   52
-
  1 files changed, 35 insertions(+), 17 deletions(-)

The patch applies equally well to series-6_1 and trunk.


Regards,

ClAuS

P.S. After applying the patch and compiling sftools, you can use it in
BUSTER with this command line option:

% refine autoBUSTER_Exe_sftools=/path/to/patched-sftools/sftools ...

On Wed, Sep 01, 2010 at 12:29:55PM +0200,
herman.schreu...@sanofi-aventis.com wrote:
   

Dear CCP4,

In our automated data processing and refinement pipeline, phaser
sometimes comes up with solutions in non-standard settings (e.g. P 21
2 21). These solutions subsequently fail in autobuster and it turned
out that this is because autobuster invokes sftools and sftools
apparently is not able to handle non-standard settings.

I am really puzzled. We upgraded to the latest CCP4 version (6.1.13)
and the symmetry libraries have P 21 2 21 (space group 2018) in them.
Other programs like reindex and coot handle this setting without any
 

problems.
   

Is sftools still supported by ccp4, or should we ask the buster people
 
   

to switch to some other program?

Thank you for your help,
Herman Schreuder
 
   


--



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] How to automatically answer "NO" to SFTOOLS reading in a shell script?

2010-09-02 Thread Bart Hazes

Hi Hailiang

sftools was first and foremost designed to be used interactively; that 
is why it tends to follow a question and answer user interface. Of 
course you can use sftools in scripts, but if it pops up a question that 
was not anticipated, the script will crash. There could be a batch mode 
where you give sftools permission to make a best guess, but guessing can 
be dangerous.


In your particular case, XPLOR uses a flag=1 for rfree reflections and 0 
for working set reflections, while CCP4 MTZ by default uses a flag=0 for 
rfree reflections and non-zero for working set reflections. If sftools 
encounters an mtz that appears to use the XPLOR definition it gives a 
warning and suggests converting to the CCP4 definition.
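
A toy pure-Python illustration of the two conventions (the flag values
are made up; in practice you would let sftools do the conversion):

def xplor_to_ccp4(flags, n_bins=20):
    """Map XPLOR-style flags (1 = free, 0 = working) onto CCP4-style
    FreeR_flag values (0 = free, nonzero = working), cycling the working
    reflections over 1..n_bins-1."""
    out, work = [], 0
    for f in flags:
        if f == 1:                       # XPLOR free reflection
            out.append(0)                # CCP4 free reflection
        else:
            out.append(1 + work % (n_bins - 1))
            work += 1
    return out

print(xplor_to_ccp4([0, 1, 0, 0, 1, 0]))   # -> [1, 0, 2, 3, 0, 4]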


For proper MTZ files this should not happen and if it does happen maybe 
you should find out why.


Bart

On 10-09-02 02:00 PM, Hailiang Zhang wrote:

Hi,

I am reading a ccp4 mtz file using SFTOOLS. It asked me "Is this an XPLOR
RFREE flag column?". First I assume the answer is NO, since the input is a
CCP4 mtz file although the column is for free-flags. Then, I am wondering
how to automatically answer "NO" in a shell script.

Thanks!

Best Regards, Hailiang

   


--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] embarrassingly simple MAD phasing question (another)

2010-10-14 Thread Bart Hazes
e near field can give you a feeling for exactly
what a Fourier transform "looks like".  That is, not just the before-
and after- photos, but the "during".  It is also a very pretty movie,
which I have placed here:

http://bl831.als.lbl.gov/~jamesh/nearBragg/near2far.html

-James Holton
MAD Scientist

On 10/13/2010 7:42 PM, Jacob Keller wrote:

So let's say I am back in the good old days before computers,
hand-calculating the MIR phase of my first reflection--would I just
set that phase to zero, and go from there, i.e. that wave will
define/emanate from the origin? And why should I choose f000 over f010
or whatever else? Since I have no access to f000 experimentally, isn't
it strange to define its phase as 0 rather than some other reflection?

JPK

On Wed, Oct 13, 2010 at 7:27 PM, Lijun Liu  wrote:

When talking about the reflection phase:

While we are on embarrassingly simple questions, I have wondered for
a long
time what is the reference phase for reflections? I.e. a given phase
of say
45deg is 45deg relative to what?

=
Relative to a defined 0.

Is it the centrosymmetric phases?

=
Yes.  It is that of F(000).

Or a  theoretical wave from the origin?

=
No, it is a real one, detectable but not measurable.
Lijun


Jacob Keller

- Original Message -
From: "William Scott"
To:
Sent: Wednesday, October 13, 2010 3:58 PM
Subject: [ccp4bb] Summary : [ccp4bb] embarrassingly simple MAD 
phasing

question


Thanks for the overwhelming response.  I think I probably didn't
phrase the
question quite right, but I pieced together an answer to the 
question I

wanted to ask, which hopefully is right.


On Oct 13, 2010, at 1:14 PM, SHEPARD William wrote:

It is very simple, the structure factor for the anomalous 
scatterer is


FA = FN + F'A + iF"A (vector addition)

The vector F"A is by definition always +i (90 degrees anti-clockwise)
with

respect to the vector FN (normal scattering), and it represents the
phase

lag in the scattered wave.



So I guess I should have started by saying I knew f'' was 
imaginary, the

absorption term, and always needs to be 90 degrees in phase ahead of
the f'
(dispersive component).

So here is what I think the answer to my question is, if I understood
everyone correctly:

Starting with what everyone I guess thought I was asking,

FA = FN + F'A + iF"A (vector addition)

for an absorbing atom at the origin, FN (the standard atomic 
scattering

factor component) is purely real, and the f' dispersive term is
purely real,
and the f" absorption term is purely imaginary (and 90 degrees 
ahead).


Displacement from the origin rotates the resultant vector FA in the
complex
plane.  That implies each component in the vector summation is
rotated by
that same phase angle, since their magnitudes aren't changed from
displacement from the origin, and F" must still be perpendicular 
to F'.
Hence the absorption term F" is no longer pointed in the imaginary 
axis

direction.

Put slightly differently, the fundamental requirement is that the
positive
90 degree angle between f' and f" must always be maintained, but 
their

absolute orientations are only enforced for atoms at the origin.

Please correct me if this is wrong.

Also, since F" then has a projection upon the real axis, it now has a
real
component (and I guess this is also an explanation for why you 
don't get

this with centrosymmetric structures).

Thanks again for everyone's help.

-- Bill




William G. Scott
Professor
Department of Chemistry and Biochemistry
and The Center for the Molecular Biology of RNA
228 Sinsheimer Laboratories
University of California at Santa Cruz
Santa Cruz, California 95064
USA

phone:  +1-831-459-5367 (office)
 +1-831-459-5292 (lab)
fax:+1-831-4593139  (fax) =


***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-kell...@northwestern.edu
***

Lijun Liu
Cardiovascular Research Institute
University of California, San Francisco
1700 4th Street, Box 2532
San Francisco, CA 94158
Phone: (415)514-2836






***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-kell...@northwestern.edu
***



--



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] [QUAR] Re: [ccp4bb] embarrassingly simple MAD phasing question (another)

2010-10-14 Thread Bart Hazes






On 10-10-14 01:34 PM, Ethan Merritt wrote:

  ...
  
  
The contribution from normal scattering, f0, is strong at low resolution
but becomes weaker as the scattering angle increases.
The contribution from anomalous scattering, f' + f",  is constant at
all scattering angles.   

...

My simple/simplistic mental picture for this is that electrons form a cloud surrounding the atom's nucleus. The larger the diameter of the cloud the
more strongly the atomic scattering factor decreases with resolution (just
like increased B-factors spread out the electrons and reduce scattering).

Anomalous scattering is based on the inner electron orbitals that are much closer to the nucleus and thus their scattering declines more slowly with resolution. By this reasoning f' and f" would still decline with resolution but perhaps the difference is so substantial that within the resolution ranges we work with they can be considered constant.

By the same reasoning you'd expect neutron diffraction to have scattering factors that are for all practical purposes independent of resolution, assuming b-factors of zero.

In addition, the different fall-off in the scattering factors for f0 and f' or f" will be much less noticeable for anomalous scatterers with high B-values, where the latter dominates the 3D distribution of the electrons.
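
A toy numerical version of that picture (the Gaussian weights and widths
below are invented for illustration, not real form factors):

import numpy as np

# Approximate an electron cloud as a Gaussian, so its scattering factor is
# also a Gaussian in s = sin(theta)/lambda. A broad cloud (large b) falls
# off quickly with s; a compact core (small b) is nearly flat over a
# typical resolution range. The a and b values are made up.
s = np.linspace(0.0, 0.6, 7)                 # 1/Angstrom; s = 0.6 is d ~ 0.83 A
f_valence = 10.0 * np.exp(-20.0 * s**2)      # diffuse outer electrons
f_core    =  4.0 * np.exp(-0.5  * s**2)      # tight inner-shell electrons

for si, fv, fc in zip(s, f_valence, f_core):
    print(f"s = {si:4.2f}   f_valence = {fv:6.3f}   f_core = {fc:6.3f}")
# By s ~ 0.4 the valence term has dropped to a few percent of its zero-angle
# value while the core term has lost only ~8%, mimicking why f' and f" are
# treated as roughly resolution independent.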

Bart

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521







Re: [ccp4bb] quantum diffraction

2010-10-15 Thread Bart Hazes
The photon moves through the crystal in finite time and most of the time 
it keeps going without interacting with the crystal, i.e. no 
diffraction. However, if diffraction occurs it is instantaneous, or at 
least so fast as to consider it instantaneous. In some cases a 
diffracted photon diffracts another time while passing through the 
remainder of the crystal. Or in Ruppian terms, a poof-pop-poof-pop 
event. If you listen carefully you may be able to hear it.


Bart

On 10-10-15 12:43 PM, Jacob Keller wrote:

>but yes, each "photon" really does interact with
EVERY ELECTRON IN THE CRYSTAL at once.


A minor point: the interaction is not really "at once," is it? The 
photon does have to move through the crystal over a finite time.


JPK


--

========

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] quantum diffraction

2010-10-15 Thread Bart Hazes

On 10-10-15 10:37 AM, James Holton wrote:

...

  In fact, anyone with a Pilatus detector (and a lot of extra beam 
time) can verify the self-interference of photons in macromolecular 
crystal diffraction.  Since the source-to-detector distance of a 
typical MX beamline is about 30 m, it takes 100 nanoseconds for a 
"photon" generated in the storage ring to fly down the beam pipe, do 
whatever it is going to do in the crystal, and then (perhaps) 
increment a pixel on the detector.  So, as long as you keep the time 
between photons much greater than 100 nanoseconds you can be fairly 
confident that there is never more than one photon anywhere in the 
beamline at a given instant.


...
Does the length of the beamline really matter? As long as the photons 
are spaced apart more than the coherence length (several 1000 A to 
several 10um on a synchrotron beamline according to Bernard's post) they 
should be considered independent events. So the photon rate can probably 
be 5 to 6 orders of magnitude higher while still doing "single photon 
diffraction" experiments.
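
A back-of-the-envelope version of that estimate, taking the 10 um upper end
of the quoted coherence length:

# Compare the ~100 ns photon spacing implied by the 30 m flight path with
# the spacing implied by the coherence length.
c = 3.0e8                               # speed of light, m/s
flight_time = 30.0 / c                  # ~1e-7 s for a 30 m beamline
coherence_length = 10e-6                # 10 micrometres
coherence_time = coherence_length / c   # ~3e-14 s

print(f"flight time    : {flight_time:.1e} s")
print(f"coherence time : {coherence_time:.1e} s")
print(f"ratio          : {flight_time / coherence_time:.0e}")
# ratio ~ 3e6: the photon rate could be about six orders of magnitude higher
# than one per 100 ns and successive photons would still be separated by
# more than one coherence length.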


Bart

--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] quantum diffraction

2010-10-15 Thread Bart Hazes

On 10-10-15 02:14 PM, Dale Tronrud wrote:

...
The photon both diffracts and doesn't diffract as it passes through
the crystal and it diffracts into all the directions that match the Bragg
condition.  The wave function doesn't collapse to a single outcome until
the detector measures something - which in the scheme of things occurs
long after the photon left the crystal.

...

   

and

On 10-10-15 02:07 PM, Bryan Lepore wrote:

btw, buckyballs have measurable wave properties. i think they are trying virus 
particles now.
   


That reminds me that politicians also have wave properties

photons interact with electrons
their diffraction leads to interference
for most angles the results cancel out
when they are not on a common wavelength you get Laue diffraction
there is no single outcome until the detector measures something

politicons interact with the electorate
their diffrent fractions lead to interference
on most angles the results cancel out
when they are not on a common wavelength you get loud distraction
there is no single outcome until the polls measure something

Bart

--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] refining structures with engineered disulfide bonds

2010-10-20 Thread Bart Hazes




SSBOND actually predicts pairs of residues that, if mutated to
cysteine, can form a stable disulfide, but does so from first principles,
not with a database of preferred conformations. (The server is at
http://129.128.24.248/forms/ssbond.html)

In this case the 2.7 Angstrom S-S distance is clearly too long for a
disulfide (~2.05 A) and too close for a VdW interaction between thiols.
There are papers on radiation damage that discuss different
configurations that have been observed including their likely chemical
nature. In similar cases we have used SHELX in the past to refine
occupancy for the different conformations, ensuring the bonded and
non-bonded sulfurs and Cbetas in the two cysteines have the
same occupancy and the sum for both conformations adds up to 1. There
have been several CCP4bb discussion on how to define disulfides in
REFMAC but this remains somewhat tricky (at least the last time we
needed it) and it does not refine occupancy.

Bart

On 10-10-20 03:34 PM, Frederic VELLIEUX wrote:
Bart Hazes wrote a program (and published as well, Hazes
& Dijkstra perhaps) called SSBOND I think. I cannot remember
exactly what the computer program does, but it certainly has a data
base of "possible" disulphide bond conformers. Hence I would myself
certainly check your second disulphide to see if one of these
conformers would explain the density better. As long as the model is
not into its proper conformation (or conformations if there are
several) then the density will not be optimal. There may be SS bond
conformer libraries in graphics programs, I do not know (never needed
that myself).
  
Fred.
  >
Message of 20/10/10 21:41
> From: "Seema Mittal" 
> To: CCP4BB@JISCMAIL.AC.UK
> Cc: 
> Subject: [ccp4bb] refining structures with engineered disulfide
bonds
> 
> Hi All,

> 
I have engineered an
intra-molecular disulfide bond in my protein monomer. The protein
functions as a homodimer. 

> 
 In the crystal structure, there
is clean electron density for the S-S bond in one monomer (bond length
2.2 A), but there seems to be slightly messy density for the same bond in the
other monomer (S-S bond length of 2.7 A). An alternate conformation
of one of the cysteines seems plausible on the messy side. There is
considerable negative density associated with this region in both
monomers, more so on the messy side.

> 
My question is :  do i need
to select additional parameters or define any sort of constraints
during refinement, in order to refine this introduced covalent bond?

> 
Thanks much for your help.

> 

> 
Best
Seema Mittal
Graduate Research Assistant
Department of Biochemistry & Molecular
Pharmacology,
> 970L
Lazare Research Building,
> University
of Massachusetts Medical School,
> 364
Plantation Street,
> Worcester,
MA 01605

  


-- 

========

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521







Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Bart Hazes
blem might be modelling solvent?
>> Alternatively/additionally, I wonder whether there also might
be more
>> variability molecule-to-molecule in proteins, which we may not
model well
>> either.
>>
>> JPK
>>
>> - Original Message - From: "George M. Sheldrick"
>> <gshe...@shelx.uni-ac.gwdg.de>
>> To: <CCP4BB@JISCMAIL.AC.UK>
>> Sent: Thursday, October 28, 2010 4:05 AM
>> Subject: Re: [ccp4bb] Against Method (R)
>>
>>
>> > It is instructive to look at what happens for small
molecules where
>> > there is often no solvent to worry about. They are often
refined
>> > using SHELXL, which does indeed print out the weighted
R-value based
>> > on intensities (wR2), the conventional unweighted R-value
R1 (based
>> > on F) and <sigma(I)>/<I>, which it calls
R(sigma). For well-behaved
>> > crystals R1 is in the range 1-5% and R(merge) (based on
intensities)
>> > is in the range 3-9%. As you suggest, 0.5*R(sigma) could
be regarded
>> > as the lower attainable limit for R1 and this is indeed
the case in
>> > practice (the factor 0.5 approximately converts from I to
F). Rpim
>> > gives similar results to R(sigma), both attempt to
measure the
>> > precision of the MERGED data, which are what one is
refining against.
>> >
>> > George
>> >
>> > Prof. George M. Sheldrick FRS
>> > Dept. Structural Chemistry,
>> > University of Goettingen,
>> > Tammannstr. 4,
>> > D37077 Goettingen, Germany
>> > Tel. +49-551-39-3021 or -3068
>> > Fax. +49-551-39-22582
>> >
>> >
>> > On Wed, 27 Oct 2010, Ed Pozharski wrote:
>> >
>> > > On Tue, 2010-10-26 at 21:16 +0100, Frank von Delft
wrote:
>> > > > the errors in our measurements apparently have
no
>> > > > bearing whatsoever on the errors in our models
>> > >
>> > > This would mean there is no point trying to get
better crystals, right?
>> > > Or am I also wrong to assume that the dataset with
higher I/sigma in the
>> > > highest resolution shell will give me a better model?
>> > >
>> > > On a related point - why is Rmerge considered to be
the limiting value
>> > > for the R?  Isn't Rmerge a poorly defined measure
itself that
>> > > deteriorates at least in some circumstances (e.g.
increased redundancy)?
>> > > Specifically, shouldn't "ideal" R approximate
0.5*<sigma(I)>/<I>?
>> > >
>> > > Cheers,
>> > >
>> > > Ed.
>> > >
>> > >
>> > >
>> > > --
>> > > "I'd jump in myself, if I weren't so good at
whistling."
>> > >                                Julian, King of Lemurs
>> > >
>> > >
>>
>>
>> ***
>> Jacob Pearson Keller
>> Northwestern University
>> Medical Scientist Training Program
>> Dallos Laboratory
>> F. Searle 1-240
>> 2240 Campus Drive
>> Evanston IL 60208
>> lab: 847.491.2438
>> cel: 773.608.9185
>> email: j-kell...@northwestern.edu
>> ***
>>
>>
>


  
  
  


-- 



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521







Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Bart Hazes




Your second suggestion would be a good test because you are dealing
with data from the same crystal and can thus assume the structures are
identical (radiation damage excluded).
So, take a highly diffracting crystal and collect a short-exposure low
resolution data set and long exposure high resolution data set. Let's
say with I/Sig=2 at 2.0 and 1.2 high-resolution shells. Give the data
to two equally capable students to determine the structure by molecular
replacement from a, let's say 30% sequence identity starting model. You
could also use automated model building to be more objective and avoid
becoming unpopular with your students.

Proceed until each model is fully refined against its own data. Now run
some more refinement, without manual rebuilding, of the lowres model
versus the highres data (and perhaps some rigid body or other minimal
refinement of the highres model versus the lowres data, make sure R
& Rfree go down). I predict the highres model will fit the lowres
data noticeably better than the lowres model did and the lowres model,
even after refinement with the highres data, will not reach the same
quality as the highres model. Looking at Fo-Fc maps in the latter case
may give some hints as to which model errors were not recognized at 2A
resolution. You'll probably find peptide flips, mis-modeled leucine and
other side chains, dual conformations not recognized at 2A resolution,
more realistic B values, more waters ...

Bart

On 10-10-28 03:49 PM, Jacob Keller wrote:

  
  
  
  So let's say I take a 0.6 Ang
structure, artificially introduce noise into corresponding Fobs to make
the resolution go down to 2 Ang, and refine using the 0.6 Ang model--do
I actually get R's better than the artificially-inflated sigmas? Or
let's say I experimentally decrease I/sigma by attenuating the beam and
collect another data set--same situation?
   
  JPK
  
 
-
Original Message - 
From:
Bart Hazes 
To:
CCP4BB@JISCMAIL.AC.UK 
Sent:
Thursday, October 28, 2010 4:13 PM
Subject:
Re: [ccp4bb] Against Method (R)


There are many cases where people use a structure refined at high
resolution as a starting molecular replacement structure for a closely
related/same protein with a lower resolution data set and get
substantially better R statistics than you would expect for that
resolution. So one factor in the "R factor gap" is many small errors
that are introduced during model building and not recognized and fixed
later due to limited resolution. In a perfect world, refinement would
find the global minimum but in practice all these little errors get
stuck in local minima with distortions in neighboring atoms
compensating for the initial error and thereby hiding their existence.

Bart

On 10-10-28 11:33 AM, James Holton wrote:
It is important to remember that if you have
Gaussian-distributed errors and you plot error bars between +1 sigma
and -1 sigma (where "sigma" is the rms error), then you expect the
"right" curve to miss the error bars about 30% of the time.  This is
just a property of the Gaussian distribution: you expect a certain
small number of the errors to be large.  If the curve passes within the
bounds of every single one of your error bars, then your error
estimates are either too big, or the errors have a non-Gaussian
distribution.  
  
For example, if the noise in the data somehow had a uniform
distribution (always between +1 and -1), then no data point will ever
be "kicked" further than "1" away from the "right" curve.  In this
case, a data point more than "1" away from the curve is evidence that
you either have the wrong model (curve), or there is some other kind of
noise around (wrong "error model").
  
As someone who has spent a lot of time looking into how we measure
intensities, I think I can say with some considerable amount of
confidence that we are doing a pretty good job of estimating the
errors.  At least, they are certainly not off by an average of 40% (20%
in F).  You could do better than that estimating the intensities by eye!
  
Everybody seems to have their own favorite explanation for what I call
the "R factor gap": solvent, multi-confomer structures, absorption
effects, etc.  However, if you go through the literature (old and new)
you will find countless attempts to include more sophisticated versions
of each of these hypothetically "important" systematic errors, and in
none of these cases has anyone ever presented a physically reasonable
model that explained the observed spot intensities from a protein
crystal to within experimental error.  Or at least, if there is such a
paper, I haven't seen it.
  
Since there are so many possible things to "correct", what I would like
to find is a structure that represents the transition between the
"small molecul

Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Bart Hazes

On 10-10-28 04:09 PM, Ethan Merritt wrote:

This I can answer based on experience.  One can take the coordinates from a 
structure
refined at near atomic resolution (~1.0A), including multiple conformations,
partial occupancy waters, etc, and use it to calculate R factors against a lower
resolution (say 2.5A) data set collected from an isomorphous crystal.  The
R factors from this total-rigid-body replacement will be better than anything 
you
could get from refinement against the lower resolution data.  In fact, 
refinement
from this starting point will just make the R factors worse.

What this tells us is that the crystallographic residuals can recognize a
better model when they see one. But our refinement programs are not good
enough to produce such a better model in the first place. Worse, they are not
even good enough to avoid degrading the model.

That's essentially the same thing Bart said, perhaps a little more pessimistic 
:-)

cheers,

Ethan
   


Not pessimistic at all, just realistic and perhaps even optimistic for 
methods developers as apparently there is still quite a bit of progress 
that can be made by improving the "search strategy" during refinement.


During manual refinement I normally tell students not to bother about 
translating/rotating/torsioning atoms by just a tiny bit to make it fit 
better. Likewise there is no point in moving atoms a little bit to 
correct a distorted bond or bond length. If it needed to move that 
little bit the refinement program would have done it for you. Look for 
discrete errors in the problematic residue or its neighbors: peptide 
flips, 120 degree changes in side chain dihedrals, etc. If you can find 
and fix one of those errors a lot of the stereochemical distortions and 
non-ideal fit to density surrounding that residue will suddenly 
disappear as well.


The benefit of high resolution is that it is much easier to pick up and 
fix such errors (or not make them in the first place)


Bart

--

========

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] Against Method (R)

2010-10-29 Thread Bart Hazes






On 10-10-29 12:03 AM, Robbie Joosten wrote:

  Hi
Bart,
  
I agree with the building strategy you propose, but at some point it
stops helping and a bit more attention to detail is needed. Reciprocal
space refinement doesn't seem to do the fine details. It always
surprises me how much atoms still move when you real-space refine a
refined model, especially the waters. I admit this is not a fair
comparison.

Does the water move back to its old position if you follow up the
real-space refinement with more reciprocal-space refinement? If so, the map
may not have been a true representation of reality. Basically, what I
was implying is that if the required model changes ("details") fall
within the radius of convergence, then the atoms should move to their
correct positions, unless something is keeping them from moving, such as
an incorrectly placed side chain that causes a steric conflict. Fix the
incorrect side chain and your "details" will take care of themselves. I
don't claim that I can always spot an easy error to fix, and I sometimes
end up rebuilding several different ways in the hope that one will
resolve whatever the problem was. If that doesn't happen, at some point
you need to give up, especially if it does not affect a functionally
important region. I do think it is good practice to point out regions in
the model that are problematic, and I have never had reviewers complain
about that if it is clear you made the effort to get the model as good as
possible given the data.
High resolution data helps, but better data makes it
tempting to put too little effort in optimising the model. I've seen
some horribly obvious errors in hi-res models (more than 10 sigma
difference density peaks for misplaced side chains). At the same time
there are quite a lot of low-res models that are exceptionally good.

Can't blame the data for that; in the end each person (and supervisor)
needs to take responsibility for the models they produce and deposit. The
same applies to sequence databases, which are full of lazy errors. If
humans are involved, both greatness and stupidity are likely outcomes.

Bart

Cheers,
Robbie
  
> Date: Thu, 28 Oct 2010 16:32:04 -0600
> From: bart.ha...@ualberta.ca
> Subject: Re: [ccp4bb] Against Method (R)
> To: CCP4BB@JISCMAIL.AC.UK
> 
> On 10-10-28 04:09 PM, Ethan Merritt wrote:
> > This I can answer based on experience. One can take the
coordinates from a structure
> > refined at near atomic resolution (~1.0A), including multiple
conformations,
> > partial occupancy waters, etc, and use it to calculate R
factors against a lower
> > resolution (say 2.5A) data set collected from an isomorphous
crystal. The
> > R factors from this total-rigid-body replacement will be
better than anything you
> > could get from refinement against the lower resolution data.
In fact, refinement
> > from this starting point will just make the R factors worse.
> >
> > What this tells us is that the crystallographic residuals can
recognize a
> > better model when they see one. But our refinement programs
are not good
> > enough to produce such a better model in the first place.
Worsr, they are not
> > even good enough to avoid degrading the model.
> >
> > That's essentially the same thing Bart said, perhaps a little
more pessimistic :-)
> >
> > cheers,
> >
> > Ethan
> > 
> 
> Not pessimistic at all, just realistic and perhaps even optimistic
for 
> methods developers as apparently there is still quite a bit of
progress 
> that can be made by improving the "search strategy" during
refinement.
> 
> During manual refinement I normally tell students not to bother
about 
> translating/rotating/torsioning atoms by just a tiny bit to make
it fit 
> better. Likewise there is no point in moving atoms a little bit to
  
> correct a distorted bond or bond length. If it needed to move that
  
> little bit the refinement program would have done it for you. Look
for 
> discreet errors in the problematic residue or its neighbors:
peptide 
> flips, 120 degree changes in side chain dihedrals, etc. If you can
find 
> and fix one of those errors a lot of the stereochemical
distortions and 
> non-ideal fit to density surrounding that residue will suddenly 
> disappear as well.
> 
> The benefit of high resolution is that it is much easier to pick
up and 
> fix such errors (or not make them in the first place)
> 
> Bart
> 
> -- 
> 
>

> 
> Bart Hazes (Associate Professor)
> Dept. of Medical Microbiology& Immunology
> University of Alberta
> 1-15 Medical Sciences Building
> Edmonton, Alberta
> Canada, T6G 2H7
> phone: 1-780-492-0042
> fax: 1-780-492-7521
> 
>
===

Re: [ccp4bb] Strange spots

2010-10-29 Thread Bart Hazes

Hi Dave,

The circles are quite prominent in the "inner circle (lune)" that you 
highlight but not in the next one. The full image is too small to see 
details but I don't see any clear circular halos for any of the other 
lunes. If you start with the lune going through the origin and number it 
zero, then the one with the halos is lune 7. Any chance that there is a 
pseudo translation with a periodicity of ~1/7th of the reciprocal vector 
perpendicular to the planes that form the lunes?


The pseudo-2D lattice with the halos suggests a 
hexagonal/trigonal/rhombohedral lattice and the halo radius is half the 
reciprocal unit cell length. Maybe you have some weird stacked 
rhombohedral packing where two or more rhombohedral cells intercalate, so 
that there are pure rhombohedral unit cell axes that apply to all atoms 
and some pseudo translations that relate an atom in one rhombohedral 
lattice to one in another stacked lattice. I don't know if that makes sense.


It would be interesting to see some images before and after this one. 
Are the halos circles in reciprocal space, or spheres that just look like 
circles on the oscillation image because the intersection of a sphere 
with the Ewald sphere looks like a circle?
On images where the 6th, 5th, etc. lune forms a 2D lattice, do you see the 
halos or is it only on the 7th?


As John said you may well have a collectors item but it would sure be 
nice to know what caused this even if structure determination is not in 
the cards.


Bart

On 10-10-29 10:08 AM, David Goldstone wrote:

Dear All,

Does anyone have any insight into what the circles around the spots 
might be?


cheers

Dave


--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] I/sigmaI of >3.0 rule

2011-03-03 Thread Bart Hazes
There seems to be an epidemic of papers with I/Sigma > 3 (sometimes much 
larger). In fact such cases have become so frequent that I fear some 
people are starting to believe that this is the proper procedure. I don't 
know where that has come from, as the I/Sigma ~ 2 criterion was 
established long ago and many consider even that a tad conservative. It 
simply pains me to see people going to the most advanced synchrotrons to 
boost their highest resolution data and then throw much of it away.


I don't know what has caused this wave of high I/Sigma threshold use but 
here are some ideas


- High I/Sigma cutoffs are normal for (S/M)AD data sets where a more 
strict focus on data quality is needed.

Perhaps some people have started to think this is the norm.

- For some datasets Rsym goes up strongly while I/SigI is still 
reasonable. I personally believe this is due to radiation damage, which 
affects Rsym (which compares reflections taken after different amounts 
of exposure) much more than I/SigI, which is based on individual 
reflections. A good test would be to see if processing only the first 
half of the dataset improves Rsym (or better, Rrim).


- Most detectors are square and if the detector is too far from the 
crystal then the highest resolution data falls beyond the edges of the 
detector. In this case one could, and should, still process data into 
the corners of the detector. Data completeness at higher resolution may 
suffer but each additional reflection still represents an extra 
restraint in refinement and a Fourier term in the map. Due to crystal 
symmetry the effect on completeness may even be less than expected.
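
To put a number on the corner argument (purely illustrative geometry; the
wavelength, detector size and distance below are made up):

import math

# Resolution reached at the edge versus the corner of a square detector.
wavelength = 1.0      # Angstrom
half_width = 105.0    # mm, half the edge length of the detector
distance   = 250.0    # mm, crystal-to-detector distance

def d_min(radius_mm):
    theta = 0.5 * math.atan(radius_mm / distance)
    return wavelength / (2.0 * math.sin(theta))

print(f"edge   : {d_min(half_width):.2f} A")
print(f"corner : {d_min(half_width * math.sqrt(2.0)):.2f} A")
# The corner lies sqrt(2) further from the beam centre than the edge, so it
# records noticeably higher resolution data (here ~1.9 A versus ~2.5 A).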


Bart


On 11-03-03 04:29 AM, Roberto Battistutta wrote:

Dear all,
I got a reviewer comment that indicates the "need to refine the structures at an appropriate 
resolution (I/sigmaI of >3.0), and re-submit the revised coordinate files to the PDB for 
validation.". In the manuscript I present some crystal structures determined by molecular 
replacement using the same protein in a different space group as the search model. Does anyone know the 
origin or the theoretical basis of this "I/sigmaI >3.0" rule for an appropriate resolution?
Thanks,
Bye,
Roberto.


Roberto Battistutta
Associate Professor
Department of Chemistry
University of Padua
via Marzolo 1, 35131 Padova - ITALY
tel. +39.049.8275265/67
fax. +39.049.8275239
roberto.battistu...@unipd.it
www.chimica.unipd.it/roberto.battistutta/
VIMM (Venetian Institute of Molecular Medicine)
via Orus 2, 35129 Padova - ITALY
tel. +39.049.7923236
fax +39.049.7923250
www.vimm.it



--

========

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] I/sigmaI of >3.0 rule

2011-03-03 Thread Bart Hazes
Higher redundancy lowers Rpim because it increases precision. However, 
it need not increase accuracy if the observations are not drawn from the 
"true" distribution. If pathological behaviour of the R-factor statistics 
is due to radiation damage, as I believe is often the case, we are 
combining observations that are no longer equivalent. If you used long 
exposures per image and collected just enough data for a complete data 
set you are out of luck. If you used shorter exposures and opted for a 
high-redundancy set then you have the option to toss out the last N 
images to get rid of the most damaged data, or you can try to compensate 
for the damage with zerodose, or whatever the name of the program was; I 
think it is from Wolfgang Kabsch.
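
For reference, the usual definitions are below, with n the number of
observations of unique reflection hkl and <I(hkl)> their mean; the
sqrt(1/(n-1)) factor is what makes Rpim fall with increasing redundancy,
while the sqrt(n/(n-1)) factor makes Rmeas essentially redundancy
independent:

\[
R_{\mathrm{merge}} = \frac{\sum_{hkl}\sum_{i=1}^{n}\bigl|I_i(hkl)-\langle I(hkl)\rangle\bigr|}{\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)},\qquad
R_{\mathrm{meas}} = \frac{\sum_{hkl}\sqrt{\tfrac{n}{n-1}}\sum_{i=1}^{n}\bigl|I_i(hkl)-\langle I(hkl)\rangle\bigr|}{\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)},\qquad
R_{\mathrm{pim}} = \frac{\sum_{hkl}\sqrt{\tfrac{1}{n-1}}\sum_{i=1}^{n}\bigl|I_i(hkl)-\langle I(hkl)\rangle\bigr|}{\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)}
\]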


Rejecting data is never desirable but I think it may be better than 
merging non-equivalent data that can't be properly modeled by a single 
structure.


Bart

On 11-03-03 12:34 PM, Bernhard Rupp (Hofkristallrat a.D.) wrote:

I don't like Rmeas either.

Given the Angst caused by actually useful redundancy, would it not be more
reasonable then to report Rpim which decreases with redundancy? Maybe Rpim
in an additional column would help to reduce the Angst?

BR

Maia

Tim Gruene wrote:

Hello Maia,

Rmerge is obsolete, so the reviewers had a good point to make you
publish Rmeas instead. Rmeas should replace Rmerge in my opinion.

The data statistics you sent show a multiplicity of about 20! Did you
check your data for radiation damage? That might explain why your
Rmeas is so utterly high while your I/sigI is still above 2 (You
should not cut your data but include
more!)

What do the statistics look like if you process just about enough
frames so that you get a reasonable multiplicity, 3-4, say?

Cheers, Tim

On Thu, Mar 03, 2011 at 10:57:37AM -0700, Maia Cherney wrote:


I see, there is no consensus about my data. Some people say 2.4A,
other say all. Well, I chose 2.3 A. My rule was to be a little bit
below Rmerg 100%. At 2.3A Rmerg was 98.7% Actually, I have published
my paper in JMB. Yes, reviewers did not like that and even made me
give Rrim and Rpim etc.

Maia



Bernhard Rupp (Hofkristallrat a.D.) wrote:


First of all I would ask a XDS expert for that because I don't know
exactly what stats the XDS program reports (shame on me, ok) nor
what the quality of your error model is, or what you want to use the
data for (I guess refinement - see Eleanor's response for that, and use

all data).

There is one point I'd like to make re cutoff: If one gets greedy
and collects too much noise in high resolution shells (like way
below <I/sigma(I)> =
0.8 or so) the scaling/integration may suffer from an overabundance
of nonsense data, and here I believe it makes sense to select a
higher cutoff (like what exactly?) and reprocess the data. Maybe one
of our data collection specialist should comment on that.

BR

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf
Of Maia Cherney
Sent: Thursday, March 03, 2011 9:13 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] I/sigmaI of>3.0 rule

I have to resend my statistics.

Maia Cherney wrote:


Dear Bernhard

I am wondering where I should cut my data off. Here is the
statistics from XDS processing.

Maia



On 11-03-03 04:29 AM, Roberto Battistutta wrote:


Dear all,
I got a reviewer comment that indicate the "need to refine the
structures


at an appropriate resolution (I/sigmaI of>3.0), and re-submit the
revised coordinate files to the PDB for validation.". In the
manuscript I present some crystal structures determined by
molecular replacement using the same protein in a different space
group as search model. Does anyone know the origin or the
theoretical basis of this "I/sigmaI>3.0" rule for an appropriate
resolution?


Thanks,
Bye,
Roberto.


Roberto Battistutta
Associate Professor
Department of Chemistry
University of Padua
via Marzolo 1, 35131 Padova - ITALY tel. +39.049.8275265/67 fax.
+39.049.8275239 roberto.battistu...@unipd.it
www.chimica.unipd.it/roberto.battistutta/
VIMM (Venetian Institute of Molecular Medicine) via Orus 2,
35129 Padova - ITALY tel. +39.049.7923236 fax
+39.049.7923250 www.vimm.it








--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] ssDNA self-aneal

2011-03-17 Thread Bart Hazes


  
  
Has anyone looked at the kinetics of DNA annealing? Especially for
such short fragments I expect hairpin formation times to be on the
order of picoseconds to nanoseconds. Of course it doesn't hurt to
slow-cool, but I wouldn't be too paranoid about it. Moreover, in this
particular case the hairpin is probably sufficiently unstable to
release and refold at a high rate at room temperature, so there is even
less need to slow-cool. A bigger concern may be the relative affinity of
the intra-molecular hairpin versus the inter-molecular "primer dimer".
The former benefits from a lower entropy loss, but the latter would
not need such a tight loop and possibly would form one additional
G-C base pair.

Bart



On 11-03-17 12:06 PM, Kevin Jude wrote:
I would bring up the DNA in TM buffer (10 mM Tris, 5
  mM MgCl2) or similar and anneal under dilute conditions to favor
  hairpin formation over dsDNA.  
  
  Fast cooling will also favor hairpin formation, so you may try
  heating to 95° and then cooling on ice, or using a short gradient
  on a thermocycler.  You can check your work on native PAGE
  (including appropriate controls, like oligos annealed under
  dsDNA-favoring conditions)
  
  On Thu, Mar 17, 2011 at 4:23 AM,
dengzq1987 <dengzq1...@gmail.com>
wrote:

  
Dear all,
 
Recently, I purchased an oligonucleotide from a
company. The sequence is TTGCGTAC GCAC GTACGC. I
want to perform a self-annealing process to form
the following secondary structure.
5' TTGCGTACGC
  ||| |||    ]
3'   CGCATGCA 
 
I hope that most of the oligonucleotide can form
this structure. How can I achieve this? Any advice is
appreciated.
 
 
 
best wishes.
  
  

  
  


-- 

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521



  



Re: [ccp4bb] program to calculate electron density at x,y,z

2011-04-01 Thread Bart Hazes

Hi Ed,

I wrote a short program named HYDENS that takes a PDB file and an H K L 
amplitude phase file for a full hemisphere of data. You can make the 
latter from an MTZ with sftools. The program is on my website at 
http://129.128.24.248/highlights.html. There is a linux executable as 
well as the source code that should compile with any standard fortran 
compiler.


Bart

On 11-04-01 09:16 AM, Ed Pozharski wrote:

I need to calculate the electron density values for a list of spatial
locations (e.g. atom positions in a model) using an mtz-file that
already contains map coefficients.  To write my own code may be easier
than I think (if one can manipulate mtz columns, isn't the only problem
left how to incorporate symmetry-related reflections?), but I would need
an alternative at least for troubleshooting purposes. So,

Does anyone know of a software tool that can calculate point electron
density for every atom in a structure?

If I would have to bring a dependency into this, the best choice for me
would be clipper libs.

Thanks in advance,

Ed.




--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] program to calculate electron density at x,y,z (SUMMARY)

2011-04-12 Thread Bart Hazes
That is exactly what HYDENS is doing. A good interpolation with small 
grid steps should be equally good but with current computers and just a 
few hundred or even thousand points to evaluate, a classical Fourier 
summation is pretty fast and, for me, easier to program than a proper 
cubic-spline interpolation.
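
For what it is worth, the core of such a summation is only a few lines. A
minimal sketch (assuming a plain text file with columns h k l F phi, phi in
degrees, covering a unique hemisphere with Friedel mates implied; the F(000)
term and the 1/V scale are left out, so the values are on an arbitrary
scale, and "hemisphere.hkl" and the coordinates are placeholders):

import numpy as np

def density_at(points, hkl_file):
    """Direct Fourier summation at arbitrary fractional coordinates."""
    h, k, l, F, phi = np.loadtxt(hkl_file, unpack=True)
    phi = np.deg2rad(phi)
    rho = []
    for x, y, z in points:
        arg = 2.0 * np.pi * (h * x + k * y + l * z) - phi
        # factor 2: each reflection stands in for itself and its Friedel mate
        rho.append(2.0 * np.sum(F * np.cos(arg)))
    return np.array(rho)

print(density_at([(0.12, 0.30, 0.45), (0.50, 0.25, 0.10)], "hemisphere.hkl"))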


Bart

On 11-04-12 08:54 PM, Edward A. Berry wrote:

Pavel Afonine wrote:

Hi Ed,

yes, this is the eight-point interpolation, but since you can select to
choose very small grid step for the map calculation (grid_step
parameter), I hope this should be ok. If necessary, I can add an option
so it will give you the map value at the closest grid point instead of
interpolation or even both (although I guess the  latter would be too 
much).


What about doing the Fourier summation at the precise location requested,
in order to not calculate the map or interpolate at all?
Input would be the mtz file rather than map file.
eab


In the next build (dev-728 and up) it will be possible to use a PDB file
as a source of points.

Pavel.

On Tue, Apr 12, 2011 at 9:37 AM, Ed Pozharski mailto:epozh...@umaryland.edu>> wrote:

On Fri, 2011-04-08 at 18:06 -0700, Pavel Afonine wrote:
> phenix.map_value_at_point map_coeffs.mtz label="2FOFC" point="1 2 3"
> point="4 5 6"

Cool.  Afaiu, this is interpolation.  A useful extension would be
automatic picking of (x,y,z) from a pdb-file (a la mapman), 
although a
determined person can definitely come up with a script that 
converts a

pdb file into a list of "point" statements.

--
"Hurry up before we all come back to our senses!"
   Julian, King of Lemurs






--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] Off topic about program for multiple protein sequence alignment

2012-05-15 Thread Bart Hazes
mafft and muscle are both faster than clustalw and on average more 
accurate. You can also use different options for mafft to push for speed 
or accuracy depending on your needs and patience. Tcoffee has a flavour 
that includes structural information if available to assist alignment. 
Another flavour, MCoffee, runs a set of different alignment programs, 
including clustalw, mafft ..., and assembles the most consistent 
alignment based on all of them. If you are dealing with a non-trivial 
alignment and its accuracy matters to you then you should always take a 
look at it because even the best programs sometimes make obvious 
mistakes. It can also pay off to use two alignment programs based on 
different methodologies. But don't waste too much time staring at poorly 
defined regions, as they may simply not be structurally or evolutionarily 
equivalent.
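
For completeness, running any of these from a script is trivial; for
example MAFFT in its automatic mode, assuming mafft is on your PATH and
"input.fasta" is a placeholder for your sequence file (MAFFT writes the
alignment to standard output):

import subprocess

# --auto lets MAFFT pick a fast or a more accurate algorithm depending on
# the size of the input.
with open("aligned.fasta", "w") as out:
    subprocess.run(["mafft", "--auto", "input.fasta"], stdout=out, check=True)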


Bart

On 12-05-15 03:50 AM, martyn.w...@stfc.ac.uk wrote:

Certainly different programs and different scoring matrices will give different 
answers. There is not necessarily a correct answer either, just different 
educated guesses. With high sequence identity, the answers should be fairly 
consistent. As you reduce the sequence identity (i.e. as it gets more 
interesting) then the answers will vary more.

I think clustalw is generally considered to be one of the poorer multiple 
alignment programs these days (though I am sure opinions will vary here). By 
poor, I mean it struggles in the twilight zone of 25 - 30% seq identity, but is 
perfectly adequate for routine use. There are many other programs to choose 
from: probcons, mafft, TCoffee, Muscle, etc etc. In some tests of MrBUMP, we 
found some marginal cases that relied on using an alignment from probcons or 
mafft, rather than clustalw.

HTH
Martyn



-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
Tim Gruene
Sent: 15 May 2012 08:20
To: ccp4bb
Subject: Re: [ccp4bb] Off topic about program for multiple protein
sequence alignment

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Dear Donghui,

even within one program (clustalw) you would get different results by
picking different weighting schemes (clustalx: Aligment->Alignment
Parameter ->  Multiple Alignment Parameter: BLOSUM, PAM, Gonnet...).

As with any software I would assume the developers know best what they
are doing and recommend to stick to the defaults unless you know what
you are doing.

Regards,
Tim

On 05/15/12 08:02, wu donghui wrote:

Dear all,

I want to know your suggestions about current protein sequence
alignment programs. It seems that different programs give different
alignment results such as from analysis of Clustal W and MULTALIN.
Thanks for any input or comments.

Best regards,

Donghui


- --
- --
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iD8DBQFPsgOYUxlJ7aRr7hoRArwKAKCqhK2012Ihnof7xjRzyao7GI8xpgCfZj/G
VfD3ly3bmqycO0mX888oYfE=
=7v6r
-END PGP SIGNATURE-


Re: [ccp4bb] Philosophical question

2013-03-19 Thread Bart Hazes
Just search for genetic code evolution in pubmed and you will find tons of
literature on it. The main driving force appears to have been to minimize
physico-chemical changes in amino acid properties for frequent mutations.
In other words, if you take mutation rates at the single-nucleotide level
and use it to predict, via a codon table, the rates of amino acid mutations
you will find that it correlates strongly with the observed amino acid
rates.

Bart

On Tue, Mar 19, 2013 at 8:34 AM, Jacob Keller <
j-kell...@fsm.northwestern.edu> wrote:

> Never one to shrink from philosophizing, I wonder generally why the codon
> conventions are the way they are? Is it like the QWERTY keyboard--basically
> an historical accident--or is there some more beautiful reason? One might
> argue that since basically all organisms share the convention (are there
> exceptions, even?), that it must be the "best of all possible" conventions.
> I have often wondered whether maybe this particular convention allows for
> the most effective pathways between proteins of significant function, e.g.,
> through the fewest mutations perhaps? One certainly cannot maintain that
> every possible protein sequence has been made at some time or another in
> the history of the biological world (go quantitate!) so there must be a way
> to ensure that mostly the "best" ones got made. On the other hand, since
> many organisms share DNA, maybe they had to "agree" on a system (I think
> this is the dogma?). Was there a "United Organisms" convention at some
> point, reminiscent of "Les Immortels" of the French language or POSIX or
> something, to ensure compliance? What was the penalty for non-compliance?
>
> Anyway, I like the question about the methionines,
>
> Jacob
>
> On Tue, Mar 19, 2013 at 9:46 AM, Edward A. Berry wrote:
>
>> Opher Gileadi wrote:
>>
>>> Hi Theresa,
>>>
>>> To add to Anat's comments: Although the AUG codon for the first
>>> methionine and all other methionines in a protein coding sequence look the
>>> same, they are read in a very different way by the ribosomal machinery. The
>>> first AUG is recognized by the initiation complex, which includes the
>>> separate small ribosomal subunit (40s), a special tRNA-methionine, and
>>> initiation factors (proteins) including eIF2. This leads to assembly of a
>>> complete ribosome and initiation of protein synthesis. Subsequently, in the
>>> process of elongation, AUG codons are read by a different tRNA, which is
>>> brought to the 80s ribosome bound to a protein called elongation factor 1a.
>>> This is an oversimplification, of course, but the point is that the
>>> initiation codon (=the first amino acid to be incorporated to the protein)
>>> is read by a special tRNA, hence the universal use of methionine.
>>>
>>> Opher
>>>
>>>  Yes, but why methionine? Half the time it has to be removed by
>> N-terminal peptidase to give a small first residue, or by leader sequence
>> processing. Why use a big expensive amino acid instead of choosing one of
>> the glycine codons? Is there an obvious reason, or just "it had to be
>> something, and Met happened to get selected"?
>>
>> And why sometimes alternate start codons can be used? and why doesn't
>> initiation occur also at methionines in the middle of proteins? I'm
>> guessing it has to do with 5' untranslated region and ribosome binding
>> sites. So could the start codon actually be anything you want, provided
>> there is a strong ribosome binding site there?
>>
>> Just being philosophical, and not afraid to display my ignorance,
>> eab
>>
>
>
>
> --
> ***
>
> Jacob Pearson Keller, PhD
>
> Looger Lab/HHMI Janelia Farms Research Campus
>
> 19700 Helix Dr, Ashburn, VA 20147
>
> email: kell...@janelia.hhmi.org
>
> ***
>



-- 

Bart Hazes
Associate Professor
Dept. of Medical Microbiology & Immunology
University of Alberta


Re: [ccp4bb] Philosophical question

2013-03-19 Thread Bart Hazes
It is so intolerant to change because reassigning a codon to a different
amino acid type or to a stop codon affects thousands of proteins that use that
codon simultaneously. The probability that none of those mutations is
deleterious is extremely small.

Genetic code changes are more common in the mitochondrial code. First of
all the mitochondrial genome is much smaller, ~16kb for vertebrates.
Moreover, in the cases I have looked at, the change in codon use seems to happen
only after an extreme bias against using that codon has first developed. When a
codon is (almost) not used at all it can be re-purposed without affecting
any proteins.

Bart

On Tue, Mar 19, 2013 at 2:05 PM, Jacob Keller <
j-kell...@fsm.northwestern.edu> wrote:

> I don't understand this argument, as it would apply equally to all
> features of the theoretical LUCA
>
>> No it won't.  Different features would have different tolerance levels to
>> modifications.
>
>
> Yes, this "tolerance" is the second (hidden or implicit) principle I
> referred to. So you'd have to explain why the codon convention is so
> intolerant/invariant relative to the other features--it seems to me that
> either it is at an optimum or there is some big barrier holding it in
> place. And you'd have to explain this without invoking interchange of DNA,
> viruses, etc, as we're talking about a LUCA here, right? And you'll have to
> make sure that whatever reason you invoke cannot be applied to other
> features of this LUCA which are indeed seen to be variable.
>
> JPK
>
>
> ***
>
> Jacob Pearson Keller, PhD
>
> Looger Lab/HHMI Janelia Farms Research Campus
>
> 19700 Helix Dr, Ashburn, VA 20147
>
> email: kell...@janelia.hhmi.org
>
> ***
>



-- 

Bart Hazes
Associate Professor
Dept. of Medical Microbiology & Immunology
University of Alberta


Re: [ccp4bb] twinned?

2008-04-02 Thread Bart Hazes

Hi Qiang,

A normal data set has a unimodal intensity distribution with a 
predictable shape. When there is twinning the distribution remains 
unimodal but becomes sharper and this is picked up in the twinning 
analysis. When there is pseudo-translational symmetry, as you indicate 
you have, then the intensity distribution becomes bimodal with one set 
of reflections systematically strengthened and another systematically 
weakened. This makes the whole distribution broader, just the opposite 
of what twinning does, and therefore shows up as "negative twinning" in 
the analysis.
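
If you want to see where those numbers come from, the moments quoted by
detect_twinning are easy to compute yourself from a list of acentric
intensities; a quick sketch:

import numpy as np

def intensity_moments(I):
    """<|I|^2>/(<|I|>)^2 and (<|F|>)^2/<|F|^2> for acentric intensities:
    ~2.0 and ~0.785 for untwinned data, ~1.5 and ~0.865 for a perfect twin;
    larger and smaller, respectively, with pseudo-translation."""
    I = np.asarray(I, dtype=float)
    I = I[I > 0]               # crude guard; weak/negative intensities need
                               # more careful treatment in real data
    F = np.sqrt(I)
    return np.mean(I**2) / np.mean(I)**2, np.mean(F)**2 / np.mean(F**2)

# Sanity check with exponentially distributed (acentric Wilson) intensities:
rng = np.random.default_rng(0)
print(intensity_moments(rng.exponential(1.0, 100000)))   # ~ (2.0, 0.785)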


Bart

Qiang Chen wrote:

Hi all,

The data I am working on has a strong translation vector. The space group
is C2221 and resolution is 2.3 angstrom. There are two molecules per AU
with a pseudo-2-fold axis.
On the cumulative intensity distribution plot, the theor and obser curves
totally do not overlap. I did "detect_twinning" from CNS, and there is the
result:

  <|I|^2>/(<|I|>)^2  = 3.2236 (2.0   for untwinned, 1.5   for twinned)
  (<|F|>)^2/<|F|^2>  = 0.6937 (0.785 for untwinned, 0.865 for twinned)
Does the result mean my data is not twinned?

Any suggestion will be highly appreciated.
Thank you!






--

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] twinned?

2008-04-03 Thread Bart Hazes
I just realized that this is an orthorhombic C222(1) space group. I 
didn't check, but unless two of the cell dimensions are nearly 
identical I think merohedral twinning is not possible for this space 
group, because the symmetry of the unit cell shape is not higher than 
the symmetry of the space group.


Bart

Eleanor Dodson wrote:
It is not really possible to detect twinning by the simple moment and 
cumulative distribution tests for  data from a crystal with pseudo 
translation. As Bart says, twinning  decreases the value of the moments, 
whilst pseudo-translation increases them, so the two effects tend to 
cancel out. There is a reference to the L test: J. Padilla & T. O. 
Yeates. A statistic for local intensity differences: robustness to 
anisotropy and pseudo-centering and utility for detecting twinning. 
/Acta Crystallogr./ *D59*, 1124-30, 2003. 
<http://scripts.iucr.org/cgi-bin/paper?S0907444903007947>S They suggest 
using neighbouring reflection pairs to test the L statistic. This can often overcome 
the problem associated with pseudo-translation. However it is quite 
sensitive to data quality.

See http://nihserver.mbi.ucla.edu/pystats/

 Eleanor


Bart Hazes wrote:


Hi Qiang,

A normal data set has a unimodal intensity distribution with a 
predictable shape. When there is twinning the distribution remains 
unimodal but becomes sharper and this is picked up in the twinning 
analysis. When there is pseudo-translational symmetry, as you indicate 
you have, then the intensity distribution becomes bimodal with one set 
of reflections systematically strengthened and another systematically 
weakened. This makes the whole distribution broader, just the opposite 
of what twinning does, and therefore shows up as "negative twinning" 
in the analysis.


Bart

Qiang Chen wrote:


Hi all,

The data I am working on has a strong translation vector. The space 
group

is C2221 and resolution is 2.3 angstrom. There are two molecules per AU
with a pseudo-2-fold axis.
On the cumulative intensity distribution plot, the theor and obser 
curves
totally do not overlap. I did "detect_twinning" from CNS, and there 
is the

result:

  <|I|^2>/(<|I|>)^2  = 3.2236 (2.0   for untwinned, 1.5   for twinned)
  (<|F|>)^2/<|F|^2>  = 0.6937 (0.785 for untwinned, 0.865 for twinned)
Does the result mean my data is not twinned?

Any suggestion will be highly appreciated.
Thank you!












--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] twinned?

2008-04-03 Thread Bart Hazes
Good point, if you mis-assign a P2(1) space group as C222(1) because of 
the twinning-generated apparent extra 2-fold symmetry then you could get 
into such a situation.


If the P2(1) space group only has space for a monomer in the asymmetric 
unit then Vm will point out the problem, but if the monoclinic cell 
already has NCS then this can be tricky.


If the monoclinic cell has 2-fold NCS with pseudo-222 characteristics 
then it may be almost impossible to detect twinning because the 
twin-related reflections will be strongly correlated. As a result, 
averaging the twin-related reflections will not affect the intensity 
distribution and the twinning analysis will fail.


However, if the pseudo-symmetry deviates only slightly from 
crystallographic symmetry, you may end up happily solving the structure, 
with little evidence that there even was a problem. The final structure 
would be largely correct apart from areas where the pseudo-symmetry 
deviates from true crystallographic symmetry.


Bart

Poul Nissen wrote:
Check this paper below - a C222(1) space group (a=212, b= 300, c=575) 
frequently appearing as a merohedral twin P2(1) with apparent C222(1) 
symmetry was exactly a major problem in the H. marismortui 50S structure 
determination. 


Poul

Ban N, Nissen P, Hansen J, Capel M, Moore PB, Steitz TA. 
<http://www.ncbi.nlm.nih.gov/pubmed/10476961?ordinalpos=6&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum>
Abstract 
<http://www.ncbi.nlm.nih.gov/pubmed/10476961?ordinalpos=6&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum>
Placement of protein and RNA structures into a 5 A-resolution map of the 
50S ribosomal subunit.

Nature. 1999 Aug 26;400(6747):841-7.


On 03/04/2008, at 17.48, Bart Hazes wrote:

I just realized that this is an orthorhombic C222(1) space group. I 
didn't check it up but unless two of the cell-dimensions are nearly 
identical I think merohedral twinning is not possible for this space 
group, because the symmetry of the unit cell shape is not higher than 
the symmetry of the space group.


Bart

Eleanor Dodson wrote:

It is not really possible to detect twinning by the simple moment and 
cumulative distribution tests for data from a crystal with pseudo 
translation. As Bart says, twinning decreases the value of the 
moments, whilst pseudo-translation increases them, so the two effects 
tend to cancel out. There is a reference to the L test: J. Padilla & 
T. O. Yeates. A statistic for local intensity differences: robustness 
to anisotropy and pseudo-centering and utility for detecting 
twinning. Acta Crystallogr. D59, 1124-30, 2003. 
<http://scripts.iucr.org/cgi-bin/paper?S0907444903007947> They 
suggest using neighbouring reflection pairs for the test. This can 
often overcome the problem associated with pseudo-translation. 
However it is quite sensitive to data quality.



See http://nihserver.mbi.ucla.edu/pystats/



Eleanor



Bart Hazes wrote:



Hi Qiang,




A normal data set has a unimodal intensity distribution with a 
predictable shape. When there is twinning the distribution remains 
unimodal but becomes sharper and this is picked up in the twinning 
analysis. When there is pseudo-translational symmetry, as you 
indicate you have, then the intensity distribution becomes bimodal 
with one set of reflections systematically strengthened and another 
systematically weakened. This makes the whole distribution broader, 
just the opposite of what twinning does, and therefore shows up as 
"negative twinning" in the analysis.




Bart
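To put numbers on this, a minimal numpy sketch of the two statistics 
quoted above; the `intensities` array is just a stand-in for your own 
merged acentric data, and real programs evaluate these moments per 
resolution shell on normalized intensities:

import numpy as np

def intensity_moments(intensities):
    # Second moment of I and (<F>)^2 / <F^2> for acentric data.
    # ~2.0 and ~0.785 are expected for untwinned data, ~1.5 and ~0.865
    # for a perfect twin; pseudo-translation pushes the second moment
    # above 2.0 ("negative twinning").
    I = np.asarray(intensities, dtype=float)
    I = I[I > 0]                      # crude: drop non-positive intensities
    F = np.sqrt(I)
    return np.mean(I**2) / np.mean(I)**2, np.mean(F)**2 / np.mean(F**2)

# sanity check with an ideal acentric (exponential) Wilson distribution
rng = np.random.default_rng(0)
print(intensity_moments(rng.exponential(1.0, 100000)))   # ~ (2.0, 0.785)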




Qiang Chen wrote:





Hi all,




The data I am working on has a strong translation vector. The space 
group



is C2221 and resolution is 2.3 angstrom. There are two molecules per AU



with a pseudo-2-fold axis.


On the cumulative intensity distribution plot, the theor and obser 
curves


totally do not overlap. I did "detect_twinning" from CNS, and there 
is the



result:




 <|I|^2>/(<|I|>)^2  = 3.2236 (2.0   for untwinned, 1.5   for twinned)



 (<|F|>)^2/<|F|^2>  = 0.6937 (0.785 for untwinned, 0.865 for twinned)



Does the result mean my data is not twinned?




Any suggestion will be highly appreciated.



Thank you!















--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology &

Re: [ccp4bb] which concentrated salt has lowest vapour pressure?

2008-04-07 Thread Bart Hazes

George M. Sheldrick wrote:
Li salts tend to increase the solubility of peptides (Seebach et al., 
Helv. Chim. Acta 72 (1989) 857-867), which is a pity, because they can 
also be used as cryoprotectants.


George


But that is not an issue if you add the salt only to the reservoir. The 
same is not necessarily true for AS, because ammonium is volatile. Janet 
Newman was successful with AS in her paper (Acta Cryst D61, 490-3) but 
for us it didn't work well (Dunlop & Hazes, Acta Cryst D61, 1041-8, 2005).
We did take some measurements of the vapour pressure reduction caused by 
various reagents. All PEGs are VERY poor. There is some variation 
between salts but the effect is clearly concentration dependent, as it 
should be, so if you want the maximum dehydrating power I would still go 
with LiCl. However, unless there is already a lot of salt in the 
crystallization drops you may dry out the drops completely if you push 
it too far.


Bart


Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-2582


On Mon, 7 Apr 2008, Kay Diederichs wrote:



Dear all,

a protein which we work on is available in low quantity. The only
crystallization screen we set up is completely clear, no precipitate, nothing.

Now we would like to modify the reservoirs of this screen, by adding LiCl or
ammonium sulfate or ..., with the goal of reducing the vapour pressure, to at
least get the protein concentration in the drop into the range where
"something happens".

Does anyone have advice as to which salt we should add (to the reservoir
only)? AmSO4 is only soluble to 4M, LiCl goes to 10M. But vapour pressure
reduction is not the same as molarity.

thanks for any insight,

Kay
--
Kay Diederichshttp://strucbio.biologie.uni-konstanz.de
email: [EMAIL PROTECTED]Tel +49 7531 88 4049 Fax 3183
Fachbereich Biologie, Universität Konstanz, Box M647, D-78457 Konstanz




--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] crystallisation robot

2008-04-14 Thread Bart Hazes

Hi Joe,

We have a 32-head Honeybee robot which sets up a 96-well plate with a 
single drop per well in ~6 minutes and 3 drops per well at ~9 minutes. A 
96-head phoenix or hummingbird-like system is likely going to be faster, 
but not by that much. Our robot has its own humidified cabinet and as 
long as you keep evaporation under control I don't think set-up speed, 
within reason, is that important.


Bart

JOE CRYSTAL wrote:

Hi,


Does anyone have information about how long it takes to set up a 96-well 
tray for the crystallization robots available?  Besides cost per tray 
and maintenance cost, another important feature we consider is the time 
for setting up a 96-well tray.  It is an important factor since we are 
talking about sub-microliter drops.



Best,


Joe

On Fri, Jan 18, 2008 at 12:28 PM, Lisa A Nagy <[EMAIL PROTECTED] 
<mailto:[EMAIL PROTECTED]>> wrote:


Al's Oil on the plates:
What a nightmare!!!
The oil creeps up the plate and over the sides. It dissolves adhesives.
It makes me say bad words in multiple languages.
Bigger drops + no oil = fewer bad words.

Lisa
--
Lisa A. Nagy, Ph.D.
University of Alabama-Birmingham
[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK
<mailto:CCP4BB@JISCMAIL.AC.UK>] On Behalf Of
Patrick Shaw Stewart
Sent: Friday, January 18, 2008 2:20 AM
To: CCP4BB@JISCMAIL.AC.UK <mailto:CCP4BB@JISCMAIL.AC.UK>
Subject: [ccp4bb] Fwd: [ccp4bb] crystallisation robot

One thing that people often overlook is that quite a lot of protein
can be lost by denaturation on the surface of the drop.  This is more
significant for smaller drops.  Two suggestions: (1) increase the
proportion of protein in the - technical term - teeny drop to say two
thirds and (2) cover the drops with oil eg Al's oils
(silicone/paraffin).  You still get vapor diffusion through the oil,
and you'd like to slow up equilibration.  Of course (2) slows up the
robotics a little, but both should be trivial to set up.





--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] Negative density around C of COO-

2008-05-05 Thread Bart Hazes

Hi James,

We used to talk about primary and secondary radiation damage. The latter 
operates at room temperature, where free radicals were said to be formed 
in solution and diffuse around to damage proteins. Under cryoconditions 
this no longer happens, leading to greatly improved crystal lifetimes, 
but we still have primary radiation damage, with the photons directly 
hitting the protein.


It was my understanding that this was still considered to form a free 
radical at the affected atom without there being any diffusion involved. 
Sulfur atoms would be more sensitive as they have a larger X-ray 
cross-section or because they may act as free-radical sinks where free 
radicals generated nearby strip an electron from the sulfur, thereby 
satisfying their own electronic configuration and converting the sulfur 
into a radical state. For instance, in ribonucleotide reductase a 
tyrosine free radical is formed spontaneously (using oxygen and a 
dinuclear iron site) and "jumps" over 20 Angstrom from one subunit to 
another to form a thiyl free radical in the active site. It then "jumps" 
back to the tyrosine upon completion of the catalytic cycle. Although we 
don't know how it jumps, certainly not by diffusion, there is general 
agreement that it does happen.


People have also observed broken disulfides in cryocrystal structures 
with the sulfurs at a distance that is too long for a disulfide but too 
short for a normal non-bonded sulfur-sulfur interaction. I seem to 
remember that this distance was suggested to indicate the presence of a 
thiyl free radical. I'm no chemist or physicist so I can't evaluate if 
that claim is reasonable but if it is then that would be direct evidence 
to support the involvement of a free radical state in radiation damage.


So I guess my questions/comments are
- what are the great many good reasons to think that free radicals do 
NOT play a role in radiation damage under cryo.
- although diffusion does not happen below 130K, radicals do appear to 
teleport, at least over short distances.


Bart

James Holton wrote:
I don't mean to single anyone out, but the assignment of "free radicals" 
as the species mediating radiation damage at cryo temperatures is a "pet 
peeve" of mine.  Free radicals have been shown to mediate damage at room 
temperature (and there is a VERY large body of literature on this), but 
there are a great many good reasons to think that free radicals do NOT 
play a role in radiation damage under cryo.


This "assignment" of free radicals to damage is often made (flippantly) 
in the literature, but I feel a strong need to point out that there is 
NO EVIDENCE of a free radical diffusion mechanism for radiation damage 
below ~130K.  To the contrary there is a great deal of evidence that 
water, buffers and protein crystals below ~130 K are in a state of 
matter known as a "solid", and molecules (such as free radicals) do not 
diffuse through solids (except on geological timescales).  If you are 
worried that the x-ray beam is heating your crystal to >130 K, then have 
a look at Snell et. al. JSR 14 109-15 (2007).  They showed quite 
convincingly that this just can't happen for anything but the most 
exotic situations.


There is evidence, however, of energy transfer taking place between 
different regions of the crystal, but energy transfer does not require 
molecular diffusion or any other kind of mass transport.  In fact, 
solid-state chemistry is generally mediated by cascading 
neighbor-to-neighbor reactions that do not involve "diffusion" in the 
traditional sense.  Electricity is an example of this kind of chemistry, 
and these reactions are a LOT faster than diffusion.  The closest 
analogy to "diffusion" is that the propagating reaction can be seen as a 
"species" of sorts that is moving around inside the sample.  Entities 
like this are formally called quasiparticles.  Some quasiparticles are 
charged, but others are not.  If you don't know what a quasiparticle is, 
you can look them up in wikipedia.
Some have tried to rescue the "free radical" statements about radiation 
damage by claiming that individual electrons are "radicals".  I guess 
this must come from the "pressure" of such a large body of free-radical 
literature at room temperature.  However, IMHO this is about as useful 
as declaring that every chemical reaction is a "free radical" reaction 
(since they involve the movement of electrons).   I think it best that 
we try to call the chemistry what it is and try to stamp out rumors that 
mechanisms are known when in reality they are not.


Just my little rant.

-James Holton
MAD Scientist





--

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] mutation to cysteines

2008-05-29 Thread Bart Hazes

amit sharma wrote:

Dear All,
 
Sorry for a non-CCP4 question. I intend to mutate a couple of residues 
to cysteines(so that they form a disulphide linkage) in a certain region 
of my protein. Could I please be directed to program(s) that would 
reliably let me do that, prior to primer designing?
 
Thanks in advance,

Amit
 
 


One more option is to use my first ever fortran program SSBOND. You can 
use it via a web interface at

http://eagle.mmid.med.ualberta.ca/forms/ssbond.html

It has been cited 57 times, I didn't check them all, but from what I've 
heard it works well.


Bart


======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] Using multiple crystals for structure solution in P1 using MAD/SAS/SAD

2008-07-23 Thread Bart Hazes
Increasing redundancy only helps if all data draw from the same 
distribution so you get a more accurate estimate of the mean of the 
distribution. When dealing with different crystals, crystal-to-crystal 
variation is likely larger than the anomalous signal you are looking for 
and I'm therefore not convinced that merging of data is a good idea 
(never hurts to try though).


I wonder if it would work better to derive anomalous differences for the 
individual data sets first and then merge those anomalous differences. 
This may allow the subtraction between F+ and F- to remove some of the 
systematic differences there may be between crystal forms.


Bart
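Just to make the order of operations explicit, a small Python sketch; 
the per-crystal dictionaries of (F+, F-) pairs are hypothetical 
stand-ins, not the output format of any particular program:

import numpy as np
from collections import defaultdict

def merged_anomalous_differences(datasets):
    # datasets: list of dicts {(h, k, l): (f_plus, f_minus)}, one per crystal.
    # Compute dF = F(+) - F(-) within each crystal first, then average the
    # dF values over crystals, so systematic between-crystal differences
    # largely cancel in the subtraction.
    per_hkl = defaultdict(list)
    for data in datasets:
        for hkl, (f_plus, f_minus) in data.items():
            per_hkl[hkl].append(f_plus - f_minus)
    return {hkl: float(np.mean(dfs)) for hkl, dfs in per_hkl.items()}

# toy usage with two hypothetical crystals
xtal1 = {(1, 2, 3): (105.0, 100.0), (2, 0, 0): (50.0, 52.0)}
xtal2 = {(1, 2, 3): (110.0, 104.0), (2, 0, 0): (48.0, 50.5)}
print(merged_anomalous_differences([xtal1, xtal2]))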

Kay Diederichs wrote:

hari jayaram schrieb:
...

I was wondering if anyone could comment on combining datasets from 
multiple P1 crystals to increase the redundancy even further for such 
heavy atom ( SAS / SAD ) or MAD experiments.




Hari,

well, my comment would be that it should be possible in principle from 
what you describe, but the outcome strongly depends on the details (size 
of expected and observed anomalous and isomorphous signal, internal 
anomalous correlation coefficients, I/sigma and R-factors, radiation 
damage, are crystals isomorphous, ...).


To increase the quality of the reduced data it would be advisable to 
rotate around different axes, which is possible at some - but not all - 
beamlines. This is even more true in P1.


For all of the major data reduction programs there exist specific 
programs for merging data, and it does make a lot of sense to merge your 
passes (but don't merge radiation-damaged data with undamaged data)!. I 
would suggest to use at least two different data reduction packages - 
everything depends on the quality of the data reduction, and the 
programs have strengths in different areas.


HTH,

Kay



--

======

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] Using multiple crystals for structure solution in P1 using MAD/SAS/SAD

2008-07-23 Thread Bart Hazes

Jacob Keller wrote:
Shouldn't all of the "crystal-to-crystal" differences be taken out 
automatically by scaling,


Scaling only takes out differences in overall scale, B-factor and, if 
you have enough data, it can correct to some extent for absorption or 
other more local effects. Systematic differences due to slight 
differences in unit cell, molecular packing etc lead to different 
relative intensities that are not removed by scaling.
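As a minimal illustration of what "overall scale and B-factor" means 
here, assuming you have arrays of common positive intensities from the 
two data sets and their sin(theta)/lambda values, you can fit 
ln(I1/I2) = ln k - 2*B*s**2 by least squares:

import numpy as np

def relative_scale(i_ref, i_other, s):
    # Fit k and B such that k * exp(-2*B*s**2) * i_other ~ i_ref,
    # where s = sin(theta)/lambda.  Only overall scale and B are removed;
    # any reflection-by-reflection non-isomorphism stays.
    i_ref, i_other, s = (np.asarray(a, float) for a in (i_ref, i_other, s))
    y = np.log(i_ref / i_other)                 # ln(I1/I2) = ln k - 2*B*s^2
    slope, intercept = np.polyfit(s**2, y, 1)
    return float(np.exp(intercept)), float(-slope / 2.0)

# toy usage: recover k=2, B=20 from synthetic data
s = np.linspace(0.05, 0.25, 200)
i2 = np.random.default_rng(1).uniform(10, 100, s.size)
i1 = 2.0 * np.exp(-2 * 20.0 * s**2) * i2
print(relative_scale(i1, i2, s))                # ~ (2.0, 20.0)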


and is there not the same proportional 
anomalous signal in every isomorphous crystal, regardless of the 
background? I would think that using multiple crystals would give a 
better idea of "the truth," as if taking many snapshots of the same 
object, and putting them together to form a three-dimensional object. In 
Hazes' language, don't all isomorphous crystals "draw from the same 
[underlying] distribution?"


The answer is yes when the crystals are truly isomorphous. In reality 
they rarely if ever are. The differences tend to be small enough that 
you normally don't have to worry about it for heavy atom derivatives or 
native data sets. However, for weak anomalous signals it is a different 
story.


Bart


Jacob Keller

ps admittedly if there is radiation damage or other non-isomorphisms, 
this reasoning does not apply.


***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: [EMAIL PROTECTED]
*******

- Original Message - From: "Bart Hazes" <[EMAIL PROTECTED]>
To: 
Sent: Wednesday, July 23, 2008 10:05 AM
Subject: Re: [ccp4bb] Using multiple crystals for structure solution in 
P1 using MAD/SAS/SAD



Increasing redundancy only helps if all data draw from the same 
distribution so you get a more accurate estimate of the mean of the 
distribution. When dealing with different crystals, crystal-to-crystal 
variation is likely larger than the anomalous signal you are looking 
for and I'm therefore not convinced that merging of data is a good 
idea (never hurts to try though).


I wonder if it would work better to derive anomalous differences for 
the individual data sets first and then merge those anomalous 
differences. This may allow the subtraction between F+ and F- to 
remove some of the systematic differences there may be between crystal 
forms.


Bart

Kay Diederichs wrote:


hari jayaram schrieb:
...

I was wondering if anyone could comment on combining datasets from 
multiple P1 crystals to increase the redundancy even further for 
such heavy atom ( SAS / SAD ) or MAD experiments.




Hari,

well, my comment would be that it should be possible in principle 
from what you describe, but the outcome strongly depends on the 
details (size of expected and observed anomalous and isomorphous 
signal, internal anomalous correlation coefficients, I/sigma and 
R-factors, radiation damage, are crystals isomorphous, ...).


To increase the quality of the reduced data it would be advisable to 
rotate around different axes, which is possible at some - but not all 
- beamlines. This is even more true in P1.


For all of the major data reduction programs there exist specific 
programs for merging data, and it does make a lot of sense to merge 
your passes (but don't merge radiation-damaged data with undamaged 
data)!. I would suggest to use at least two different data reduction 
packages - everything depends on the quality of the data reduction, 
and the programs have strengths in different areas.


HTH,

Kay




--

====== 



Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

== 









--

==

Bart Hazes (Assistant Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521

==


Re: [ccp4bb] truncate ignorance

2008-09-08 Thread Bart Hazes
I don't quite get exactly what it is doing with my data and what its 
assumptions are.

From the documentation:

... the "truncate" procedure (keyword TRUNCATE YES, the
default) calculates a best estimate of F from I, sd(I), and
the distribution of intensities in resolution shells (see
below). This has the effect of forcing all negative
observations to be positive, and inflating the weakest
reflections (less than about 3 sd), because an observation
significantly smaller than the average intensity is likely
to be underestimated.
=

But is it really true, with data from nice modern detectors,
that the weaklings are underestimated?


It isn't really an issue of the detector per se, although in
principle you could worry about non-linear response to the
input rate of arriving photons.

In practice the issue, now as it was in 1977 (French&Wilson),
arises from the background estimation, profile fitting, and
rescaling that are applied to the individual pixel contents
before they are bundled up into a nice "Iobs".

I will try to restate the original French & Wilson argument,
avoiding the terminology of maximum likelihood and 


Bayesian statistics.


1) We know the true intensity cannot be negative.
2) The existence of Iobs<0 reflections in the data set means
 that whatever we are doing is producing some values of
 Iobs that are too low.
3) Assuming that all weak-ish reflections are being processed
 equivalently, then whatever we are doing wrong for reflections with
 Iobs near zero on the negative side surely is also going wrong
 for their neighbors that happen to be near Iobs=0 on the positive
 side.
4) So if we "correct" the values of Iobs that went negative, for
 consistency we should also correct the values that are nearly
 the same but didn't quite tip over into the negative range.
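To make point 4 concrete, here is a toy numerical version of the 
acentric case, assuming a Gaussian error model and an exponential 
(Wilson) prior with shell mean Sigma; this is only a sketch of the 
argument, not the actual TRUNCATE implementation:

import numpy as np

def best_J_and_F(i_obs, sig_i, sigma_shell, n=4000):
    # Posterior mean of J (true intensity) and F = sqrt(J) given
    #   likelihood: Iobs ~ N(J, sig_i^2)
    #   prior:      p(J) = exp(-J/Sigma)/Sigma for J >= 0  (acentric Wilson)
    # Even for Iobs <= 0 the estimates come out positive.
    upper = max(i_obs + 10.0 * sig_i, 10.0 * sig_i)
    j = np.linspace(0.0, upper, n)
    w = np.exp(-0.5 * ((i_obs - j) / sig_i) ** 2 - j / sigma_shell)
    w /= w.sum()                      # uniform grid: sums approximate integrals
    return float((j * w).sum()), float((np.sqrt(j) * w).sum())

print(best_J_and_F(-1.0, 2.0, 10.0))   # negative Iobs -> positive <J>, <F>
print(best_J_and_F( 1.0, 2.0, 10.0))   # a weak positive Iobs gets nudged up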



Do I really want to inflate them?


Yes.



Exactly what assumptions is it making about the expected
distributions?


Primarily that
1) The histogram of true Iobs is smooth
2) No true Iobs are negative



How compatible are those assumptions with serious anisotropy
and the wierd Wilson plots that nucleic acids give?


Not relevant



Note the original 1978 French and Wilson paper says:
"It is nevertheless important to validate this agreement for
each set of data independently, as the presence of atoms in
special positions or the existence of noncrystallographic
elements of symmetry (or pseudosymmetry) may abrogate the
application of these prior beliefs for some crystal
structures."


It is true that such things matter when you get down to the
nitty-gritty details of what to use as the "expected distribution".
But *all* plausible expected distributions will be non-negative
and smooth.




Please help truncate my ignorance ...

   Phoebe

==
Phoebe A. Rice
Assoc. Prof., Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723



http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123


RNA is really nifty
DNA is over fifty
We have put them
 both in one book
Please do take a
 really good look
http://www.rsc.org/shop/books/2008/9780854042722.asp





--
Ethan A Merritt
Biomolecular Structure Center
University of Washington, Seattle 98195-7742













--

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521


Re: [ccp4bb] Fobs - Fobs

2009-01-29 Thread Bart Hazes

Hi Rana,

You probably have multiple options suggested to you. One is sftools 
using the CALC command. If the subtraction includes a phase then sftools 
can also do the calculation on the full structure factor.



Plain subtraction of amplitudes ensuring the result is >= 0

READ yourfile.mtz
CALC col  fnew = col f1 col f2 - abs
WRITE yournewfile.mtz

Subtraction of structure factors

CALC (col fnew pnew) = (col f1 p1) (col f2 p1) -

f1, p1 etc are the column labels for amplitude and phase columns 
respectively.


I recommend to run the program interactively and you can use CALC help 
to get more help.


Bart

Rana Refaey wrote:

Hi,

I was wondering if anyone knows what programme I need to use to 
subtract the Fobs of two different crystals from each other.


Regards,
Rana





Re: [ccp4bb] protein folds

2009-02-25 Thread Bart Hazes
Must be even smaller than Daresbury then. They don't even have a 
synchrotron!


Bart

James Holton wrote:

Paul Emsley wrote:

Here's an experiment:

Find a blindfold and put it on.  Oh, but before you do that, take a
map of England and place it on a dartboard.

Now take 56066 darts and throw them at the map on the board.

Take off the blindfold and investigate where the darts hit.  Did you
hit all the towns and cities? You hit London, Birmingham and Leeds
almost certainly.  But what about Brighton, did you get that?  How
about Clitheroe, was that hit? (Does Clitheroe count anyway or is
that just another part of Preston?)

I hope that that provides you some with insight.

Paul.


I discovered recently that there is a little town/village called 
"Holton" just outside of Oxford.  I doubt it would have been hit by 
one of Paul's darts as I, myself, was completely unaware of its 
existence until I was looking at a "local" map of Oxford only last 
month.  It is an amazingly little place.  Only one road, one church 
and a village hall with my name on it.  Perhaps in the grand scheme of 
things, "Holton" is not worthy of much note, but it was still very 
interesting to me.


-James Holton
MAD Scientist



Re: [ccp4bb] .phs file conversion

2009-03-04 Thread Bart Hazes
SFTOOLS should read the phs file and allow you to write it out in a 
number of different formats, including MTZ.


From the command line type:

sftools
read yourfile.phs
write yourfile.mtz
quit

The program will ask a bunch of questions to get space group, unit cell etc.

Bart

John Bruning wrote:

Hi,
 
I have a .phs file with map coefficients that I would like to open in 
pymol.  So, I would like to convert the file to either a ccp4 or cns 
map file, or a file format that pymol will recognize.  I do not have 
an .mtz file with the same map coefficients included.  Can anyone help me?
 
 
Thanks,

John


Re: [ccp4bb] Calculation of angle between two helix of different subunits

2009-04-16 Thread Bart Hazes

Leiman Petr wrote:

Every other week this question comes up!

This is Geometry 101 or beginner's geometry!!!
http://www.euclideanspace.com/maths/algebra/vectors/angleBetween/index.htm

I am not sure if it is possible to understand _anything_ in crystallography if it is not clear how to calculate an angle between two vectors! 


Honestly,

Petr

I don't think many job interviews for crystallographers these days 
include a question on how to calculate the angle between two vectors. 
Sure a background in math, physics, or computation is a valuable asset, 
but it is no longer essential as it was 2-3 decades ago. 
Crystallographers may have dumbed down in the math department but they 
have smartened up in other areas.


Bart
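For the record, the arithmetic itself is only a few lines of Python; 
the two axis vectors are assumed to come from however you define your 
helix axes (e.g. first-to-last CA of each helix), which is not 
specified in the original question:

import numpy as np

def interhelix_angle(axis1, axis2, degrees=True):
    # Angle between two helix-axis direction vectors:
    #   cos(angle) = a.b / (|a| |b|)
    # Without a convention for the direction of each axis the result is
    # only defined modulo 180 degrees.
    a = np.asarray(axis1, float)
    b = np.asarray(axis2, float)
    cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    ang = np.arccos(np.clip(cosang, -1.0, 1.0))
    return np.degrees(ang) if degrees else ang

print(interhelix_angle([1.0, 0.0, 0.0], [1.0, 1.0, 0.0]))   # 45.0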


Re: [ccp4bb] H3 to > 2.0A but low observations:parameter ratio

2009-04-23 Thread Bart Hazes

Hi Francis,

The asymmetric unit volume is approximately proportional to the number 
of atoms in your model, the basis for Vm, with some variation due to 
solvent content. In turn the number of unique observations at a given 
resolution is proportional to asymmetric unit volume. So twice the 
number of atoms means twice the volume and twice the number of 
observations. Consequently, the observation/parameter ratio depends 
mostly on resolution and at 2 Angstrom you should have a much better 
ratio than the one you calculated, unless you have a very low solvent 
content, very low data completeness, or, hopefully, you made an error in 
your calculation :)


Bart
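A back-of-the-envelope version of that argument, assuming roughly 110 Da 
and 8 non-hydrogen atoms per residue and a typical Matthews coefficient 
(none of these numbers come from the original posting):

import math

def obs_to_param_ratio(n_residues, d_min, vm=2.4, completeness=1.0):
    # Rough observations : parameters estimate for a protein crystal.
    # Unique reflections ~ (2*pi/3) * V_asu / d_min**3 (Friedel mates merged),
    # with V_asu ~ Vm * MW and MW ~ 110 Da per residue.
    # Parameters: x, y, z, B for ~8 non-H atoms per residue.
    v_asu = vm * 110.0 * n_residues                          # A^3
    n_unique = (2.0 * math.pi / 3.0) * v_asu / d_min**3 * completeness
    n_param = 4 * 8 * n_residues
    return n_unique / n_param

print(obs_to_param_ratio(102, 2.0))   # ~2, i.e. well above 1 at 2 Angstrom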

Francis E Reyes wrote:

All

It seems I have a case where I have 5595 reflections but my protein is 
about 102 residues. With a mean number of atoms per residue times 4 
parameters for each atom I get about 7833 parameters. So it seems that 
I have an observation : parameter ratio < 1. There is only 1 molecule 
per asu, so there's no hope of using NCS. Phases are to be obtained 
from a pretty good MR solution. Does anyone have any 
suggestions/protocols/references for refining with obs:parameter < 1?


thanks
FR

-
Francis Reyes M.Sc.
215 UCB
University of Colorado at Boulder

gpg --keyserver pgp.mit.edu --recv-keys 67BA8D5D

8AE2 F2F4 90F7 9640 28BC 686F 78FD 6669 67BA 8D5D


Re: [ccp4bb] Software for predicting potential disulfide cross-linking sites?

2009-09-30 Thread Bart Hazes





Peter is correct, SSBOND was written exactly for that purpose. You can
use the form (http://129.128.75.76/forms/ssbond.html) to upload your
PDB file and get the log file back as a webpage. For multi-chain PDBs
it is best to upload a file with just one chain, unless you are
interested in inter-chain disulfide bonds.
The form page also allows you to pick up the source code which should
compile with default settings on standard fortran compilers.

Bart

Peter Zwart wrote:

  I remember this question (and the answer) from 11 years ago!

http://www.ysbl.york.ac.uk/ccp4bb/1998/msg00051.html

HTH

Peter



2009/9/30 Jie Liu :
  
  
Dear All

Does anyone know a software or web-server which can predict potential
disulfide cross-linking sites?

I have solved a crystal structure. Is there a software to read in the
coordinates
file and symmetry information and predict the potential contacting
residues
which I can then mutate to CYS in order to introduce an intermolecular
disulfide bond to stabilize the biological assembly?

Your input is greatly appreciated!

Jie Liu


  
  


  


-- 



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521







Re: [ccp4bb] how to improve Rfree?

2009-10-19 Thread Bart Hazes
The RFREE SHELL command in sftools is another way to select thin shells. 
However, for the purist there is no clean way of getting rid of 
correlations between the Rwork and Rfree sets that I know of.


To do it properly, the thickness of the thin shell should be larger than 
the radius where the G-function has strong features. In practice that 
makes the shells so thick that you can have only a few which likely 
causes artifacts. Even then, the reflections at the edge of the shells 
are still contaminated so you'd have to exclude the ones near the edge 
from both the Rwork and Rfree sets. I'm pretty sure that there has been 
a publication long ago on thick shells (It may have been Ian, Phil or 
some other CCP4BB regular).


An improvement would be to go from shells to donuts, basically the 
intersection of the spherical shell and a plane through the origin 
perpendicular to the NCS rotational symmetry axis. One step further is 
Fred's suggestion to use the explicit NCS relationships to select groups 
of NCS-contaminated reflections, but only works if the NCS symmetry 
forms a closed group. One step further again would be to use an explicit 
low resolution G-function model to find the strongest NCS correlations 
and only group those together.
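In case it helps, a bare-bones thin-shell selection in numpy, a generic 
sketch in the spirit of RFREE SHELL rather than the sftools code: bin 
the reflections uniformly in 1/d**3 so the shells hold similar numbers 
of reflections, then flag every tenth shell as free.

import numpy as np

def thin_shell_free_flags(d_spacing, n_shells=50, free_every=10):
    # Boolean mask selecting whole thin resolution shells for the free set.
    # Shells are equal-volume in reciprocal space (uniform in 1/d**3), so
    # they hold roughly equal numbers of reflections; every free_every-th
    # shell (~10% of the data) is flagged.
    d = np.asarray(d_spacing, float)
    s3 = 1.0 / d**3
    edges = np.linspace(s3.min(), s3.max(), n_shells + 1)
    shell = np.clip(np.digitize(s3, edges) - 1, 0, n_shells - 1)
    return shell % free_every == free_every // 2

# toy usage on made-up resolutions between 20 and 2 A
rng = np.random.default_rng(1)
d = 1.0 / np.cbrt(rng.uniform(1 / 20.0**3, 1 / 2.0**3, 20000))
print(thin_shell_free_flags(d).mean())   # roughly 0.1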


Enough fodder for purists to fight over but when sftools users ask me, 
my general response includes the fact that the higher your NCS the 
greater the NCS-contamination problem, but also the smaller the model 
bias/overfitting problem, assuming NCS restraints are used appropriately 
in refinement. As a reviewer I would have a problem with authors patting 
their backs for a small Rwork-Rfree difference without noting the impact 
of NCS. But if they report a low Rwork-Rfree difference and comment on 
the fact that the difference is likely underestimated due to NCS effects 
I have no problems, with or without thin shells, as long as I feel the 
refinement protocol and final model are acceptable for the biological 
interpretations that they make.


Bart

Vellieux Frederic wrote:

Hi Ian (& ccp4bb'ers),

NCS ties reflections in reciprocal space by the interference 
G-function effect. Nothing more. So you get an R-free value that is 
lower than if you don't have NCS. One should be aware of that, and 
referees should be aware of that.


I currently have a structure that has 12-fold NCS in the asymmetric 
unit. The free R-factor is lower than the R-factor. I expect that 
future referees will not view that kindly.


A number of people have suggested to use different approaches to get 
rid of this reciprocal space binding effect. One of these people (Bart 
Hazes I think, correct me if I'm wrong) suggests to take reflections 
for the R-free as thin shells in reciprocal space. The thin shell is 
omitted completely from the target for refinement (I suppose omitting 
a shell of data completely will also have deleterious effects on the 
refinement, I don't know by how much). Problem is, none of the data 
processing programs or suites of programs has implemented this as far 
as I know. A better approach would be to use the NCS operator (the 
transpose and the inverse of the rotation matrices in fact, including 
the identity matrix for all cases including the cases where the only 
NCS is 1-fold NCS, i.e. the presence of solvent in the asymmetric unit 
or unit cell) to select the subset of reflections that are going to be 
omitted from the refinement target: take one reflection, select all 
equivalents that are bound to it by the interference effect, and 
repeat the process until you have reached the required number of 
reflections to be omitted. But this requires serious programming... 
And someone willing to modify all data processing suites to include 
this approach. But that would satisfy referees because it is the only 
approach that is valid.


Fred.

Ian Tickle wrote:

The problem is that real life is never simple! -
and NCS really messes things up!

Cheers

-- Ian
  


--

========

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] R-sym and R-merge

2010-01-21 Thread Bart Hazes
[Quoted scaling-log output, 30.0 - 3.00 A: a per-shell merging-statistics
table (with 4515 non-positive reflections and 9771 measurements omitted
from the table) and a "Table of <I/sigma(I)> for mean values on output
file"; the numerical columns are not reproduced here.]

Multiplicity-independent and sigma-weighted R-factors :
  PCV = "pooled coefficient of variation"
  R_rim = "redundancy-independent merging R-factor" = R_meas.
  R_pim = "precision-indicating merging R-factor"
  Rw_xxx = sigma-weighted R-factors, where :
  <I> = sum{ w * I }  with  sum{ w } = 1
.. defined as:
  PCV    = sum_h{ SQRT[ sum_j{ |Ij - <I>|**2 }/(n-1) ] } / sum_h{ <I> }
  R_rim  = sum{ SQRT[n/(n-1)] * sum{ |I - <I>| } } / sum{ I }
  R_pim  = sum{ SQRT[1/(n-1)] * sum{ |I - <I>| } } / sum{ I }
  Rw_rim = sum{ SQRT[n/(n-1)] * sum{ w * |I - <I>| } } / sum{ <I> }
  Rw_pim = sum{ SQRT[1/(n-1)] * sum{ w * |I - <I>| } } / sum{ <I> }
  Rw_squ = SQRT[ sum{ w * |I - <I>|**2 } / sum{ <I>**2 } ]
  Rw_lin = sum{ w * |I - <I>| } / sum{ <I> }

[Per-shell table of PCV, R_rim, R_pim, Rw_rim, Rw_pim, Rw_squ and Rw_lin
(30.0 - 3.00 A, ten shells); overall totals: PCV 31.39%, R_rim 30.88%,
R_pim 20.91%, Rw_rim 24.72%, Rw_pim 16.90%, Rw_squ 13.95%, Rw_lin 17.97%.]
  
  
  


-- 



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521







Re: [ccp4bb] Why Do Phases Dominate?

2010-03-18 Thread Bart Hazes

Hi Jacob, here is another, less formal, stab at it.

When you create an electron density map you can think of each amplitude 
F and phase P as forming a vector Vmap that consists of two sub-vectors: 
one that represents the true vector Vtrue and the other representing the 
vector, Vdiff, that connects Vtrue to Vmap. Just try to picture this.


You can now think of calculating three maps based on Vtrue, Vdiff, and 
Vmap. Since FFTs are additive you can consider the map you would 
normally calculate, Vmap, as being the sum of the two others; Vtrue 
being reality and Vdiff being noise.


If you have a phase error of 60 degrees Vdiff will actually already be 
of the same magnitude as Vtrue. If you have random phases you will, on 
average, be 90 degrees off, and Vdiff will be 1.41 times as large as 
Vtrue (sqrt(2)). Even relatively small phase errors give significant 
Vdiff/Vtrue ratios (2*sin(half-the-phase-error) if I'm right)
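To put numbers on that last ratio, a purely illustrative two-liner: the 
noise-to-signal ratio |Vdiff|/|Vtrue| for a pure phase error is 
2*sin(error/2).

import math

for err in (10, 30, 60, 90, 120):            # phase error in degrees
    ratio = 2.0 * math.sin(math.radians(err) / 2.0)
    print(f"{err:3d} deg -> |Vdiff|/|Vtrue| = {ratio:.2f}")
# 60 deg already gives 1.00; random phases (90 deg on average) give 1.41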


You mentioned "Amplitudes as numbers presumably carry at least as much 
information as phases, or perhaps even more, as phases are limited to 
360deg, whereas amplitudes can be anything."


But in reality amplitudes cannot be anything, since they follow a 
Wilson distribution in which the bulk of the amplitudes cluster near 
the peak of the distribution. The real mathematicians can probably tell 
you the expected amplitude error in such a scenario, but that would 
certainly go beyond an intuitive explanation and my own math skills. If 
you look at the experimental errors in the amplitudes they are even 
much smaller: Rmerge tends to be in the 3-10% range, and that is on 
intensities; it will be considerably less on amplitudes.


Bart


Jacob Keller wrote:

Dear Crystallographers,

I have seen many demonstrations of the primacy of phase information 
for determining the outcome of fourier syntheses, but have not been 
able to understand intuitively why this is so. Amplitudes as numbers 
presumably carry at least as much information as phases, or perhaps 
even more, as phases are limited to 360deg, whereas amplitudes can be 
anything. Does anybody have a good way to understand this?


One possible answer is "it is the nature of the Fourier Synthesis to 
emphasize phases." (Which is a pretty unsatisfying answer). But, could 
there be an alternative summation which emphasizes amplitudes? If so, 
that might be handy in our field, where we measure amplitudes...


Regards,

Jacob Keller

***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-kell...@northwestern.edu
***



--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] odd request: add phase error linearly with resolution

2010-03-19 Thread Bart Hazes

Hi Francis,

Check out the CALC command in sftools. It allows you to apply quite a 
number of mathematical operations on MTZ column data, including phases. 
It also has built-in functions to return the resolution of reflections, 
which you can use in your calculation. CALC HELP should explain how to 
use it.


Bart
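If you prefer to do it outside sftools, the same thing can be mocked up 
in numpy on exported columns (the array names below are hypothetical); 
here the error grows linearly in 1/d, from zero at the low-resolution 
end to max_err degrees at the high-resolution end:

import numpy as np

def degrade_phases(phi_deg, d_spacing, max_err=60.0, seed=0):
    # Add a random phase error whose width grows linearly in 1/d.
    # phi_deg and d_spacing are plain arrays exported from your MTZ file.
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi_deg, float)
    inv_d = 1.0 / np.asarray(d_spacing, float)
    frac = (inv_d - inv_d.min()) / (inv_d.max() - inv_d.min())
    sigma = max_err * frac                       # degrees, 0 at low resolution
    return (phi + rng.normal(0.0, 1.0, phi.shape) * sigma) % 360.0

For a Gaussian error of width sigma (in radians) the matching FOM would 
be roughly exp(-sigma**2/2), so the FOM column should be reduced 
accordingly.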

Francis E Reyes wrote:

Hi all

I'd like to add a phase error to my PHIB's and FOM's (experimental phases) that 
increases linearly with higher resolution.. it's akin to taking good phases and 
making them bad. Any approaches on how this can be done?
Thanks
FR

-
Francis Reyes M.Sc.
215 UCB
University of Colorado at Boulder

gpg --keyserver pgp.mit.edu --recv-keys 67BA8D5D

8AE2 F2F4 90F7 9640 28BC  686F 78FD 6669 67BA 8D5D

  


--

====

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] Why Do Phases Dominate?

2010-03-22 Thread Bart Hazes
biguity, and the word multiple is stressed here).
>>> You need at least 2 heavy atom derivatives.
>>> This is equivalent to a sampling
>>> of space with double the frequency as required by
>>> Nyquist-Shannon's theorem.
>>>
>>> Modern approaches use exclusively amplitudes to determine
>>> phase. You either have to go to very high resolution
>>> or OVERSAMPLE. Oversampling is not possible with
>>> crystals, but oversampled data exist at very low
>>> resolution (in the nm-microm-range). But
>>> these data clearly show that amplitudes also carry
>>> phase information once the Nyquist-Shannon theorem
>>> is fulfilled (hence when the amplitudes are oversampled).
>>>
>>> Best
>>> Marius
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Dr.habil. Marius Schmidt
>>> Asst. Professor
>>> University of Wisconsin-Milwaukee
>>> Department of Physics Room 454
>>> 1900 E. Kenwood Blvd.
>>> Milwaukee, WI 53211
>>>
>>> phone: +1-414-229-4338
>>> email: m-schm...@uwm.edu
>>> http://users.physik.tu-muenchen.de/marius/
>>

Dr.habil. Marius Schmidt
Asst. Professor
University of Wisconsin-Milwaukee
Department of Physics Room 454
1900 E. Kenwood Blvd.
Milwaukee, WI 53211

phone: +1-414-229-4338
email: m-schm...@uwm.edu
http://users.physik.tu-muenchen.de/marius/


--

===
* *
* Gerard Bricogne g...@globalphasing.com  *
* *
* Global Phasing Ltd. *
* Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
* Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
* *
=== 




--



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521




Re: [ccp4bb] some questions

2010-05-10 Thread Bart Hazes

On 10-05-10 08:03 AM, Sudhir Kumar wrote:

hi all
this is just a basic query or rather for discussion.
1. What maintains the active state of the protein during
crystallization, under conditions that are altogether different
from those the protein experiences in vivo?
   
The same forces that maintain a protein in solution also maintain it in 
the crystal. Protein-protein interactions that keep the crystal together 
are rather weak and tend to only/mostly affect parts of the protein that 
are already flexible. In other words, if a part of a protein adopts 
multiple conformations, a crystal interaction may stabilize one of 
them. You lose the information about the flexibility but the structure 
you get is still one of the "natural" conformations. Different solution 
conditions can affect structure and you can find examples of pH induced 
changes and probably others. But again experience shows that proteins 
retain their structures through a wide range of conditions, or perhaps 
conditions that mess up the structure simply never crystallize.

2. what is the probability of a nonfunctional state of a protein
getting crystallized?
   
Proteins that have a nonfunctional state in solution because they need 
to be activated, proteolytically cleaved, etc. can be crystallized in 
that state. Proteins that occur in open and closed states may be pushed 
to the closed form by the precipitating agents, and this may 
correspond to an inactive state. One of my projects had crystals of a 
hemocyanin grown at high NaCl concentrations. Cl- is a known allosteric 
inhibitor and it locked the protein in the low affinity state. Another 
project in the lab involved a thioredoxin where the active site cysteine 
became inactive through arsenylation due to the use of cacodylate 
(dimethyl-arsenate). So if you search for it there are examples, but in 
many cases the inactive form still represents one of the physiological 
relevant forms.

3. Is crystal structure the actual structure of the macromolecule or
is it rather near-actual structure?
   
Perhaps the better way to look at it is that proteins do not have one 
"actual structure". They are flexible molecules that can adopt a 
multitude of slightly, and sometimes not so slightly, different 
structures. The core features tend to be well defined but the atomic 
motions that do occur are significant compared to the atomic positional 
errors of crystal structures.

  i apologize if just in case this question is not upto level of discussion.
thanks
   
no problems, I fear our nice atom model images tend to make people 
forget that proteins are not static.


Bart

--

========

Bart Hazes (Associate Professor)
Dept. of Medical Microbiology&  Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax:1-780-492-7521