Hi Marcin,

Thank you very much for your detailed reply! You pointed out many special cases that I wouldn't have thought of. Indeed, an existing solution is highly preferred, and I appreciate you pointing out several available libraries. I came across Gemmi and I will give it another look, and I will examine the additional resources you mentioned as well.

Best regards,
Orly Avraham

On Sun, Jan 12, 2020 at 5:59 PM Marcin Wojdyr <woj...@gmail.com> wrote:

> Hi Zhijie,
>
> it's good and instructive to implement such things from the ground up,
> but there are many special cases that one would discover while
> testing this procedure, so if time is limited it may be better to
> use an existing solution.
>
> For instance, here one may find out that using the SCALE1 record
> doesn't give sufficient accuracy. In the example in your script
> you have 6 significant digits in the unit cell lengths in CRYST1, but
> only 4 significant digits in SCALE1. (The accuracy of SCALE1 is
> problematic in general; sometimes it needs to be manually removed when
> a program reads it in preference to CRYST1.)
>
> Then one may find out that the 3x3x3 supercell is not sufficient. If
> the molecule is far from the origin, symmetry operations send it far
> away. For example, 5M3H annotates (in the mmCIF format) a hydrogen bond
> between 1_555 and 2_11516 - the symmetry mate is shifted 16 unit cells
> in the z direction. Since you already use fractional coordinates in
> your script, you could tell directly from the center-of-mass
> coordinates how many unit cells it should be shifted. Say you have
> x=3.1: to shift it near the origin you shift it by 3 unit cells
> along x.
>
> But even if all the molecules are shifted near the origin, the 3x3x3
> cell is still not sufficient to find contacts.
> See 3NWH – a homo-4-mer in P2 (4 x 2 chains per unit cell). Here it is
> in its unit cell, colored by the chain id:
> https://gemmi.readthedocs.io/en/latest/_images/3nwh.png
> Or 5XG2 – a monomer in P21. Two copies of the chain are rainbow-colored
> here:
> https://gemmi.readthedocs.io/en/latest/_images/5xg2.png
> These chains span more than 4 unit cells in one direction. One
> could use a big enough supercell, but it'd be slow. I suppose that even
> using a 3x3x3 supercell is slow.
> The alternative is to do the distance
> calculation in fractional coordinates modulo 1.
>
> Then you need to consider atoms on special positions. If you apply
> symmetry operations to an atom on a 4-fold symmetry axis you get 4
> atoms in the same place. So this needs to be handled. The atom may not
> be exactly on the axis, because the refinement program may not
> constrain its position. So the symmetry operations should produce, I
> think, 4 alternative locations of the same atom. But you could also
> have an atom near the symmetry axis bonded to its symmetry mate - then
> the symmetry operations should produce different atoms. So the
> procedure requires a cut-off distance or a heuristic to distinguish
> the two cases.
>
> Then, if you'd like to expand non-crystallographic symmetry from the
> MTRIX records - this is another complication. And so on...
>
> So I'd recommend using one of the many available programs for finding
> contacts or interactions. If none of them is suitable - then try
> crystallographic libraries.
> I haven't yet documented how to find contacts using gemmi, but I'll
> do it in the coming weeks (or months). Cctbx and clipper are other
> (more mature) libraries worth checking.
>
> Best wishes,
> Marcin
>
>
> On Sat, 11 Jan 2020 at 02:11, Zhijie Li <zhijie...@utoronto.ca> wrote:
> >
> > Hi Orly,
> >
> > REMARK 290 should be the easiest way of generating symmetry mates.
> > Other routes are just going to give you the same results. As Jonathan
> > already pointed out, the symm ops do not guarantee that the symm copies
> > are close to each other. The most simple-minded solution to this problem
> > would be simply generating 3x3x3 unit cells so that the unit cell in the
> > center will be complete. An upgrade to this is to compute the center of
> > mass of the symmetry copies in each of the 3x3x3 cells and find which
> > one is closest to the original 1555 copy. Just for fun, I wrote a little
> > Python script that does this (attached).
> > In this script, for unit cell translation and
> > calculating center-center distances, I converted the Cartesian
> > coordinates to fractional coords first. Then, after the translation, I
> > used the inverse of the SCALE1 matrix to get the shifted Cartesian
> > coords. This way I don't need to read Wikipedia on geometry. But as
> > noted in the script, the distances should preferably be calculated in
> > Cartesian coordinates.
> >
> > Zhijie
> >
> > ________________________________
> > From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of orly
> > avraham <orly.levin...@mail.huji.ac.il>
> > Sent: Friday, January 10, 2020 3:30 PM
> > To: CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
> > Subject: [ccp4bb] Generating symmetry mates using python
> >
> > Hi all,
> >
> > I am a crystallographer currently employing computational methods as
> > well as experimental crystallography.
> > I am trying to generate symmetry mates in Python (working with pandas
> > dataframes), in order to analyze inter-subunit interactions. To do so I
> > am trying to use the info in "REMARK 290 CRYSTALLOGRAPHIC SYMMETRY" and
> > manually (using numpy) perform a matrix multiplication with the relevant
> > translation (xyz*rotation + translation).
> > For some reason this doesn't work consistently and I feel I need to use
> > the info in CRYST1 to obtain the unit cell and multiplication matrix.
> > Here I ran into trouble with extracting the correct symmetry operations
> > for each space group. I found spglib but it doesn't quite solve the
> > problem.
> > I also tried opening PyMol from the command line and generating symmetry
> > mates this way. It worked on a few files but failed quite quickly
> > (segmentation fault) and was also very slow.
> > Can anyone suggest a useful solution, preferably clear to use and/or
> > well documented? Or even share a Python script/code for this?
> >
> > Best regards,
> > Orly
> >
> > --
> >
> > Orly Avraham, Ph.D.
> > Postdoctoral fellow
> > The lab of Prof. Oded Livnah
> > and the lab of Prof. Ora Schueler-Furman
> > The Hebrew University of Jerusalem
> > Israel
> >
> > ________________________________
> >
> > To unsubscribe from the CCP4BB list, click the following link:
> > https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB&A=1
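
For readers who still want to experiment before switching to a library, here is a minimal sketch of the fractional-coordinate ideas raised in the thread: building the fractional-to-Cartesian matrix from the CRYST1 cell parameters (rather than from the lower-precision SCALE1 record), applying a symmetry operation as rotation plus translation, shifting a copy back near the origin by whole unit cells, and measuring a distance "modulo 1". It is not the attached script and not the API of gemmi, cctbx or clipper; all function names are hypothetical. It assumes the usual PDB orthogonalization convention (a along x, b in the xy plane) and that the symmetry operation is already expressed in fractional coordinates. It does not handle special positions or MTRIX records.

```python
import math
from itertools import product


def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Fractional -> Cartesian matrix from CRYST1 cell parameters.

    Assumes the standard PDB setting: a along x, b in the xy plane.
    Angles are in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Cell volume divided by a*b*c.
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, c * v / sg],
    ]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def apply_symop(rot, trans, frac):
    """Apply x' = R x + t to one point, all in fractional coordinates."""
    r = mat_vec(rot, frac)
    return [r[i] + trans[i] for i in range(3)]


def shift_to_origin_cell(center_frac):
    """Whole-cell shift that brings a center of mass into [0, 1)."""
    return [-math.floor(x) for x in center_frac]


def min_image_distance(orth, f1, f2):
    """Shortest Cartesian distance between f1 and any lattice copy of f2.

    The fractional difference is first wrapped to about [-0.5, 0.5];
    the 27 neighbouring shifts are then scanned because, in a skewed
    cell, the wrapped vector is not always the shortest one.
    """
    d = [f2[i] - f1[i] for i in range(3)]
    d = [x - round(x) for x in d]
    best = math.inf
    for s in product((-1, 0, 1), repeat=3):
        cart = mat_vec(orth, [d[i] + s[i] for i in range(3)])
        best = min(best, math.sqrt(sum(x * x for x in cart)))
    return best


if __name__ == "__main__":
    # Made-up orthorhombic cell, for illustration only.
    orth = orthogonalization_matrix(10.0, 20.0, 30.0, 90.0, 90.0, 90.0)
    # Second operator of P21 in fractional form: (-x, y+1/2, -z).
    mate = apply_symop([[-1, 0, 0], [0, 1, 0], [0, 0, -1]],
                       [0.0, 0.5, 0.0], [0.1, 0.2, 0.3])
    print(mate, shift_to_origin_cell(mate))
    print(min_image_distance(orth, [0.05, 0.5, 0.5], [3.95, 0.5, 0.5]))
```

Scanning every atom pair this way is still quadratic in the number of atoms, so for real structures a cell list or one of the libraries recommended above is likely the better choice.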