Hi Jenke,
The reason why the resulting molecule contains no atoms is that the
3,5-dichlorophenyl substituent matches twice the 3-chlorophenyl
substructure. If you look at the deleteSubstructs() code in
ChemTransforms.cpp you may see that the atoms deleted from the main
structures are the union of all possible matches (two in this case).
Therefore, both the 3- and the 5-chloro substituents are deleted, and as
all other atoms are part of the MCS, you are left with an empty molecule.
If you wish to delete matches one at a time, rather than deleting the
union of all matches, you may use the DeleteSubstructs() Python
implementation that I drafted in this gist:
https://gist.github.com/ptosco/1b6bc727ddee32b4d411cf5c2aea7291
(disclaimer: I wrote it very quickly and tested it only with your
example, so I don't guarantee it is bug-free)
This DeleteSubstructs() implementation returns a list of all structures
obtained by deleting the matched substructures one at a time. As I see
you are using 3D structures, the code deletes all orphaned hydrogens and
caps the fragments remaining after deleting substructures with
hydrogens. If you don't care about 3D coordinates you may want to
uniquify results based on canonical SMILES and, for example in the case
of lig12, keep the remaining HCl fragment only once.
Another observation is that the PDB reader implemented in the RDKit
autoconnects atoms based on proximity, but does not guess bond orders,
so your structures are not chemically correct as they are all
single-bonded even if you have carbonyls, aromatic rings, etc.
In my gist I used Flare to perceive bond orders as I was running the
code in the Flare Jupyter Notebook; you may also use other
cheminformatics toolkit for that purpose.
I hope the above helps; please feel free to get back to me off-list if
something is not clear.
Cheers,
p.
On 01/21/19 15:51, SCHEEN Jenke wrote:
Hi all,
I'm trying to remove the MCS between two molecules (attached, 02.pdb
and 12.pdb) using rdFMCS.FindMCS and AllChem.DeleteSubstructs using
the following code:
#########################################
from rdkit import Chem
from rdkit.Chem import AllChem, rdmolfiles, rdFMCS
#########################################
# load molecules:
lig02_pdb = open("02.pdb", 'r').read()
lig12_pdb = open("12.pdb", 'r').read()
lig02_mol = rdmolfiles.MolFromPDBBlock(lig02_pdb)
lig12_mol = rdmolfiles.MolFromPDBBlock(lig12_pdb)
# make list of molecules to map the MCS to:
perturbation_pair = []
perturbation_pair.append(lig02_mol)
perturbation_pair.append(lig12_mol)
MCS_object = rdFMCS.FindMCS(perturbation_pair, completeRingsOnly=True)
MCS_SMARTS = Chem.MolFromSmarts(MCS_object.smartsString)
# remove MCS from each molecule:
lig02_stripped = AllChem.DeleteSubstructs(lig02_mol, MCS_SMARTS)
lig12_stripped = AllChem.DeleteSubstructs(lig12_mol, MCS_SMARTS)
# print SMILES of each stripped molecule:
print("lig02: " + str(Chem.MolToSmiles(lig02_stripped)))
print("lig12: " + str(Chem.MolToSmiles(lig12_stripped)))
#print(Chem.MolToMolBlock(MCS_SMARTS),file=open('./MCS.mol','w+'))
#########################################
I attached the MCS.mol file as well. The lig12.pdb contains an extra
Cl atom, and lig12_stripped should thus contain a single Cl atom after
deletion of the MCS substructure (the MCS substructure is equal to
lig02.pdb). When running the script it actually contains 0 atoms.
I haven't been able to locate the source of the issue in the
AllChem.DeleteSubstructs documentation, does anyone have a suggestion?
I'm using a conda install of rdkit (2018.09.1.0).
Best,
Jenke
--------------------------------------------------------------
University of Edinburgh
David Brewster road
Edinburgh, EH9 3FJ
United Kingdom
-------------------------------------------------------------
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss