Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

Paolo Tosco Tue, 19 May 2020 09:33:18 -0700

Hi Theo,

I don't think the RDKit version should make a difference; did you noticethat rdmolops.AdjustQueryProperties() does not modify the molecule inplace, but rather returns a modified copy?


pattern_generic_bonds  =  Chem.AdjustQueryProperties(pattern,  query_params)

That might be the reason. Also, only pattern_generic_bonds will haveUNSPECIFIED bonds, the mols will still have SINGLE and DOUBLE bonds.


Feel free to contact me off-list if you need help with the above.

Cheers,
p.

On 19/05/2020 17:01, theozh wrote:

Hi Paolo,

thank you very much for your detailed answer.
I tried to reproduce your last suggestion (but I don't have Jupyter Notebook).
However, my bonds are still SINGLE and DOUBLE instead of UNSPECIFIED.
Does this maybe depend on the RDKit Version, I have 2019.03... ?

Maybe, I should update and need to investigate further.
Theo.


Am 19.05.2020 um 16:44 schrieb Paolo Tosco:

Hi Theo,

the lack of match is due to different aromaticity flags on atoms and bonds in 
the larger molecule.

This gist provides some explanation and a possible solution:

https://gist.github.com/ptosco/e410e45278b94e8f047ff224193d7788

Cheers,
p.

On 19/05/2020 14:13, theozh wrote:

Dear RDKit-users,

I would like to do a very simple substructure search.
The chapter 3.5 "Substructure Searching" in RDKit Documentation (2019.09.1) is 
pretty short and doesn't point to a solution. So far, I've learned that you can create 
your search pattern via Chem.MolFromSmiles() or Chem.MolFromSmarts().

In the below copy&paste minimal example, I want to use the first SMILES in the 
list as search pattern. I expect 2 matches but I get either 1 or 0 matches. So, I'm 
doing something wrong. What am I missing?
Is it about implicit/explicit aromatic and aliphatic bonds or some 
explicit/implicit hydrogen?
How to find the first structure in both SMILES?

thank you for any hints,
Theo.

### simple substructure search (but doesn't find what is expected)
from rdkit import Chem

smiles_strings = '''
C12=CC=CN1NCCC2
C12=CC=CC(C=C3)=C1N3NCC2
'''
smiles_list = smiles_strings.splitlines()[1:]
print(smiles_list)

pattern = Chem.MolFromSmiles(smiles_list[0])  # MolFromSmiles
matches = [x for x in smiles_list if 
Chem.MolFromSmiles(x).HasSubstructMatch(pattern)]
print(len(matches))   # result: 1, why not 2?

pattern = Chem.MolFromSmarts(smiles_list[0])  # MolFromSmarts
matches = [x for x in smiles_list if 
Chem.MolFromSmiles(x).HasSubstructMatch(pattern)]
print(len(matches))   # result: 0, why not 2?
### end of code


_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

Reply via email to