Dear Quoc-Tuan,
GetSubstructMatches() reports every position in the molecule where the
isoprene pattern can be placed, including positions that overlap.
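A minimal sketch of this behaviour, assuming RDKit is installed. The pattern is parsed with MolFromSmarts() here because '~' is SMARTS syntax; the variable names `unique`, `every` and `overlapping` are only illustrative. It suggests that uniquify=True removes matches covering the same set of atoms, but does not remove matches that merely overlap.

```python
from rdkit import Chem

# '~' (any bond) is SMARTS syntax, so the pattern is parsed as SMARTS.
pat = Chem.MolFromSmarts('C~C~C(~C)~C')
mol = Chem.MolFromSmiles('O[C@H]1C[C@H]2C([C@@]1(C)CC2)(C)C')

# uniquify=True keeps one match per distinct set of atom indices.
unique = mol.GetSubstructMatches(pat, uniquify=True)
# uniquify=False also returns matches that differ only in atom ordering.
every = mol.GetSubstructMatches(pat, uniquify=False)

# No two uniquified matches cover exactly the same atoms ...
assert len({frozenset(m) for m in unique}) == len(unique)

# ... but pairs of matches that share some atoms are all kept.
overlapping = sum(
    1
    for i, a in enumerate(unique)
    for b in unique[i + 1:]
    if set(a) & set(b)
)

print(len(unique), len(every), overlapping)
```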
You may want to test your SMARTS and its matching with structures at
this great place:
https://smartsview.zbh.uni-hamburg.de/
Maybe you would prefer to know whether borneol
follows the isoprene rule by trying to cover its structure
with two separate, non-overlapping isoprene units.
I really would like to know how to write that with SMARTS.
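This is not a pure SMARTS answer, but the covering idea can be sketched as a post-filter in Python: keep only the pairs of matches that share no atom. The helper name `disjoint_pairs` is hypothetical, and the `matches` list is copied from the GetSubstructMatches() output quoted in this thread.

```python
from itertools import combinations

# Matches copied from the GetSubstructMatches() output quoted in this thread.
matches = [
    (1, 2, 3, 4, 8), (1, 5, 4, 3, 9), (1, 5, 4, 3, 10), (1, 5, 4, 9, 10),
    (2, 1, 5, 4, 6), (2, 1, 5, 4, 7), (2, 1, 5, 6, 7), (2, 3, 4, 5, 9),
    (2, 3, 4, 5, 10), (2, 3, 4, 9, 10), (3, 4, 5, 1, 6), (3, 4, 5, 1, 7),
    (3, 4, 5, 6, 7), (5, 4, 3, 2, 8), (6, 5, 4, 3, 9), (6, 5, 4, 3, 10),
    (6, 5, 4, 9, 10), (7, 5, 4, 3, 9), (7, 5, 4, 3, 10), (7, 5, 4, 9, 10),
    (7, 8, 3, 2, 4), (8, 3, 4, 5, 9), (8, 3, 4, 5, 10), (8, 3, 4, 9, 10),
    (8, 7, 5, 1, 4), (8, 7, 5, 1, 6), (8, 7, 5, 4, 6), (9, 4, 3, 2, 8),
    (9, 4, 5, 1, 6), (9, 4, 5, 1, 7), (9, 4, 5, 6, 7), (10, 4, 3, 2, 8),
    (10, 4, 5, 1, 6), (10, 4, 5, 1, 7), (10, 4, 5, 6, 7),
]


def disjoint_pairs(matches):
    """Return all pairs of matches that have no atom index in common."""
    return [
        (a, b)
        for a, b in combinations(matches, 2)
        if not set(a) & set(b)
    ]


pairs = disjoint_pairs(matches)
for a, b in pairs:
    # Each pair is a candidate split of the skeleton into two isoprene units.
    print(a, b)
```

For the matches posted in this thread, this returns two pairs, each covering all ten carbon atoms of borneol. In a live session, `matches` could instead be taken directly from mol.GetSubstructMatches(pat).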
Jean-Marc
On 04/05/2020 at 10:10, Greenpharma S.A.S. wrote:
Dear All,
Could you please help with the following problem? (I could not find
an answer in the discussion list.)
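from rdkit import Chem

pattern='C~C~C(~C)~C'
smiles='O[C@H]1C[C@H]2C([C@@]1(C)CC2)(C)C'
# The pattern uses '~' (any bond), which is SMARTS syntax, so parse it as SMARTS.
pat = Chem.MolFromSmarts(pattern)
mol = Chem.MolFromSmiles(smiles)
res = mol.GetSubstructMatches(pat, uniquify=True)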
The results are:
((1, 2, 3, 4, 8), (1, 5, 4, 3, 9), (1, 5, 4, 3, 10), (1, 5, 4, 9, 10),
(2, 1, 5, 4, 6), (2, 1, 5, 4, 7), (2, 1, 5, 6, 7), (2, 3, 4, 5, 9),
(2, 3, 4, 5, 10), (2, 3, 4, 9, 10), (3, 4, 5, 1, 6), (3, 4, 5, 1, 7),
(3, 4, 5, 6, 7), (5, 4, 3, 2, 8), (6, 5, 4, 3, 9), (6, 5, 4, 3, 10),
(6, 5, 4, 9, 10), (7, 5, 4, 3, 9), (7, 5, 4, 3, 10), (7, 5, 4, 9, 10),
(7, 8, 3, 2, 4), (8, 3, 4, 5, 9), (8, 3, 4, 5, 10), (8, 3, 4, 9, 10),
(8, 7, 5, 1, 4), (8, 7, 5, 1, 6), (8, 7, 5, 4, 6), (9, 4, 3, 2, 8),
(9, 4, 5, 1, 6), (9, 4, 5, 1, 7), (9, 4, 5, 6, 7), (10, 4, 3, 2, 8),
(10, 4, 5, 1, 6), (10, 4, 5, 1, 7), (10, 4, 5, 6, 7))
I expect to have only 2 matches with uniquify=True as I only have 2
units of the pattern. Furthermore, with or without uniquify, I have
the same answers. I also expected that there should be 2 "independent"
lists but here, there is always at least one common atom between each
list.
Is there something I have misunderstood or misused?
Thanks in advance for your help and explanations.
Best regards,
Quoc-Tuan
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
--
Jean-Marc Nuzillard
Directeur de Recherches au CNRS
Institut de Chimie Moléculaire de Reims
CNRS UMR 7312
Moulin de la Housse
CPCBAI, Bâtiment 18
BP 1039
51687 REIMS Cedex 2
France
Tel : 03 26 91 82 10
Fax : 03 26 91 31 66
http://www.univ-reims.fr/icmr
http://eos.univ-reims.fr/LSD/CSNteam.html
http://www.univ-reims.fr/LSD/
http://www.univ-reims.fr/LSD/JmnSoft/