[Rdkit-discuss] Library enumeration in multiple positions with no cross permutation

Carsten Bauer Mon, 04 Jul 2022 14:00:20 -0700

Hello

I want to enumerate a simple molecule with 4 substituents R, using a list of 
ca. 100 SMILES.
To keep the synthesis simple, R should be the same in all four positions of 
each enumerated product (no cross permutation).
There is no reaction that covers all 100 SMILES.


I followed https://www.rdkit.org/docs/Cookbook.html#sidechain-core-enumeration 
and modified the code proposed by Earnshaw et al. accordingly:

from rdkit import Chem

core = Chem.MolFromSmiles('[*]C(C=C1)=CC=C1C(C2=CC=C([*])C=C2)C(C3=CC=C([*])C=C3)C4=CC=C([*])C=C4')
chains = ['C','CC','CCC','CCCC','CCCCC','CCCCCC']
chainMols = [Chem.MolFromSmiles(chain) for chain in chains]

product_smi = []
for chainMol in chainMols:
    product_mol = Chem.ReplaceSubstructs(core,Chem.MolFromSmarts('[#4]'),chainMol)
    product_smi.append(Chem.MolToSmiles(product_mol[0]))
print(product_smi)

which results in 
['*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1', 
'*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1', 
'*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1', 
'*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1', 
'*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1', 
'*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1']

This is the same compound six times: the unchanged core, with the [*] positions 
still open and no substituent attached.

Python beginner here. Can anybody tell me what the mistake is or where I can 
find an example in the literature, please?

Many thanks
C.
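
A possible explanation, offered as a sketch rather than a confirmed answer: the 
SMARTS '[#4]' matches atomic number 4 (beryllium), so nothing in the core 
matches and ReplaceSubstructs returns the core unchanged. The dummy atom [*] 
has atomic number 0, so '[#0]' is the pattern that targets it. In addition, 
ReplaceSubstructs replaces only one match per returned molecule unless 
replaceAll=True is passed. The version below assumes that every substituent 
attaches through the first atom of its SMILES (the default 
replacementConnectionPoint=0); the SanitizeMol call is a precaution added 
here and is not part of the original code.

```python
from rdkit import Chem

core = Chem.MolFromSmiles(
    '[*]C(C=C1)=CC=C1C(C2=CC=C([*])C=C2)C(C3=CC=C([*])C=C3)C4=CC=C([*])C=C4')
chains = ['C', 'CC', 'CCC', 'CCCC', 'CCCCC', 'CCCCCC']
chainMols = [Chem.MolFromSmiles(chain) for chain in chains]

# '[#0]' matches the dummy atoms ([*]); '[#4]' would look for beryllium.
dummy = Chem.MolFromSmarts('[#0]')

product_smi = []
for chainMol in chainMols:
    # replaceAll=True puts the same substituent on all four positions and
    # returns a tuple holding a single molecule.
    product_mol = Chem.ReplaceSubstructs(core, dummy, chainMol, replaceAll=True)[0]
    # Precaution (an addition here): recompute valences and aromaticity.
    Chem.SanitizeMol(product_mol)
    product_smi.append(Chem.MolToSmiles(product_mol))
print(product_smi)
```

For the full list of ca. 100 SMILES, chains would be replaced by that list. 
Each SMILES would need to be written so that its first atom is the one bonded 
to the core, and any entry for which MolFromSmiles returns None would need to 
be filtered out first.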

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
