Here's a simple example that enumerates a 3-component library based on a reaction:
https://gist.github.com/PatWalters/7439099598b4f08a331a81b209f88baa
On Wed, Jul 6, 2022 at 4:57 PM Andrew Dalke <da...@dalkescientific.com> wrote:

> Hi Carsten,
>
> How are the fragments expressed? With attachment points marked with
> "[*:1]", "[*:2]" and "[*:3]" atoms?
>
> One technique is to rewrite the SMILES to use closures. (See
> https://onlinelibrary.wiley.com/doi/10.1002/qsar.200310008 or
> http://www.dalkescientific.com/writings/diary/archive/2005/05/07/attachment_points.html
> ).
>
> For example, if your core SMILES are:
>
>   [*:1]c1ncc([*:2])cn1
>   CC([*:2])O[*:1]
>
> and your R1 contains
>
>   *F
>   Cl*
>   Br*
>
> and your R2 contains
>
>   *CCO
>   CO*
>
> then you could rewrite these to use "%91" to connect the [*:1] with the R1
> "*" and use "%92" to connect the [*:2] with the R2 "*", using
> dot-disconnected terms.
>
> For example:
>
>   [*:1]c1ncc([*:2])cn1 + *F + *CCO
>
> can be rewritten as
>
>   c%911ncc%92cn1.F%91.C%92CO
>
> which is parsed and canonicalized to:
>
>   OCCc1cnc(F)nc1
>
> Rewriting the SMILES this way is a bit tricky. I've attached a program
> which does it for you.
>
> Running it on the above gives:
>
> % cat core.smi
> [*:1]c1ncc([*:2])cn1
> CC([*:2])O[*:1]
>
> % cat r1.smi
> *F
> Cl*
> Br*
>
> % cat r2.smi
> *CCO
> CO*
>
> % python enumerate.py --R1 r1.smi --R2 r2.smi core.smi
> c1%91ncc%92cn1.F%91.C%92CO -> OCCc1cnc(F)nc1
> c1%91ncc%92cn1.F%91.CO%92 -> COc1cnc(F)nc1
> c1%91ncc%92cn1.Cl%91.C%92CO -> OCCc1cnc(Cl)nc1
> c1%91ncc%92cn1.Cl%91.CO%92 -> COc1cnc(Cl)nc1
> c1%91ncc%92cn1.Br%91.C%92CO -> OCCc1cnc(Br)nc1
> c1%91ncc%92cn1.Br%91.CO%92 -> COc1cnc(Br)nc1
> CC(O%91)%92.F%91.C%92CO -> CC(CCO)OF
> CC(O%91)%92.F%91.CO%92 -> COC(C)OF
> CC(O%91)%92.Cl%91.C%92CO -> CC(CCO)OCl
> CC(O%91)%92.Cl%91.CO%92 -> COC(C)OCl
> CC(O%91)%92.Br%91.C%92CO -> CC(CCO)OBr
> CC(O%91)%92.Br%91.CO%92 -> COC(C)OBr
>
> It also supports --R3 if your core has 3 R-groups, with the third core
> point labeled [*:3].
>
> Best regards
>
> Andrew
> da...@dalkescientific.com
>
> > On Jul 6, 2022, at 21:00, Carsten Bauer <carsten.ba...@bluewin.ch> wrote:
> >
> > Hello
> >
> > I have a structure with three substituents R1, R2 and R3
> > R1 is an enumeration of 30+ SMILES
> > R2 and R3 each is an enumeration of <5 SMILES
> > Chemical space = 30 x 5 x 5 = 750+ in-silico compounds
> >
> > Can anyone share (i.e publish in a citable form) an RDKit code for this
> > permutation?
> > Is there a textbook example illustrating this daily question from the
> > lab in an example, please?
> >
> > I can’t follow
> > https://www.rdkit.org/docs/cppapi/EnumerationStrategyBase_8h_source.html
> >
> > Sorry.
> >
> > Many thanks for getting back.
> > Kindest regards
> > C.
> >
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
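
For readers who want to see the closure-rewriting idea from the quoted message in code, here is a minimal sketch in plain Python. It is not the program Andrew attached; the names `ATOM`, `to_closure` and `enumerate_library` are made up for this illustration. It assumes each attachment point is a single-bonded wildcard that appears once per SMILES, either at the start, at the end of a chain, or alone in a branch. It only builds the dot-disconnected strings; to get canonical products you would still parse them with a toolkit, for example RDKit's `Chem.MolFromSmiles` followed by `Chem.MolToSmiles`.

```python
import re
from itertools import product

# Hypothetical helper pattern: one SMILES atom token (bracket atom,
# two-letter halogen, or a single organic-subset symbol).
ATOM = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")


def to_closure(smiles, wildcard, closure):
    """Replace one attachment-point wildcard with a ring-closure label.

    Assumption: the wildcard occurs exactly once and is single-bonded.
    """
    if smiles.count(wildcard) != 1:
        raise ValueError(f"expected exactly one {wildcard!r} in {smiles!r}")
    if smiles.startswith(wildcard):
        # Leading wildcard: drop it and hang the closure on the next atom.
        rest = smiles[len(wildcard):]
        match = ATOM.match(rest)
        if match is None:
            raise ValueError(f"unsupported attachment point in {smiles!r}")
        return rest[:match.end()] + closure + rest[match.end():]
    branch = "(" + wildcard + ")"
    if branch in smiles:
        # Wildcard alone in a branch: the closure goes on the parent atom.
        return smiles.replace(branch, closure)
    # Otherwise the wildcard follows its neighbour directly.
    return smiles.replace(wildcard, closure)


def enumerate_library(cores, *rgroups):
    """Yield one dot-disconnected SMILES per core and R-group combination."""
    for core in cores:
        for n in range(1, len(rgroups) + 1):
            core = to_closure(core, f"[*:{n}]", f"%9{n}")
        for combo in product(*rgroups):
            parts = [core]
            for n, rgroup in enumerate(combo, start=1):
                parts.append(to_closure(rgroup, "*", f"%9{n}"))
            yield ".".join(parts)


# Inputs copied from the quoted message.
cores = ["[*:1]c1ncc([*:2])cn1", "CC([*:2])O[*:1]"]
r1 = ["*F", "Cl*", "Br*"]
r2 = ["*CCO", "CO*"]

library = list(enumerate_library(cores, r1, r2))
for smiles in library:
    print(smiles)
# The first line printed is c%911ncc%92cn1.F%91.C%92CO
```

With the two cores, three R1 and two R2 fragments this gives 2 x 3 x 2 = 12 strings, and a third list passed to `enumerate_library` would be matched against `[*:3]` in the same way. The closure is written directly after the atom symbol, so the strings can differ textually from the `c1%91...` form in the quoted output while describing the same connectivity. Double bonds or other explicit bonds at a leading attachment point are rejected rather than guessed at.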