Dear RDKit Community,
I am developing a drug discovery program that needs to align similar molecules
onto one another. I have been using RDKit for this so far, but I have had
trouble getting the alignment to work consistently across different molecules.
Ideally, we want to identify the portions of the molecules that are similar to
one another and then align the molecules so that those portions overlap. The
method also needs to allow sampling of the possible permutations of that
alignment wherever the molecules have symmetries that make several mappings
equivalent. I'm wondering whether this can be done with one or two functions
in RDKit.
Below are three examples of molecules that we want to be able to align, along
with how each alignment should work.
1. We want benzene and phenol to map ring carbon onto ring carbon, with the OH
group equally likely to land on any of the six benzene carbons. That is, each
benzene carbon position should have a 1/6 probability of carrying the OH
group.
  c--c                c--c      H
 / -- \              / -- \    /
c |  | c   ---->    c |  | c--O
 \ -- /              \ -- /
  c--c                c--c
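A minimal sketch of the behaviour I have in mind for this case, assuming plain substructure matching with `uniquify=False` is an acceptable way to enumerate the symmetry-equivalent mappings (the variable names are only illustrative):

```python
from rdkit import Chem

benzene = Chem.MolFromSmiles("c1ccccc1")
phenol = Chem.MolFromSmiles("Oc1ccccc1")

# uniquify=False keeps every symmetry-equivalent mapping instead of
# collapsing them into one match per set of atoms.
matches = phenol.GetSubstructMatches(benzene, uniquify=False)

# Index of the phenol ring carbon that carries the OH group.
oxygen = next(a for a in phenol.GetAtoms() if a.GetSymbol() == "O")
ipso = oxygen.GetNeighbors()[0].GetIdx()

# match[i] is the phenol atom mapped to benzene atom i, so match.index(ipso)
# is the benzene atom that ends up under the OH-bearing carbon.
counts = {}
for match in matches:
    benzene_idx = match.index(ipso)
    counts[benzene_idx] = counts.get(benzene_idx, 0) + 1

print(len(matches), counts)
```

Each benzene carbon should show up equally often in `counts`, so drawing one mapping uniformly at random from `matches` would put the OH group on any given benzene position with probability 1/6.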
2. We want benzene and naphthalene to map onto one another such that the
benzene ring overlaps with either one of the naphthalene rings, and vice versa.
  c--c                c--c
 / -- \              / -- \
c |  | c   ---->    c |  | c--c
 \ -- /              \ -- / -- \
  c--c                c--c |  | c
                          \ -- /
                           c--c
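To make the expected result for this case concrete, here is a small sketch, again assuming substructure matching is a reasonable stand-in for the mapping step. It only lists the candidate atom mappings and does not perform any 3D alignment:

```python
from rdkit import Chem

benzene = Chem.MolFromSmiles("c1ccccc1")
naphthalene = Chem.MolFromSmiles("c1ccc2ccccc2c1")

# Default behaviour: one match per distinct set of naphthalene atoms,
# i.e. one per ring.
unique_matches = naphthalene.GetSubstructMatches(benzene)

# uniquify=False: every rotation and reflection of benzene on each ring.
all_matches = naphthalene.GetSubstructMatches(benzene, uniquify=False)

# Group the permutations by which naphthalene atoms they cover.
rings = {frozenset(m) for m in all_matches}

print(len(unique_matches), len(all_matches), len(rings))
```

Every mapping found this way places all six benzene carbons on a single naphthalene ring, which is the behaviour we are after.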
3. We want ring decorations (substituents) to map onto each other when
possible.
  c--c                    c--c      H
 / -- \                  / -- \    /
c |  | c--CH3   ---->   c |  | c--O
 \ -- /                  \ -- /
  c--c                    c--c
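One way this case could be sketched is with a maximum common substructure search that ignores element type, so that the methyl carbon is allowed to pair with the hydroxyl oxygen. This is an assumption about a possible approach, not something I have working in my program, and the variable names are only illustrative:

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS

toluene = Chem.MolFromSmiles("Cc1ccccc1")  # atom 0 is the methyl carbon
phenol = Chem.MolFromSmiles("Oc1ccccc1")   # atom 0 is the hydroxyl oxygen

# CompareAny lets atoms of different elements match, so the substituent
# can be part of the common substructure.
result = rdFMCS.FindMCS(
    [toluene, phenol],
    atomCompare=rdFMCS.AtomCompare.CompareAny,
)
pattern = Chem.MolFromSmarts(result.smartsString)

# Position i in both tuples refers to the same pattern atom, so zipping
# them gives (phenol atom, toluene atom) pairs.
toluene_match = toluene.GetSubstructMatch(pattern)
phenol_match = phenol.GetSubstructMatch(pattern)
atom_map = list(zip(phenol_match, toluene_match))

print(result.numAtoms, atom_map)
```

Once both molecules have 3D conformers, a list of pairs like `atom_map` could be passed as the `atomMap` argument of `rdMolAlign.AlignMol`, with phenol as the probe and toluene as the reference.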
I should note that I have tried the GetO3A function in RDKit to align the
molecules, and it has given some really bizarre mappings for naphthalene onto
benzene. Some examples of the atoms that were mapped to one another are shown
below. Mapped atoms are marked with an '@' sign.
  @--@                c--@
 / -- \              / -- \
@ |  | @   ---->    c |  | @--@
 \ -- /              \ -- / -- \
  @--@                @--@ |  | c
                          \ -- /
                           @--c

  @--c                @--@
 / -- \              / -- \
@ |  | @   ---->    c |  | @--c
 \ -- /              \ -- / -- \
  @--@                @--@ |  | c
                          \ -- /
                           c--c

  @--c                @--@
 / -- \              / -- \
@ |  | @   ---->    @ |  | c--c
 \ -- /              \ -- / -- \
  @--@                @--@ |  | c
                          \ -- /
                           @--c
I would really appreciate any suggestions on how to handle these alignments.
Best Regards,
Lara
_______________________________________________
Rdkit-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss