On Mar 31, 2021, at 21:55, Ling Chan <lingtrek...@gmail.com> wrote:
> I am trying to do something that I think is quite simple, but I have not 
> figured out a simple way. Don't know if I am missing something. I am sure 
> that ultimately I can figure it out, but I wonder if there is a good way.

If you can work in SMILES space rather than molecule space, then try:

   http://dalkescientific.com/smiles_weld.py

It's derived from a technique I developed for the mmpdb package. I called it 
'welding' the SMILES strings.

What I do is convert the wildcards into ring closures, then let RDKit merge 
the matching closures into bonds when it parses the SMILES. (There are a few 
tricky parts, like support for double-bond stereochemistry.)
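
To give a feel for the idea, here is a minimal, string-only sketch. It is not the code in smiles_weld.py: the function name `wildcards_to_closures_sketch` and its `label_to_group` parameter are made up for illustration, and it only handles single-bonded wildcards with no bond symbols or stereochemistry.

```python
import re

# One SMILES atom token: a bracket atom or an organic-subset atom.
_ATOM = r"(?:\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops])"
# A labelled wildcard such as [1*] or [12*], with no other attributes.
_WILDCARD = r"\[(\d+)\*\]"


def wildcards_to_closures_sketch(frag_smi, label_to_group=None):
    """Hypothetical helper: rewrite labelled wildcards as %NN ring closures.

    Wildcards whose labels map to the same group share a closure number.
    Assumes single bonds only and fewer than 90 distinct groups.
    """
    closures = {}  # group id -> closure text such as "%99"

    def closure_for(label_text):
        label = int(label_text)
        group = label if label_to_group is None else label_to_group[label]
        if group not in closures:
            # Count down from %99 to stay clear of the low ring numbers
            # that the fragment SMILES already uses.
            closures[group] = "%{:02d}".format(99 - len(closures))
        return closures[group]

    welded = []
    for frag in frag_smi.split("."):
        # Leading wildcard: a closure has to follow a real atom, so move it
        # to just after the first atom of the fragment.
        frag = re.sub("^" + _WILDCARD + "(" + _ATOM + ")",
                      lambda m: m.group(2) + closure_for(m.group(1)), frag)
        # Wildcard alone in a branch, "([2*])": drop the parentheses.
        frag = re.sub(r"\(" + _WILDCARD + r"\)",
                      lambda m: closure_for(m.group(1)), frag)
        # Any remaining wildcard, for example at the end of a chain.
        frag = re.sub(_WILDCARD, lambda m: closure_for(m.group(1)), frag)
        welded.append(frag)
    return ".".join(welded)
```

Called as `wildcards_to_closures_sketch('[1*]c1ccncc1.[2*]C#N', {1: 1, 2: 1})`, this returns `'c%991ccncc1.C%99#N'`. The real program does more work than this, so treat the sketch only as a picture of the string rewriting.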

Here's an example, where I use a dictionary to tell the program that [1*] 
should be bonded to [2*].

>>> from rdkit import Chem
>>> smi = "N#Cc1ccncc1"
>>> mol = Chem.MolFromSmiles(smi)
>>> frag_mol = Chem.FragmentOnBonds(mol, [1])
>>> frag_smi = Chem.MolToSmiles(frag_mol)
>>> frag_smi
'[1*]c1ccncc1.[2*]C#N'
>>> import smiles_weld
>>> smiles_weld.convert_wildcards_to_closures(frag_smi, {1: 1, 2: 1})
'c%991ccncc1.C%99#N'
>>> Chem.CanonSmiles('c%991ccncc1.C%99#N')
'N#Cc1ccncc1'

If you use matching dummy labels then you can omit the conversion table:

>>> frag_mol = Chem.FragmentOnBonds(mol, [1], dummyLabels=((4,4),))
>>> frag_smi = Chem.MolToSmiles(frag_mol)
>>> frag_smi
'[4*]C#N.[4*]c1ccncc1'
>>> smiles_weld.convert_wildcards_to_closures(frag_smi)
'C%99#N.c%991ccncc1'
>>> Chem.CanonSmiles('C%99#N.c%991ccncc1')
'N#Cc1ccncc1'

Note: while the mmpdb code is well-tested, I modified it this morning to handle 
what I think you want, and I haven't fully tested the new code.

The program assumes the SMILES is a canonical SMILES generated by RDKit, and 
that the wildcard labels don't have a charge, hydrogen count, or other 
attribute.
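
If you want to check the second assumption before calling the program, something like the following might do. It is only a sketch: `find_unsupported_wildcards` is a hypothetical helper, not part of smiles_weld.py, and it treats anything other than an optional numeric label inside the wildcard's brackets as unsupported.

```python
import re

_BRACKET_ATOM = re.compile(r"\[[^\]]*\]")
# A wildcard with at most a numeric (isotope-style) label: [*], [1*], [12*]
_PLAIN_WILDCARD = re.compile(r"\[\d*\*\]")


def find_unsupported_wildcards(smiles):
    """Return bracket wildcards that carry a charge, H count, atom map, etc."""
    bad = []
    for match in _BRACKET_ATOM.finditer(smiles):
        token = match.group(0)
        if "*" in token and not _PLAIN_WILDCARD.fullmatch(token):
            bad.append(token)
    return bad
```

An empty list means every bracketed wildcard in the string has the plain `[N*]` form.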


Cheers,
                                Andrew
                                da...@dalkescientific.com




_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
