[Replying to the list since this may be of general interest]
On Wed, Jun 8, 2011 at 10:00 AM, James Davidson wrote:
>
> Apologies for the confusion - I will try to explain by example:
>
from rdkit import Chem
from rdkit.Chem import AllChem
rsmarts="[C;X4;!H0:1][a:2]>>[*:1](Br)[*:2]"
rxn=AllChem.ReactionFromSmarts(rsmarts)
print AllChem.ReactionToRxnBlock(rxn)
> $RXN
>
> RDKit
>
> 1 1
> $MOL
>
> RDKit
>
> 2 1 0 0 0 0 0 0 0 0999 V2000
> 0. 0. 0. * 0 0 0 0 0 0 0 0 0 1 0 0
> 0. 0. 0. * 0 0 0 0 0 0 0 0 0 2 0 0
> 1 2 1 0
> V 1 [C&X4&!H0:1]
> V 2 [a:2]
> M END
> $MOL
>
> RDKit
>
> 3 2 0 0 0 0 0 0 0 0999 V2000
> 0. 0. 0. * 0 0 0 0 0 0 0 0 0 1 0 0
> 0. 0. 0. Br 0 0 0 0 0 0 0 0 0 0 0 0
> 0. 0. 0. * 0 0 0 0 0 0 0 0 0 2 0 0
> 2 1 1 0
> 1 3 1 0
> V 1 [*:1]
> V 3 [*:2]
> M END
>
>
>
> So, we have a rxn that can be outputted as an atom-mapped reaction block
> (carrying query features). What I want is the corresponding atom-mapping,
> but applied to the reactant and product of cases where the reactant is a
> match:
>
smiles="c1c1C"
mol=Chem.MolFromSmiles(smiles)
prods = rxn.RunReactants((mol,))
prod = prods[0][0]
>
> So in this case I would like to get at the following (pseudo reaction for
> illustrative purposes):
>
> c1c{1}1C{2} >> c1c{1}1C{2}Br
>
> (ie the 'specific' reaction, but carrying the atom numbers from the template
> - {n} denoting atom-map)
>
> As I said before, I am a bit rusty on this - so I may very well be missing
> something obvious!
This isn't super straightforward, but it is do-able. I'll show it for
this specific example, if you have problems generalizing, we can
iterate here.
# First get everything setup as you did above, including getting a product:
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> rsmarts="[C;X4;!H0:1][a:2]>>[*:1](Br)[*:2]"
>>> rxn=AllChem.ReactionFromSmarts(rsmarts)
>>> smiles="c1c1C"
>>> mol=Chem.MolFromSmiles(smiles)
>>> prods = rxn.RunReactants((mol,))
>>> prod = prods[0][0]
>>> Chem.SanitizeMol(prod)
# Now get copies of the reactant and product templates from the reaction:
>>> rtmpl = rxn.GetReactantTemplate(0)
>>> ptmpl = rxn.GetProductTemplate(0)
# Find the mapping of the reactant template onto the reactant molecule:
>>> match = mol.GetSubstructMatch(rtmpl)
# copy the atom mapping information from the template to the reactant molecule:
>>> for tmplId,molId in enumerate(match):
...if rtmpl.GetAtomWithIdx(tmplId).HasProp('molAtomMapNumber'):
... at = mol.GetAtomWithIdx(molId)
...
at.SetProp('molAtomMapNumber',rtmpl.GetAtomWithIdx(tmplId).GetProp('molAtomMapNumber'))
... # this is a hack. we need a query associated with an atom to
see the mapping info in SMARTS :
... Chem.AddRecursiveQuery(mol,Chem.MolFromSmarts('*'),molId)
# repeat that process for the product:
>>> match = prod.GetSubstructMatch(ptmpl)
# copy the atom mapping information from the template to the reactant molecule:
>>> for tmplId,molId in enumerate(match):
...if ptmpl.GetAtomWithIdx(tmplId).HasProp('molAtomMapNumber'):
... at = prod.GetAtomWithIdx(molId)
...
at.SetProp('molAtomMapNumber',ptmpl.GetAtomWithIdx(tmplId).GetProp('molAtomMapNumber'))
... # this is a hack. we need a query associated with an atom to
see the mapping info in SMARTS :
... Chem.AddRecursiveQuery(prod,Chem.MolFromSmarts('*'),molId)
# test that the SMARTS for those are ok:
>>> print Chem.MolToSmarts(mol)
# output is: [#6]1:[#6]:[#6]:[#6]:[#6]:[#6&$(*):2]:1-[#6&$(*):1]
>>> print Chem.MolToSmarts(prod)
# output is: [#6&$(*):1](-[#35])-[#6&$(*):2]1:[#6]:[#6]:[#6]:[#6]:[#6]:1
# now build a new reaction:
>>> nrxn = AllChem.ChemicalReaction()
>>> nrxn.AddReactantTemplate(mol)
>>> nrxn.AddProductTemplate(prod)
# we can now get the SMARTS for the reaction:
>>> print AllChem.ReactionToSmarts(nrxn)
# output is:
[#6]1:[#6]:[#6]:[#6]:[#6]:[#6&$(*):2]:1-[#6&$(*):1]>>[#6&$(*):1](-[#35])-[#6&$(*):2]1:[#6]:[#6]:[#6]:[#6]:[#6]:1
# let's test the reaction to make sure it works.
# due to a (already reported) bug in the way atom properties are
handled, nrxn cannot be directly used,
# so we use a hack and reparse it:
>>> nrxn = AllChem.ReactionFromSmarts(AllChem.ReactionToSmarts(nrxn))
>>> nrxn.Validate()
# now we can run a molecule through to make sure it works:
>>> nmol = Chem.MolFromSmiles('c1c1C')
>>> nps = nrxn.RunReactants((nmol,))
>>> print Chem.MolToSmiles(nps[0][0])
# output is: BrCc1c1
Is that what you're looking for?
> PS I should say that I may have made a mistake with this example(?) because
> I kept getting an error when trying to get the MolBlock of the product that
> was rectified if I returned the Smiles first:
>
prod = prods[0][0]
print