Dear James,
On Thu, Jun 3, 2010 at 7:51 PM, James Davidson wrote:
>
> First of all, I'd like to start by saying how much I've been enjoying
> exploring the functionality of RDKit - great job, Greg!
Thanks!
> I have a couple of questions regarding
> 'rdkit.Chem.AllChem.ReactionFromSmarts':
>
> (1) I see that the reaction objects can be created from MDL Reaction
> Files/Blocks - is there a way to do the reverse, and save a reaction object
> in MDL .rxn format? I tried using investigating the rxn.ToBinary()
> attribute, but didn't get very far... The reason I wanted to do this, is
> that I was trying to figure-out how to generate a form of the reaction
> object (generated from reaction SMARTS) that was suitable for converting
> into a 2D depiction of the transformation.
At the moment the reactions are essentially input-only. There's really
no way to get them out in any format that could be used elsewhere.
This is a sadly missing feature: it would be really nice to be able to
generate either .rxn files (or at least reaction smarts) from
reactions. I will add a feature request for this, but it may take a
while to happen.[1]
A workaround that kind of works is to paste the reaction smarts into
something like Marvin Sketch. It will normally display something that
at least gives some idea of what the reaction is.
> (2) I know that reaction SMARTS isn't SMIRKS, but I have noticed some
> behaviour that I would not expect - however, this could be down to my
> SMARTS-naivety; my SMIRKS-naivety; or both!
Anytime reactions behave in ways you don't expect, it's probably best
to just blame me for coming up with yet another way of expressing them
that is slightly incompatible with the existing ones. :-)
> I initially tried the
> following:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
> rxn_smarts =
> '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]'
> sm = Chem.MolFromSmiles('CC(=O)NC')
> rxn = AllChem.ReactionFromSmarts(rxn_smarts)
> prods = rxn.RunReactants((sm,))
> prod = Chem.MolToSmiles(prod[0][0])
>
>
> This gives me prod = '[H]C(=O)NC'
There's a discussion of this kind of case in the "RDKit Book"
($RDBASE/Docs/Book/RDKit_Book.pdf) starting on page 3. The short
answer is that if you have a query feature (atom list, recursive
smarts, etc.) in the reactants and you would like the matching atom to
be copied into the products you should include a dummy for that atom
in the products. A working form of your example is then:
[11]>>> rxn_smarts =
'[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[*:1]-[C:3](=[O:4])-[NH:2]-[*:5]'
[12]>>> rxn = AllChem.ReactionFromSmarts(rxn_smarts)
[13]>>> prods = rxn.RunReactants((Chem.MolFromSmiles('c1c1C(=O)NCC1CC1'),))
[14]>>> Chem.MolToSmiles(prods[0][0])
Out[14] 'O=C(CC1CC1)Nc1c1'
As an aside, in SMARTS it's shorter (and I think clearer) to write
[C,c] as [#6]. It also produces a query that runs a bit quicker, but
you probably won't notice that difference in most cases.
> If I replace with < '[!H:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!H:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]'>>,
> I get the behaviour I want - with prod = 'CNC(=O)C'. So I think I can get
> the behaviour I want, but was curious if I am using the SMARTS ! operator
> incorrectly in conjunction with atomic numbers, or whether this may be a
> bug?
Not really a bug. The behavior when you have queries in the products
is undocumented: depending on the details of the query it will
sometimes do the right thing, sometimes not. It's much safer to just
use "*". What I probably should do is add a warning message if the
reaction contains a query in the products, I will think about this.
Best Regards,
-greg
[1] The underlying problem isn't actually generating the rxn files
themselves, they are just a collection of mol blocks with a bit of
extra verbiage sprinkled around. The problem is generating reasonable
mol blocks for molecules with query features. I already have a feature
request in for that one
(http://sourceforge.net/tracker/?group_id=160139&atid=814653), but it
turns out to not be quite as easy as it sounds.
--
ThinkGeek and WIRED's GeekDad team up for the Ultimate
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
lucky parental unit. See the prize list and enter to win:
http://p.sf.net/sfu/thinkgeek-promo
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss