Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Hi James,

On Mon, Jun 21, 2010 at 5:54 PM, James Davidson wrote:
> Thanks Greg - this is great! I must confess, I was eager to try this out
> asap - but have not built rdkit before. I did start having a go over the
> weekend on my home PC (Windows MCE2005) but ran into a couple of unexpected
> issues with the software installs that made me think I would wait and retry
> on my work PC.
>
> [not really relevant, but for interest - I think the problems may have been
> related to the Visual Studio 2010 Express installation. The result was an
> infuriating clicking in the audio when streaming live or recorded TV to an
> extender!! Not an issue that I felt was easy to troubleshoot... I
> reinstalled my system from a drive image backup and the problem was gone...
> That's when I decided to leave well alone, as my family may not have seen the
> benefit of up-to-the-minute builds at home at the expense of TV enjoyment : )
> ]

:-) just an FYI: I haven't built the RDKit with VS2010 yet (my windows box isn't up-to-date enough for it, I don't think), so there may be some teething problems there even with a working install.

> I will get my PC at work setup to build from SVN snapshots - but I was very
> pleased to see your post
> (http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01097.html)
> saying that Q2 binaries should be available next week - great news!

Let me know if you run into problems with the windows build on the work machine.

-greg

--
ThinkGeek and WIRED's GeekDad team up for the Ultimate GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the lucky parental unit. See the prize list and enter to win: http://p.sf.net/sfu/thinkgeek-promo
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Thanks Greg - this is great! I must confess, I was eager to try this out asap - but have not built rdkit before. I did start having a go over the weekend on my home PC (Windows MCE2005) but ran into a couple of unexpected issues with the software installs that made me think I would wait and retry on my work PC.

[not really relevant, but for interest - I think the problems may have been related to the Visual Studio 2010 Express installation. The result was an infuriating clicking in the audio when streaming live or recorded TV to an extender!! Not an issue that I felt was easy to troubleshoot... I reinstalled my system from a drive image backup and the problem was gone... That's when I decided to leave well alone, as my family may not have seen the benefit of up-to-the-minute builds at home at the expense of TV enjoyment : ) ]

I will get my PC at work set up to build from SVN snapshots - but I was very pleased to see your post (http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01097.html) saying that Q2 binaries should be available next week - great news!

Kind regards
James

-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 18 June 2010 06:08
To: rdkit-discuss@lists.sourceforge.net
Cc: James Davidson
Subject: Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts

[...]
Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Dear all,

A followup/update on a request from a couple weeks ago:

On Fri, Jun 4, 2010 at 6:13 AM, Greg Landrum wrote:
> On Thu, Jun 3, 2010 at 7:51 PM, James Davidson wrote:
>>
>> (1) I see that the reaction objects can be created from MDL Reaction
>> Files/Blocks - is there a way to do the reverse, and save a reaction object
>> in MDL .rxn format? I tried investigating the rxn.ToBinary()
>> attribute, but didn't get very far... The reason I wanted to do this, is
>> that I was trying to figure-out how to generate a form of the reaction
>> object (generated from reaction SMARTS) that was suitable for converting
>> into a 2D depiction of the transformation.
>
> At the moment the reactions are essentially input-only. There's really
> no way to get them out in any format that could be used elsewhere.
> This is a sadly missing feature: it would be really nice to be able to
> generate either .rxn files (or at least reaction smarts) from
> reactions. I will add a feature request for this, but it may take a
> while to happen.[1]

I've added a partial solution to this that at least provides some help with visualizing reactions.

Here's my reaction:

[12]>>> rxn = AllChem.ReactionFromSmarts('[C:1](=[O:2])-[O;-,H].[N;!$(N-C=[O,N,S]);!$(N=*):3]>>[C:1](=[O:2])-[N:3]')

You can now output reaction smarts:

[13]>>> AllChem.ReactionToSmarts(rxn)
Out[13] '[C:1](=[O:2])-[O;-,H1].[N;!$(N-C=[O,N,S]);!$(N=*):3]>>[C:1](=[O:2])-[N:3]'

You can also generate coordinates for a reaction and then create an rxn file:

[14]>>> AllChem.Compute2DCoordsForReaction(rxn)
[15]>>> print AllChem.ReactionToRxnBlock(rxn)
--> print(AllChem.ReactionToRxnBlock(rxn))
$RXN
RDKit
2 1
$MOL
RDKit 2D
3 2 0 0 0 0 0 0 0 0999 V2000
-0.0.0. C 0 0 0 0 0 0 0 0 0 1 0 0
-0. -1.50000. O 0 0 0 0 0 0 0 0 0 2 0 0
-0.1.50000. * 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0
1 3 1 0
V3 [O;-,H1]
M END
$MOL
RDKit 2D
1 0 0 0 0 0 0 0 0 0999 V2000
0.50000.0. * 0 0 0 0 0 0 0 0 0 3 0 0
V1 [N;!$(N-C=[O,N,S]);!$(N=*):3]
M END
$MOL
RDKit 2D
3 2 0 0 0 0 0 0 0 0999 V2000
1.50000.0. C 0 0 0 0 0 0 0 0 0 1 0 0
1.5000 -1.50000. O 0 0 0 0 0 0 0 0 0 2 0 0
1.50001.50000. N 0 0 0 0 0 0 0 0 0 3 0 0
1 2 2 0
1 3 1 0
M END

Notice that query features on atoms in the rxn blocks are not output as proper ctab query features. Instead I use the atom-value feature of ctabs and output the SMARTS query for the atoms. This has the marked disadvantage that the resulting rxn files won't do anything sensible in other tools, but at least you can do some debugging of reactions. At some point in the future it would be nice to have ctab queries handled correctly, but this is at least something.

These changes are checked into subversion and will be in the next release.

Best Regards,
-greg
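Because the atom queries are written into the atom-value ("V") lines of each mol block, one quick way to check a block while debugging is to pull those lines back out. What follows is only a minimal sketch in plain Python, not part of the RDKit API: the name `summarize_rxn_block` and the `SAMPLE` text are illustrative assumptions, the coordinates in `SAMPLE` are placeholders rather than real output, and the parsing assumes the usual V2000 layout (a counts line directly before the first `$MOL`, and `V`, atom number, value on each atom-value line).

```python
# SAMPLE is a hand-written stand-in for ReactionToRxnBlock output;
# the zero coordinates are placeholders, not values produced by RDKit.
SAMPLE = """$RXN

      RDKit

  2  1
$MOL

     RDKit          2D

  3  2  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  1  0  0
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  2  0  0
    0.0000    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0
  1  3  1  0
V    3 [O;-,H1]
M  END
$MOL

     RDKit          2D

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  3  0  0
V    1 [N;!$(N-C=[O,N,S]);!$(N=*):3]
M  END
$MOL

     RDKit          2D

  3  2  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  1  0  0
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  2  0  0
    0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  3  0  0
  1  2  2  0
  1  3  1  0
M  END
"""


def summarize_rxn_block(rxn_block):
    """Return (n_reactants, n_products, atom_values) for a V2000 rxn block.

    atom_values holds one dict per $MOL section, mapping the 1-based atom
    number to the text stored on that atom's "V" (atom value) line.
    """
    lines = rxn_block.splitlines()
    if not lines or lines[0].strip() != "$RXN":
        raise ValueError("not a V2000 rxn block: missing $RXN header")
    mol_starts = [i for i, line in enumerate(lines) if line.startswith("$MOL")]
    if not mol_starts:
        raise ValueError("no $MOL sections found")
    # The counts line (two 3-character fields) sits just before the first $MOL.
    counts = lines[mol_starts[0] - 1]
    n_reactants, n_products = int(counts[0:3]), int(counts[3:6])
    atom_values = []
    for start, end in zip(mol_starts, mol_starts[1:] + [len(lines)]):
        values = {}
        for line in lines[start + 1:end]:
            if line.startswith("V "):
                _, atom_number, value = line.split(None, 2)
                values[int(atom_number)] = value
        atom_values.append(values)
    return n_reactants, n_products, atom_values


n_reactants, n_products, atom_values = summarize_rxn_block(SAMPLE)
print(n_reactants, n_products)
for index, values in enumerate(atom_values, start=1):
    print("mol", index, values)
```

With real output, the text passed in would be the string returned by `AllChem.ReactionToRxnBlock(rxn)`; each mol block that carries a query then shows its SMARTS next to the atom number it belongs to.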
Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Thanks for the help, Greg - my reaction SMARTS are now behaving themselves!

I must confess, I had not actually realised that the documentation from install (ie the 'Book') was different to the 'Getting Started' one that I had linked from the website.

Kind regards,
James

-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: Fri 04/06/2010 05:13
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts

[...]
Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
On Fri, Jun 4, 2010 at 9:31 AM, James Davidson wrote:
> Thanks for the help, Greg - my reaction SMARTS are now behaving themselves!

Excellent. I'm glad to hear it.

> I must confess, I had not actually realised that the documentation from
> install (ie the 'Book') was different to the 'Getting Started' one that I
> had linked from the website.

Yeah, the "Book" is more of a theory manual while the "getting started" thing is a tutorial of sorts. Both are evolving (slowly) as I have time to go back and add/correct things.

-greg
Re: [Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Dear James,

On Thu, Jun 3, 2010 at 7:51 PM, James Davidson wrote:
>
> First of all, I'd like to start by saying how much I've been enjoying
> exploring the functionality of RDKit - great job, Greg!

Thanks!

> I have a couple of questions regarding
> 'rdkit.Chem.AllChem.ReactionFromSmarts':
>
> (1) I see that the reaction objects can be created from MDL Reaction
> Files/Blocks - is there a way to do the reverse, and save a reaction object
> in MDL .rxn format? I tried investigating the rxn.ToBinary()
> attribute, but didn't get very far... The reason I wanted to do this, is
> that I was trying to figure-out how to generate a form of the reaction
> object (generated from reaction SMARTS) that was suitable for converting
> into a 2D depiction of the transformation.

At the moment the reactions are essentially input-only. There's really no way to get them out in any format that could be used elsewhere. This is a sadly missing feature: it would be really nice to be able to generate either .rxn files (or at least reaction smarts) from reactions. I will add a feature request for this, but it may take a while to happen.[1]

A workaround that kind of works is to paste the reaction smarts into something like Marvin Sketch. It will normally display something that at least gives some idea of what the reaction is.

> (2) I know that reaction SMARTS isn't SMIRKS, but I have noticed some
> behaviour that I would not expect - however, this could be down to my
> SMARTS-naivety; my SMIRKS-naivety; or both!

Anytime reactions behave in ways you don't expect, it's probably best to just blame me for coming up with yet another way of expressing them that is slightly incompatible with the existing ones. :-)

> I initially tried the
> following:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
> rxn_smarts = '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]'
> sm = Chem.MolFromSmiles('CC(=O)NC')
> rxn = AllChem.ReactionFromSmarts(rxn_smarts)
> prods = rxn.RunReactants((sm,))
> prod = Chem.MolToSmiles(prods[0][0])
>
> This gives me prod = '[H]C(=O)NC'

There's a discussion of this kind of case in the "RDKit Book" ($RDBASE/Docs/Book/RDKit_Book.pdf) starting on page 3. The short answer is that if you have a query feature (atom list, recursive smarts, etc.) in the reactants and you would like the matching atom to be copied into the products, you should include a dummy for that atom in the products. A working form of your example is then:

[11]>>> rxn_smarts = '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[*:1]-[C:3](=[O:4])-[NH:2]-[*:5]'
[12]>>> rxn = AllChem.ReactionFromSmarts(rxn_smarts)
[13]>>> prods = rxn.RunReactants((Chem.MolFromSmiles('c1ccccc1C(=O)NCC1CC1'),))
[14]>>> Chem.MolToSmiles(prods[0][0])
Out[14] 'O=C(CC1CC1)Nc1ccccc1'

As an aside, in SMARTS it's shorter (and I think clearer) to write [C,c] as [#6]. It also produces a query that runs a bit quicker, but you probably won't notice that difference in most cases.

> If I replace the reaction SMARTS with
> '[!H:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!H:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]',
> I get the behaviour I want - with prod = 'CNC(=O)C'. So I think I can get
> the behaviour I want, but was curious if I am using the SMARTS ! operator
> incorrectly in conjunction with atomic numbers, or whether this may be a
> bug?

Not really a bug. The behavior when you have queries in the products is undocumented: depending on the details of the query it will sometimes do the right thing, sometimes not. It's much safer to just use "*". What I probably should do is add a warning message if the reaction contains a query in the products; I will think about this.

Best Regards,
-greg

[1] The underlying problem isn't actually generating the rxn files themselves; they are just a collection of mol blocks with a bit of extra verbiage sprinkled around. The problem is generating reasonable mol blocks for molecules with query features. I already have a feature request in for that one (http://sourceforge.net/tracker/?group_id=160139&atid=814653), but it turns out to not be quite as easy as it sounds.
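The warning mentioned above (flagging reactions whose product templates contain queries) can be roughly approximated outside the toolkit while debugging. This is only a sketch under stated assumptions: `top_level_bracket_atoms` and `product_atoms_with_queries` are hypothetical helper names, not RDKit functions, the check is a string-level heuristic rather than a SMARTS parser, and it only handles the plain `reactants>>products` form (no agents).

```python
# Characters that usually signal a query inside a bracket atom:
# "," (or), ";" and "&" (and), "!" (not), "$" (recursive SMARTS).
# Assumption: a string-level check is good enough for a debugging hint.
QUERY_CHARS = set(",;&!$")


def top_level_bracket_atoms(smarts):
    """Return the outermost [...] atoms, keeping nested brackets intact."""
    atoms, depth, start = [], 0, 0
    for i, ch in enumerate(smarts):
        if ch == "[":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
            if depth == 0:
                atoms.append(smarts[start:i + 1])
    return atoms


def product_atoms_with_queries(reaction_smarts):
    """List product-side bracket atoms that look like queries (heuristic)."""
    _, sep, products = reaction_smarts.partition(">>")
    if not sep:
        raise ValueError("expected a reaction SMARTS containing '>>'")
    return [atom for atom in top_level_bracket_atoms(products)
            if QUERY_CHARS & set(atom)]


# The two reaction SMARTS from the message above.
original = ('[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]'
            '>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]')
fixed = ('[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]'
         '>>[*:1]-[C:3](=[O:4])-[NH:2]-[*:5]')

print(product_atoms_with_queries(original))  # product atoms worth replacing with [*:n]
print(product_atoms_with_queries(fixed))     # nothing flagged
```

Any atom the check reports is a candidate for replacement with a mapped dummy such as `[*:1]`, which is the safer form recommended above. Atoms like `[NH:2]` are deliberately not flagged, since they contain none of the query characters.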
[Rdkit-discuss] A couple of questions regarding ReactionFromSmarts
Hi,

First of all, I'd like to start by saying how much I've been enjoying exploring the functionality of RDKit - great job, Greg!

I have a couple of questions regarding 'rdkit.Chem.AllChem.ReactionFromSmarts':

(1) I see that the reaction objects can be created from MDL Reaction Files/Blocks - is there a way to do the reverse, and save a reaction object in MDL .rxn format? I tried investigating the rxn.ToBinary() attribute, but didn't get very far... The reason I wanted to do this, is that I was trying to figure-out how to generate a form of the reaction object (generated from reaction SMARTS) that was suitable for converting into a 2D depiction of the transformation.

(2) I know that reaction SMARTS isn't SMIRKS, but I have noticed some behaviour that I would not expect - however, this could be down to my SMARTS-naivety; my SMIRKS-naivety; or both! I initially tried the following:

from rdkit import Chem
from rdkit.Chem import AllChem
rxn_smarts = '[!#1:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!#1:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]'
sm = Chem.MolFromSmiles('CC(=O)NC')
rxn = AllChem.ReactionFromSmarts(rxn_smarts)
prods = rxn.RunReactants((sm,))
prod = Chem.MolToSmiles(prods[0][0])

This gives me prod = '[H]C(=O)NC'

If I replace the reaction SMARTS with '[!H:1]-[NH:2]-[C:3](=[O:4])-[C,c:5]>>[!H:1]-[C:3](=[O:4])-[NH:2]-[C,c:5]', I get the behaviour I want - with prod = 'CNC(=O)C'. So I think I can get the behaviour I want, but was curious if I am using the SMARTS ! operator incorrectly in conjunction with atomic numbers, or whether this may be a bug?

Kind regards
James