Re: [Rdkit-discuss] Handling reaction stereochemistry
Hi Greg,

> Correct, relative (or other forms of enhanced) stereochemistry is not possible. It's worth talking about how to deal with this, but it's going to be more than a little bit of work, I suspect.

I suspect so, too!

> The conversation about representation of and handling of enhanced stereochemistry, and what the actual use cases are, would be a good one to have. I think it's probably going to be difficult via email though. Maybe a topic for the UGM...

I agree re: email. A topic for discussion at the UGM sounds like a very good idea - that gives everybody 6 months to mull it over!

Kind regards

James

__
PLEASE READ: This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company. The Vernalis Group of Companies 100 Berkshire Place Wharfedale Road Winnersh, Berkshire RG41 5RD, England Tel: +44 (0)118 938 To access trading company registration and address details, please go to the Vernalis website at www.vernalis.com and click on the Company address and registration details link at the bottom of the page.
__

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Handling reaction stereochemistry
On Wed, Apr 3, 2013 at 9:15 AM, James Davidson j.david...@vernalis.com wrote:

> > If you call the reaction with non-chiral starting material, you get non-chiral output:
> >
> > In [20]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')
> > In [21]: ps = rxn3.RunReactants((Chem.MolFromSmiles('FC(Cl)(Br)I'),))
> > In [22]: Chem.MolToSmiles(ps[0][0],True)
> > Out[22]: 'FC(Cl)(Br)I'
> >
> > This is probably also ok; it certainly reflects what would happen in the lab (er, at least I think it does).
>
> Just to be a pedant for a moment (but actually, this could be important later) - this is actually calling the reaction with *chiral* (albeit presumably racemic) starting material

Thanks, you're obviously right. Precision is important here and I was being sloppy.

> > So far so good. We've got inversion of stereochemistry and retention of stereochemistry. There are two cases left: resolution/creation and scrambling. One obvious thing to do here would be:
> >
> > [C@:1]>>[C:1] scrambling
> > [C:1]>>[C@:1] resolution/induction
> >
> > This is where my extremely bogus example starts to make things more difficult to understand, so here's a more real example of the induction case:
> >
> > [#6:1]/[C:2]=[C:3](/[#6:4])>>[#6:1][C@H:2](Br)[C@H:3](Br)[#6:4]
> >
> > Seem right?
>
> Can of worms alert 1!! At first sight this seems perfectly ok(?) - as long as we accept that we know what we mean by the (R) flags on the carbons (by my reckoning we probably mean syn addition of Br2 across a double-bond?).

Yeah, it's syn addition (which is, I guess, also not particularly realistic... but that's a different story).

> But - problems of symmetry and atom priorities aside(!) - what do I do if I want to employ the same transformation but with no absolute stereo-control (ie if I don't have the same wonder-catalyst)? At the moment I guess there is no way to represent relative stereochemistry in the absence of an enhanced stereochemistry model?

Correct, relative (or other forms of enhanced) stereochemistry is not possible. It's worth talking about how to deal with this, but it's going to be more than a little bit of work, I suspect.

> This brings me on to the main can of worms sensation - and I think it may revolve around trying to service both real and 'virtual/fake' reactions in the same system, as well as some obvious concerns about enhanced stereochemistry. So some examples / questions:
>
> 1. I have a super-useful enzyme that will only hydrolyse (R)-esters (or more precisely I should say it won't hydrolyse (S)-esters). So:
>
> CC[C@H](C)C(=O)OC>>CC[C@H](C)C(=O)O ## R gets hydrolysed
> CC[C@@H](C)C(=O)OC>>CC[C@@H](C)C(=O)OC ## S doesn't
> CCC(C)C(=O)OC ## Oh dear, what do we want to happen here?
>
> I know what my enzyme will do - but we do have to assume that we are implying a racemic mix (it gets more worrying if we might mean a single, but unknown, enantiomer, or we might know nothing at all - we're back to enhanced stereochemistry again!)
>
> CCC(C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC
>
> ## So this is what the enzyme would do - because we have treated the chiral centre as a racemic mix - essentially expanding out to:
>
> CC[C@H](C)C(=O)OC.CC[C@@H](C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC
>
> The problem with this is that it doesn't fit with the existing rSMARTS nomenclature for retention and inversion, because the absolute stereochemistry of the starting material affects the outcome of the reaction! But I guess my enzyme reaction above would be represented as something like
>
> [C@:1][C:2](=[O:3])[O:4]C>>[C@:1][C:2](=[O:3])[O:4]H
>
> But we would have to (a) assume now that '@' in the starting material only matched (R), and (b) treat incoming racemates intrinsically as two-component mixtures of (R) and (S) to then apply the transformation to just the (R) and add the (S) starting material to the products...

The (R) or (S) parts are currently not doable (the stereochemistry in SMILES/SMARTS is expressed in terms of the ordering of the neighbors, not their CIP priorities).
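To make the neighbor-ordering point concrete, here is a small pure-Python sketch. It is not RDKit code, and the function names are made up for illustration. It relies only on the standard SMILES convention that swapping any two neighbors of a tetrahedral center turns @ into @@ (and vice versa), so the tag says nothing about (R)/(S) until the neighbor order is fixed. It assumes the four neighbor labels are distinct.

```python
def permutation_parity(old_order, new_order):
    """Return 0 if new_order is an even permutation of old_order, 1 if odd."""
    position = {nbr: i for i, nbr in enumerate(old_order)}
    perm = [position[nbr] for nbr in new_order]
    # Count pairwise inversions; their parity is the permutation parity.
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2


def tag_after_reordering(tag, old_order, new_order):
    """Tag describing the same center when its neighbors are written in new_order."""
    if tag not in ("@", "@@"):
        return tag  # no stereo specified, nothing to flip
    if permutation_parity(old_order, new_order) == 0:
        return tag
    return "@@" if tag == "@" else "@"


written = ["F", "Cl", "Br", "I"]  # neighbor order in F[C@](Cl)(Br)I
# Swapping Cl and Br: the same molecule is written F[C@@](Br)(Cl)I
print(tag_after_reordering("@", written, ["F", "Br", "Cl", "I"]))  # @@
# Rotating the last three neighbors is an even permutation: F[C@](Br)(I)Cl
print(tag_after_reordering("@", written, ["F", "Br", "I", "Cl"]))  # @
```

This is why a bare '@' in a reactant template cannot be read as "(R) only": the same center carries either tag depending on how its neighbors happen to be ordered.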
Setting that aside for the moment, there's another problem: the current reaction code is not using the stereochemistry information in the reactant templates when doing substructure matches. Using it is possible, but in order for it to work, you would definitely need to specify all four neighbors of an atom in a way that the @ stereoisomer couldn't match the @@. This requires being pretty specific about what the reactants are; probably not a problem for enzymatic reactions or many other explicit reactions (what you'd draw in an ELN: a specific instance of a class of reactions applied to specific reactants), but it's unlikely to be useful for reaction templates (here's a general description of a Suzuki coupling).

> 2. I am a database admin, and I want to transform some mis-assigned racemates to the (S) enantiomers Eg
>
> CCC(C)C(=O)OC>>CC[C@@H](C)C(=O)OC
>
> But extending the concept of treating racemates intrinsically as (R)/(S) mixtures, does this mean I should apply:
Re: [Rdkit-discuss] Handling reaction stereochemistry
Hi Greg

> I should have provided a bit more context around what the current behavior is, or at least what it's supposed to be. Sorry I forgot that.

My fault - I should have (re)read the manual (I thought it seemed a bit familiar..!)

> Currently, when creating a reaction from rxnSMARTS, inversion/retention is handled by looking at the relative stereochemistry of atoms in the reactants and products. If they're different you get inversion (apologies for the extremely bogus example reaction):
>
> In [13]: rxn = AllChem.ReactionFromSmarts('[C@:1]>>[C@@:1]')
> In [14]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
> In [15]: Chem.MolToSmiles(ps[0][0],True)
> Out[15]: 'F[C@@](Cl)(Br)I'
> In [16]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@@](Cl)(Br)I'),))
> In [17]: Chem.MolToSmiles(ps[0][0],True)
> Out[17]: 'F[C@](Cl)(Br)I'
>
> and if they're the same you get retention:
>
> In [7]: rxn2 = AllChem.ReactionFromSmarts('[C@:1]>>[C@:1]')
> In [8]: ps = rxn2.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
> In [9]: Chem.MolToSmiles(ps[0][0],True)
> Out[9]: 'F[C@](Cl)(Br)I'
> In [10]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')
> In [11]: ps = rxn3.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
> In [12]: Chem.MolToSmiles(ps[0][0],True)
> Out[12]: 'F[C@](Cl)(Br)I'
>
> This much feels logical to me, though of course it can be changed if there's disagreement.

It sort of does to me too, but I can't shift the sensation that there might be a can of worms here - more on that in a moment...

> If you call the reaction with non-chiral starting material, you get non-chiral output:
>
> In [20]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')
> In [21]: ps = rxn3.RunReactants((Chem.MolFromSmiles('FC(Cl)(Br)I'),))
> In [22]: Chem.MolToSmiles(ps[0][0],True)
> Out[22]: 'FC(Cl)(Br)I'
>
> This is probably also ok; it certainly reflects what would happen in the lab (er, at least I think it does).
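The behaviour described in the quoted session can be summarised as a toy rule in plain Python. This is only a sketch of the rule as stated in this thread, not RDKit's implementation: `product_tag` and `FLIP` are hypothetical names, the tags are treated as bare labels, and neighbor ordering is assumed to stay the same between input and product.

```python
FLIP = {"@": "@@", "@@": "@"}


def product_tag(reactant_template_tag, product_template_tag, input_tag):
    """Tag on the product atom for a template that has a tag on both sides.

    Same tag on both sides of the template -> retention.
    Different tags                         -> inversion.
    Input without a tag                    -> product without a tag.
    """
    if reactant_template_tag not in FLIP or product_template_tag not in FLIP:
        # A tag on one side only is the open scrambling/induction question.
        raise ValueError("toy rule only covers templates tagged on both sides")
    if input_tag not in FLIP:
        return ""
    if reactant_template_tag == product_template_tag:
        return input_tag
    return FLIP[input_tag]


# The cases from the quoted session:
print(product_tag("@", "@@", "@"))   # inversion, as in Out[15]
print(product_tag("@", "@@", "@@"))  # inversion, as in Out[17]
print(product_tag("@", "@", "@"))    # retention, as in Out[9]
print(product_tag("@@", "@@", "@"))  # retention, as in Out[12]
print(product_tag("@@", "@@", ""))   # no stereo in, no stereo out, as in Out[22]
```

Only the relationship between the two template tags matters here; the absolute tag in the reactant template never constrains which input stereoisomer reacts.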
Just to be a pedant for a moment (but actually, this could be important later) - this is actually calling the reaction with *chiral* (albeit presumably racemic) starting material

> So far so good. We've got inversion of stereochemistry and retention of stereochemistry. There are two cases left: resolution/creation and scrambling. One obvious thing to do here would be:
>
> [C@:1]>>[C:1] scrambling
> [C:1]>>[C@:1] resolution/induction
>
> This is where my extremely bogus example starts to make things more difficult to understand, so here's a more real example of the induction case:
>
> [#6:1]/[C:2]=[C:3](/[#6:4])>>[#6:1][C@H:2](Br)[C@H:3](Br)[#6:4]
>
> Seem right?

Can of worms alert 1!! At first sight this seems perfectly ok(?) - as long as we accept that we know what we mean by the (R) flags on the carbons (by my reckoning we probably mean syn addition of Br2 across a double-bond?). But - problems of symmetry and atom priorities aside(!) - what do I do if I want to employ the same transformation but with no absolute stereo-control (ie if I don't have the same wonder-catalyst)? At the moment I guess there is no way to represent relative stereochemistry in the absence of an enhanced stereochemistry model?

This brings me on to the main can of worms sensation - and I think it may revolve around trying to service both real and 'virtual/fake' reactions in the same system, as well as some obvious concerns about enhanced stereochemistry. So some examples / questions:

1. I have a super-useful enzyme that will only hydrolyse (R)-esters (or more precisely I should say it won't hydrolyse (S)-esters). So:

CC[C@H](C)C(=O)OC>>CC[C@H](C)C(=O)O ## R gets hydrolysed
CC[C@@H](C)C(=O)OC>>CC[C@@H](C)C(=O)OC ## S doesn't
CCC(C)C(=O)OC ## Oh dear, what do we want to happen here?

I know what my enzyme will do - but we do have to assume that we are implying a racemic mix (it gets more worrying if we might mean a single, but unknown, enantiomer, or we might know nothing at all - we're back to enhanced stereochemistry again!)

CCC(C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC

## So this is what the enzyme would do - because we have treated the chiral centre as a racemic mix - essentially expanding out to:

CC[C@H](C)C(=O)OC.CC[C@@H](C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC

The problem with this is that it doesn't fit with the existing rSMARTS nomenclature for retention and inversion, because the absolute stereochemistry of the starting material affects the outcome of the reaction! But I guess my enzyme reaction above would be represented as something like

[C@:1][C:2](=[O:3])[O:4]C>>[C@:1][C:2](=[O:3])[O:4]H

But we would have to (a) assume now that '@' in the starting material only matched (R), and (b) treat incoming racemates intrinsically as two-component mixtures of (R) and (S) to then apply the transformation to just the (R) and add the (S) starting material to the products...

2. I am a database admin, and I want to transform some mis-assigned racemates to the (S) enantiomers Eg
Re: [Rdkit-discuss] Handling reaction stereochemistry
Hi Greg,

> I've got a question for the community about how chirality should be handled in reactions. This morning I managed to fix one of the outstanding reaction stereochemistry problems in the RDKit: the loss of chirality when one bond to a stereocenter is to an unmapped atom. Here's a quick demo of the new behavior (not yet checked in; there are still a couple things to be cleaned up):
>
> In [7]: rxn = AllChem.ReactionFromSmarts('[C:1]-O>>[C:1]-S')
> In [8]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@H](O)Cl'),))
> In [9]: Chem.MolToSmiles(ps[0][0],True)
> Out[9]: 'F[C@H](S)Cl'
>
> It seems nice to be able to preserve chirality in these cases. The question that comes up is: *Should* we be preserving chirality in these cases? The change makes it impossible to indicate a reaction that scrambles stereochemistry. That doesn't seem right.
>
> So... the question to you guys: How should stereochemistry inversion/retention/loss be indicated in Reaction SMARTS?

Good question - and let me be the first to jump in, feet first, without thinking enough! : )

Instinctively, I would say it would be good to
(a) scramble stereochemistry if not otherwise specified - at least this way we default to losing information rather than risking keeping incorrect information;
(b) use a flag at each centre if we want to retain stereochemistry (what about '@'?);
(c) use another flag if we want to invert (and, inventive I know, what about '@@'?).

So in the above example, let's say I want to always invert (eg to represent an SN2 reaction) - the rSMARTS could then be something like [C:1]-O>>[C:1@@]-S, and the example input above would give F[C@@H](S)Cl out. The same input with no specification could give FC(H)(S)Cl and, of course, achiral input would always give achiral output - regardless of the flag in the rSMARTS.

> Bonus points to anyone who can explain to me how the inversion/retention flags in RXN files should be handled. At the moment the RDKit uses what's in the products and ignores them in the reactants.

Something like the above? (I told you I hadn't thought about it enough!)

Kind regards

James
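For comparison, the (a)/(b)/(c) proposal in this message can be written down as a tiny plain-Python rule. This is a sketch of the suggestion only, not anything RDKit does; `proposed_product_tag` is a hypothetical name and the tags are treated as bare labels with the neighbor order held fixed.

```python
def proposed_product_tag(product_flag, input_tag):
    """Product tag under the proposal: the flag sits on the product atom only.

    no flag -> scramble (stereo information is dropped)
    '@'     -> retain the input tag
    '@@'    -> invert the input tag
    Input without a tag always gives a product without a tag.
    """
    if input_tag not in ("@", "@@"):
        return ""
    if product_flag == "@":
        return input_tag
    if product_flag == "@@":
        return "@" if input_tag == "@@" else "@@"
    return ""


# [C:1]-O>>[C:1@@]-S applied to F[C@H](O)Cl: the proposal gives F[C@@H](S)Cl
print(proposed_product_tag("@@", "@"))  # @@
# No flag in the product: stereo is scrambled
print(repr(proposed_product_tag("", "@")))  # ''
```

Unlike the reactant/product comparison rule, the default here is loss of stereochemistry, so retention has to be requested explicitly.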
[Rdkit-discuss] Handling reaction stereochemistry
Dear all,

I've got a question for the community about how chirality should be handled in reactions. This morning I managed to fix one of the outstanding reaction stereochemistry problems in the RDKit: the loss of chirality when one bond to a stereocenter is to an unmapped atom. Here's a quick demo of the new behavior (not yet checked in; there are still a couple things to be cleaned up):

In [7]: rxn = AllChem.ReactionFromSmarts('[C:1]-O>>[C:1]-S')
In [8]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@H](O)Cl'),))
In [9]: Chem.MolToSmiles(ps[0][0],True)
Out[9]: 'F[C@H](S)Cl'

It seems nice to be able to preserve chirality in these cases. The question that comes up is: *Should* we be preserving chirality in these cases? The change makes it impossible to indicate a reaction that scrambles stereochemistry. That doesn't seem right.

So... the question to you guys: How should stereochemistry inversion/retention/loss be indicated in Reaction SMARTS?

Bonus points to anyone who can explain to me how the inversion/retention flags in RXN files should be handled. At the moment the RDKit uses what's in the products and ignores them in the reactants.

-greg