Re: [Rdkit-discuss] Handling reaction stereochemistry

2013-04-07 Thread James Davidson
Hi Greg,

 Correct, relative (or other forms of enhanced) stereochemistry is not
 possible. It's worth talking about how to deal with this, but it's
 going to be more than a little bit of work, I suspect.

I suspect so, too!


 The conversation about representation of and handling of enhanced
 stereochemistry, and what the actual use cases are, would be a good
 one to have. I think it's probably going to be difficult via email
 though. Maybe a topic for the UGM...

I agree re: email.  A topic for discussion at the UGM sounds like a very
good idea - that gives everybody 6 months to mull it over!


Kind regards

James

__
PLEASE READ: This email is confidential and may be privileged. It is intended 
for the named addressee(s) only and access to it by anyone else is 
unauthorised. If you are not an addressee, any disclosure or copying of the 
contents of this email or any action taken (or not taken) in reliance on it is 
unauthorised and may be unlawful. If you have received this email in error, 
please notify the sender or postmas...@vernalis.com. Email is not a secure 
method of communication and the Company cannot accept responsibility for the 
accuracy or completeness of this message or any attachment(s). Please check 
this email for virus infection for which the Company accepts no responsibility. 
If verification of this email is sought then please request a hard copy. Unless 
otherwise stated, any views or opinions presented are solely those of the 
author and do not represent those of the Company.

The Vernalis Group of Companies
100 Berkshire Place
Wharfedale Road
Winnersh, Berkshire
RG41 5RD, England
Tel: +44 (0)118 938 

To access trading company registration and address details, please go to the 
Vernalis website at www.vernalis.com and click on the Company address and 
registration details link at the bottom of the page..
__

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Handling reaction stereochemistry

2013-04-06 Thread Greg Landrum
On Wed, Apr 3, 2013 at 9:15 AM, James Davidson j.david...@vernalis.com wrote:

 If you call the reaction with non-chiral starting material, you get
 non-chiral output:

 In [20]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')

 In [21]: ps = rxn3.RunReactants((Chem.MolFromSmiles('FC(Cl)(Br)I'),))

 In [22]: Chem.MolToSmiles(ps[0][0],True)
 Out[22]: 'FC(Cl)(Br)I'

 This is probably also ok; it certainly reflects what would happen in
 the lab (er, at least I think it does).

 Just to be a pedant for a moment (but actually, this could be important
 later) - this is actually calling the reaction with *chiral* (albeit
 presumably racemic) starting material

Thanks, you're obviously right. Precision is important here and I was
being sloppy.

 So far so good. We've got inversion of stereochemistry and retention
 of stereochemistry. There are two cases left: resolution/creation and
 scrambling.

 One obvious thing to do here would be:

   [C@:1]>>[C:1]   scrambling
   [C:1]>>[C@:1]   resolution/induction

 This is where my extremely bogus example starts to make things more
 difficult to understand, so here's a more real example of the
 induction case:
 [#6:1]/[C:2]=[C:3](/[#6:4])>>[#6:1][C@H:2](Br)[C@H:3](Br)[#6:4]

 Seem right?

 Can of worms alert 1!!  At first sight this seems perfectly ok(?) -
 as long as we accept that we know what we mean by the (R) flags on the
 carbons (by my reckoning we probably mean syn addition of Br2 across a
 double-bond?).

Yeah, it's syn addition (which is, I guess, also not particularly
realistic... but that's a different story).

 But - problems of symmetry and atom priorities aside(!)
 - what do I do if I want to employ the same transformation but with no
 absolute stereo-control (ie if I don't have the same wonder-catalyst)?
 At the moment I guess there is no way to represent relative
 stereochemistry in the absence of an enhanced stereochemistry model?

Correct, relative (or other forms of enhanced) stereochemistry is not
possible. It's worth talking about how to deal with this, but it's
going to be more than a little bit of work, I suspect.


 This brings me on to the main can of worms sensation - and I think it
 may revolve around trying to service both real and 'virtual/fake'
 reactions in the same system, as well as some obvious concerns about
 enhanced stereochemistry.  So some examples / questions:

 1.  I have a super-useful enzyme that will only hydrolyse (R)-esters (or
 more precisely I should say it won't hydrolyse (S)-esters).  So:

 CC[C@H](C)C(=O)OC>>CC[C@H](C)C(=O)O ## R gets hydrolysed
 CC[C@@H](C)C(=O)OC>>CC[C@@H](C)C(=O)OC  ## S doesn't
 CCC(C)C(=O)OC ## Oh dear, what do we want to happen here?  I know what
 my enzyme will do - but we do have to assume that we are implying a
 racemic mix (it gets more worrying if we might mean a single, but
 unknown, enantiomer, or we might know nothing at all - we're back to
 enhanced stereochemistry again!)
 CCC(C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC  ## So this is
 what the enzyme would do - because we have treated the chiral centre as
 a racemic mix - essentially expanding out to:
 CC[C@H](C)C(=O)OC.CC[C@@H](C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC

 The problem with this is that it doesn't fit with the existing rSMARTS
 nomenclature for retention and inversion, because the absolute
 stereochemistry of the starting material affects the outcome of the
 reaction!  But I guess my enzyme reaction above would be represented as
 something like

 [C@:1][C:2](=[O:3])[O:4]C>>[C@:1][C:2](=[O:3])[O:4]H

 But we would have to (a) assume now that '@' in the starting material
 only matched (R), and (b) treat incoming racemates intrinsically as
 two-component mixtures of (R) and (S) to then apply the transformation
 to just the (R) and add the (S) starting material to the products...

The (R) or (S) parts are currently not doable (the stereochemistry in
SMILES/SMARTS is expressed in terms of the ordering of the neighbors,
not their CIP priorities). Setting that aside for the moment, there's
another problem: the current reaction code is not using the
stereochemistry information in the reactant templates when doing
substructure matches. That's possible, but in order for it to work,
you would definitely need to specify all four neighbors of an atom in
a way that the @ stereoisomer couldn't match the @@. This requires
being pretty specific about what the reactants are; probably not a
problem for enzymatic reactions or many other explicit reactions
(what you'd draw in an ELN: a specific instance of a class of
reactions applied to specific reactants), but it's unlikely to be
useful for reaction templates (here's a general description of a
Suzuki coupling).
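
To make the neighbor-ordering point concrete, here is a minimal sketch in plain Python (no RDKit involved; the function names are invented for illustration). It relies only on the SMILES convention that '@' and '@@' are defined relative to the order in which the neighbors are written, so re-listing the same neighbors in an odd permutation swaps the tag without changing the molecule. It assumes the four neighbor labels are distinct.

```python
from itertools import combinations


def permutation_is_odd(old_order, new_order):
    """Return True if new_order is an odd permutation of old_order."""
    positions = [old_order.index(label) for label in new_order]
    inversions = sum(
        1
        for i, j in combinations(range(len(positions)), 2)
        if positions[i] > positions[j]
    )
    return inversions % 2 == 1


def tag_after_reorder(tag, old_order, new_order):
    """Tag needed to describe the SAME stereocentre after re-listing its neighbors."""
    if tag not in ('@', '@@'):
        raise ValueError("tag must be '@' or '@@'")
    if sorted(old_order) != sorted(new_order):
        raise ValueError("new_order must be a permutation of old_order")
    if permutation_is_odd(old_order, new_order):
        return '@@' if tag == '@' else '@'
    return tag


# F[C@](Cl)(Br)I written with Cl and Br swapped needs the other tag:
# F[C@@](Br)(Cl)I is the same molecule.
print(tag_after_reorder('@', ['F', 'Cl', 'Br', 'I'], ['F', 'Br', 'Cl', 'I']))  # @@
```

This is why a bare '@' in a template cannot be read as "(R)": which CIP label it corresponds to depends on what the neighbors are and how they are ordered, not on the tag alone.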



 2.  I am a database admin, and I want to transform some mis-assigned
 racemates to the (S) enantiomers

 Eg CCC(C)C(=O)OC>>CC[C@@H](C)C(=O)OC

 But extending the concept of treating racemates intrinsically as (R)/(S)
 mixtures, does this mean I should apply:

 

Re: [Rdkit-discuss] Handling reaction stereochemistry

2013-04-03 Thread James Davidson
Hi Greg

 
 I should have provided a bit more context around what the current
 behavior is, or at least what it's supposed to be. Sorry I forgot that.

My fault - I should have (re)read the manual (I thought it seemed a bit
familiar..!)


 Currently, when creating a reaction from rxnSMARTS, inversion/retention
 is handled by looking at the relative stereochemistry of atoms in the
 reactants and products.

 If they're different you get inversion (apologies for the extremely
 bogus example reaction):

 In [13]: rxn = AllChem.ReactionFromSmarts('[C@:1]>>[C@@:1]')

 In [14]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))

 In [15]: Chem.MolToSmiles(ps[0][0],True)
 Out[15]: 'F[C@@](Cl)(Br)I'

 In [16]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@@](Cl)(Br)I'),))
 
 In [17]: Chem.MolToSmiles(ps[0][0],True)
 Out[17]: 'F[C@](Cl)(Br)I'
 
 and if they're the same you get retention:
 
 In [7]: rxn2 = AllChem.ReactionFromSmarts('[C@:1]>>[C@:1]')

 In [8]: ps = rxn2.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))

 In [9]: Chem.MolToSmiles(ps[0][0],True)
 Out[9]: 'F[C@](Cl)(Br)I'

 In [10]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')

 In [11]: ps = rxn3.RunReactants((Chem.MolFromSmiles('F[C@](Cl)(Br)I'),))
 
 In [12]: Chem.MolToSmiles(ps[0][0],True)
 Out[12]: 'F[C@](Cl)(Br)I'
 
 
 This much feels logical to me, though of course it can be changed if
 there's disagreement.

It sort of does to me too, but I can't shift the sensation that there
might be a can of worms here - more on that in a moment...


 If you call the reaction with non-chiral starting material, you get
 non-chiral output:

 In [20]: rxn3 = AllChem.ReactionFromSmarts('[C@@:1]>>[C@@:1]')
 
 In [21]: ps = rxn3.RunReactants((Chem.MolFromSmiles('FC(Cl)(Br)I'),))
 
 In [22]: Chem.MolToSmiles(ps[0][0],True)
 Out[22]: 'FC(Cl)(Br)I'
 
 This is probably also ok; it certainly reflects what would happen in
 the lab (er, at least I think it does).

Just to be a pedant for a moment (but actually, this could be important
later) - this is actually calling the reaction with *chiral* (albeit
presumably racemic) starting material


 So far so good. We've got inversion of stereochemistry and retention
 of stereochemistry. There are two cases left: resolution/creation and
 scrambling.

 One obvious thing to do here would be:

   [C@:1]>>[C:1]   scrambling
   [C:1]>>[C@:1]   resolution/induction

 This is where my extremely bogus example starts to make things more
 difficult to understand, so here's a more real example of the
 induction case:
 [#6:1]/[C:2]=[C:3](/[#6:4])>>[#6:1][C@H:2](Br)[C@H:3](Br)[#6:4]
 
 Seem right?
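
The four cases can be summarised in a small plain-Python sketch. This is only a bookkeeping model of the rule as described in this message, not RDKit code: the names template_outcome and apply_outcome are hypothetical, tags are '@', '@@' or None, and the 'unspecified' label for a template with no tag on either side is an assumption added here. Applying the induction case is left out because it would need the neighbor ordering in the product.

```python
VALID_TAGS = ('@', '@@', None)
FLIP = {'@': '@@', '@@': '@'}


def template_outcome(reactant_tag, product_tag):
    """Classify what the tags on one mapped atom of a template ask for."""
    if reactant_tag not in VALID_TAGS or product_tag not in VALID_TAGS:
        raise ValueError("tags must be '@', '@@' or None")
    if reactant_tag and product_tag:
        # Only the relative sense of the two tags matters.
        return 'retention' if reactant_tag == product_tag else 'inversion'
    if reactant_tag:
        return 'scrambling'            # [C@:1]>>[C:1]
    if product_tag:
        return 'resolution/induction'  # [C:1]>>[C@:1]
    return 'unspecified'               # assumption: no tag on either side


def apply_outcome(outcome, input_tag):
    """Tag of the product atom for a given tag on the input molecule's atom."""
    if input_tag not in VALID_TAGS:
        raise ValueError("input_tag must be '@', '@@' or None")
    if outcome == 'inversion':
        return FLIP.get(input_tag)     # untagged input stays untagged
    if outcome == 'retention':
        return input_tag
    if outcome == 'scrambling':
        return None
    raise NotImplementedError('not modelled in this sketch: ' + outcome)


print(apply_outcome(template_outcome('@', '@@'), '@'))    # @@
print(apply_outcome(template_outcome('@@', '@@'), '@'))   # @
print(apply_outcome(template_outcome('@@', '@@'), None))  # None
```

The three printed cases correspond to the rxn, rxn3 and untagged-input transcripts quoted above.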

Can of worms alert 1!!  At first sight this seems perfectly ok(?) -
as long as we accept that we know what we mean by the (R) flags on the
carbons (by my reckoning we probably mean syn addition of Br2 across a
double-bond?).  But - problems of symmetry and atom priorities aside(!)
- what do I do if I want to employ the same transformation but with no
absolute stereo-control (ie if I don't have the same wonder-catalyst)?
At the moment I guess there is no way to represent relative
stereochemistry in the absence of an enhanced stereochemistry model?

This brings me on to the main can of worms sensation - and I think it
may revolve around trying to service both real and 'virtual/fake'
reactions in the same system, as well as some obvious concerns about
enhanced stereochemistry.  So some examples / questions:

1.  I have a super-useful enzyme that will only hydrolyse (R)-esters (or
more precisely I should say it won't hydrolyse (S)-esters).  So:

CC[C@H](C)C(=O)OC>>CC[C@H](C)C(=O)O ## R gets hydrolysed
CC[C@@H](C)C(=O)OC>>CC[C@@H](C)C(=O)OC  ## S doesn't
CCC(C)C(=O)OC ## Oh dear, what do we want to happen here?  I know what
my enzyme will do - but we do have to assume that we are implying a
racemic mix (it gets more worrying if we might mean a single, but
unknown, enantiomer, or we might know nothing at all - we're back to
enhanced stereochemistry again!)
CCC(C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC  ## So this is
what the enzyme would do - because we have treated the chiral centre as
a racemic mix - essentially expanding out to:
CC[C@H](C)C(=O)OC.CC[C@@H](C)C(=O)OC>>CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC

The problem with this is that it doesn't fit with the existing rSMARTS
nomenclature for retention and inversion, because the absolute
stereochemistry of the starting material affects the outcome of the
reaction!  But I guess my enzyme reaction above would be represented as
something like

[C@:1][C:2](=[O:3])[O:4]C>>[C@:1][C:2](=[O:3])[O:4]H

But we would have to (a) assume now that '@' in the starting material
only matched (R), and (b) treat incoming racemates intrinsically as
two-component mixtures of (R) and (S) to then apply the transformation
to just the (R) and add the (S) starting material to the products...
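
As a rough illustration of (a) and (b), the sketch below does the bookkeeping with plain strings. It is a toy lookup, not substructure matching and not RDKit: the SMILES and their R/S labels are copied from the examples above and taken on trust, and the function names are made up.

```python
# SMILES copied from the examples above; the R/S labels are the message's own.
R_ESTER = 'CC[C@H](C)C(=O)OC'
S_ESTER = 'CC[C@@H](C)C(=O)OC'
RACEMIC_ESTER = 'CCC(C)C(=O)OC'
R_ACID = 'CC[C@H](C)C(=O)O'


def expand_racemate(smiles):
    """Point (b): treat the untagged ester as a mixture of both enantiomers."""
    if smiles == RACEMIC_ESTER:
        return [R_ESTER, S_ESTER]
    return [smiles]


def enzyme_hydrolysis(smiles):
    """Point (a): transform only the ester labelled (R); pass the rest through."""
    products = []
    for component in expand_racemate(smiles):
        products.append(R_ACID if component == R_ESTER else component)
    return '.'.join(products)


print(enzyme_hydrolysis(R_ESTER))        # CC[C@H](C)C(=O)O
print(enzyme_hydrolysis(S_ESTER))        # CC[C@@H](C)C(=O)OC
print(enzyme_hydrolysis(RACEMIC_ESTER))  # CC[C@H](C)C(=O)O.CC[C@@H](C)C(=O)OC
```

The last line reproduces the product side of the racemate example, which is the behaviour that does not fit the existing retention/inversion notation.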


2.  I am a database admin, and I want to transform some mis-assigned
racemates to the (S) enantiomers

Eg 

Re: [Rdkit-discuss] Handling reaction stereochemistry

2013-04-02 Thread James Davidson
Hi Greg,

 

 I've got a question for the community about how chirality should be
 handled in reactions.

 This morning I managed to fix one of the outstanding reaction
 stereochemistry problems in the RDKit: the loss of chirality when one
 bond to a stereocenter is to an unmapped atom. Here's a quick demo of
 the new behavior (not yet checked in; there are still a couple things
 to be cleaned up):

 In [7]: rxn = AllChem.ReactionFromSmarts('[C:1]-O>>[C:1]-S')

 In [8]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@H](O)Cl'),))

 In [9]: Chem.MolToSmiles(ps[0][0],True)
 Out[9]: 'F[C@H](S)Cl'

 It seems nice to be able to preserve chirality in these cases.

 The question that comes up is: *Should* we be preserving chirality in
 these cases? The change makes it impossible to indicate a reaction
 that scrambles stereochemistry. That doesn't seem right.

 So... the question to you guys: How should stereochemistry
 inversion/retention/loss be indicated in Reaction SMARTS?

 

 

Good question - and let me be the first to jump in, feet first, without
thinking enough!  : )

Instinctively, I would say it would be good to (a) scramble
stereochemistry if not otherwise specified - at least this way we
default to losing information rather than risking keeping incorrect
information; (b) use a flag at each centre if we want to retain
stereochemistry (what about '@'?); (c) use another flag if we want to
invert (and, inventive I know, what about '@@'?).

 

So in the above example, let's say I want to always invert (eg to
represent an SN2 reaction) - the rSMARTS could then be something like
[C:1]-O>>[C:1@@]-S, and the example input above would give F[C@@H](S)Cl
out.

The same input with no specification could give FC(H)(S)Cl and, of
course, achiral input would always give achiral output - regardless of
the flag in the rSMARTS.
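
Spelled out as a plain-Python sketch, the proposal above would behave roughly like this. This models only the suggestion in this message, not what the RDKit does; proposed_product_tag is a hypothetical name, and tags and flags are '@', '@@' or None.

```python
FLIP = {'@': '@@', '@@': '@'}


def proposed_product_tag(input_tag, flag=None):
    """Product-atom tag under the proposal: no flag scrambles,
    '@' retains and '@@' inverts the input's stereochemistry."""
    if input_tag not in ('@', '@@', None):
        raise ValueError("input_tag must be '@', '@@' or None")
    if flag not in ('@', '@@', None):
        raise ValueError("flag must be '@', '@@' or None")
    if input_tag is None:
        return None              # achiral in, achiral out, whatever the flag
    if flag is None:
        return None              # (a) default: scramble, i.e. lose the information
    if flag == '@':
        return input_tag         # (b) retain
    return FLIP[input_tag]       # (c) invert


# F[C@H](O)Cl with an inverting flag on the mapped carbon
print(proposed_product_tag('@', '@@'))   # @@
# the same input with no flag
print(proposed_product_tag('@', None))   # None
```

Note that here the flag has an absolute meaning (retain or invert) rather than being compared with a tag on the reactant side of the template.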

 

 

 Bonus points to anyone who can explain to me how the
 inversion/retention flags in RXN files should be handled. At the moment
 the RDKit uses what's in the products and ignores them in the reactants.

 

Something like the above? (I told you I hadn't thought about it enough!)

 

Kind regards

 

James




[Rdkit-discuss] Handling reaction stereochemistry

2013-04-01 Thread Greg Landrum
Dear all,

I've got a question for the community about how chirality should be
handled in reactions.

This morning I managed to fix one of the outstanding reaction
stereochemistry problems in the RDKit: the loss of chirality when one
bond to a stereocenter is to an unmapped atom. Here's a quick demo of
the new behavior (not yet checked in; there are still a couple things
to be cleaned up):

In [7]: rxn = AllChem.ReactionFromSmarts('[C:1]-O>>[C:1]-S')

In [8]: ps = rxn.RunReactants((Chem.MolFromSmiles('F[C@H](O)Cl'),))

In [9]: Chem.MolToSmiles(ps[0][0],True)
Out[9]: 'F[C@H](S)Cl'

It seems nice to be able to preserve chirality in these cases.

The question that comes up is: *Should* we be preserving chirality in
these cases? The change makes it impossible to indicate a reaction
that scrambles stereochemistry. That doesn't seem right.

So... the question to you guys: How should stereochemistry
inversion/retention/loss be indicated in Reaction SMARTS?

Bonus points to anyone who can explain to me how the
inversion/retention flags in RXN files should be handled. At the
moment the RDKit uses what's in the products and ignores them in the
reactants.

-greg
