[Rdkit-discuss] Sample RD Files?

2011-05-27 Thread Greg Landrum
Dear all,

I'm planning to add an RD file parser to the RDKit, but I'm having a
difficult time finding a good public source of RD files to use for
testing. The only thing I've managed to find so far is EBI's RHEA
(http://www.ebi.ac.uk/rhea//home.xhtml). This has a large number of
reactions, but they are unfortunately not mapped, so they are of
limited use for testing.

Can anyone provide me with, or point me to, some sample RD files?
Particularly welcome would be files that include mapped reactions.
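
For anyone who wants to experiment before a proper parser exists, here is a minimal sketch of pulling the $RXN blocks out of the text of an RD file. It assumes the usual layout (a $RDFILE header, records introduced by $RFMT or $MFMT, data fields as $DTYPE/$DATUM pairs). The function name split_rdfile and the sample text are hypothetical, and the molecule sections in the sample are placeholders rather than real connection tables.

```python
def split_rdfile(text):
    """Return the $RXN blocks found in the text of an RD file (sketch)."""
    blocks = []
    current = None
    for line in text.splitlines():
        if line.startswith("$RXN"):
            # start of a new reaction block
            current = [line]
            blocks.append(current)
        elif line.startswith(("$RFMT", "$MFMT", "$DTYPE", "$DATUM")):
            # a record header or a data field ends the current block
            current = None
        elif current is not None:
            current.append(line)
    return ["\n".join(block) + "\n" for block in blocks]


# Hypothetical two-record sample; the molfile bodies are placeholders.
record_body = [
    "$RXN",
    "",
    "  placeholder",
    "",
    "  1  1",
    "$MOL",
    "(reactant molfile omitted)",
    "$MOL",
    "(product molfile omitted)",
]
sample = "\n".join(
    ["$RDFILE 1", "$DATM    05/27/11 10:00:00", "$RFMT $RIREG 1"]
    + record_body
    + ["$DTYPE RXN:NAME", "$DATUM first reaction", "$RFMT $RIREG 2"]
    + record_body
)

blocks = split_rdfile(sample)
print(len(blocks))
```

With real molfiles in place of the placeholders, each returned block could be handed to AllChem.ReactionFromRxnBlock; the data fields are simply skipped in this sketch.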

-greg

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Sample RD Files?

2011-06-08 Thread Greg Landrum
[Replying to the list since this may be of general interest]

On Wed, Jun 8, 2011 at 10:00 AM, James Davidson wrote:
>
> Apologies for the confusion - I will try to explain by example:
>
> >>> from rdkit import Chem
> >>> from rdkit.Chem import AllChem
> >>> rsmarts="[C;X4;!H0:1][a:2]>>[*:1](Br)[*:2]"
> >>> rxn=AllChem.ReactionFromSmarts(rsmarts)
> >>> print(AllChem.ReactionToRxnBlock(rxn))
> $RXN
>
>   RDKit
>
>   1  1
> $MOL
>
>  RDKit
>
>   2  1  0  0  0  0  0  0  0  0999 V2000
>     0.0000    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  1  0  0
>     0.0000    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  2  0  0
>   1  2  1  0
> V    1 [C&X4&!H0:1]
> V    2 [a:2]
> M  END
> $MOL
>
>  RDKit
>
>   3  2  0  0  0  0  0  0  0  0999 V2000
>     0.0000    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  1  0  0
>     0.0000    0.0000    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0
>     0.0000    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  2  0  0
>   2  1  1  0
>   1  3  1  0
> V    1 [*:1]
> V    3 [*:2]
> M  END
>
>
>
> So, we have a rxn that can be output as an atom-mapped reaction block
> (carrying query features).  What I want is the corresponding atom-mapping,
> but applied to the reactant and product of cases where the reactant is a
> match:
>
> >>> smiles="c1ccccc1C"
> >>> mol=Chem.MolFromSmiles(smiles)
> >>> prods = rxn.RunReactants((mol,))
> >>> prod = prods[0][0]
>
> So in this case I would like to get at the following (pseudo reaction for
> illustrative purposes):
>
> c1ccccc{1}1C{2} >> c1ccccc{1}1C{2}Br
>
> (ie the 'specific' reaction, but carrying the atom numbers from the template
> - {n} denoting atom-map)
>
> As I said before, I am a bit rusty on this - so I may very well be missing
> something obvious!

This isn't super straightforward, but it is doable. I'll show it for
this specific example; if you have problems generalizing, we can
iterate here.

# First get everything set up as you did above, including getting a product:
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> rsmarts="[C;X4;!H0:1][a:2]>>[*:1](Br)[*:2]"
>>> rxn=AllChem.ReactionFromSmarts(rsmarts)
>>> smiles="c1ccccc1C"
>>> mol=Chem.MolFromSmiles(smiles)
>>> prods = rxn.RunReactants((mol,))
>>> prod = prods[0][0]
>>> Chem.SanitizeMol(prod)

# Now get copies of the reactant and product templates from the reaction:
>>> rtmpl = rxn.GetReactantTemplate(0)
>>> ptmpl = rxn.GetProductTemplate(0)

# Find the mapping of the reactant template onto the reactant molecule:
>>> match = mol.GetSubstructMatch(rtmpl)

# copy the atom mapping information from the template to the reactant molecule:
>>> for tmplId,molId in enumerate(match):
...     if rtmpl.GetAtomWithIdx(tmplId).HasProp('molAtomMapNumber'):
...         at = mol.GetAtomWithIdx(molId)
...         at.SetProp('molAtomMapNumber',rtmpl.GetAtomWithIdx(tmplId).GetProp('molAtomMapNumber'))
...         # this is a hack. we need a query associated with an atom to see the mapping info in SMARTS:
...         Chem.AddRecursiveQuery(mol,Chem.MolFromSmarts('*'),molId)

# repeat that process for the product:
>>> match = prod.GetSubstructMatch(ptmpl)

# copy the atom mapping information from the template to the product molecule:
>>> for tmplId,molId in enumerate(match):
...     if ptmpl.GetAtomWithIdx(tmplId).HasProp('molAtomMapNumber'):
...         at = prod.GetAtomWithIdx(molId)
...         at.SetProp('molAtomMapNumber',ptmpl.GetAtomWithIdx(tmplId).GetProp('molAtomMapNumber'))
...         # this is a hack. we need a query associated with an atom to see the mapping info in SMARTS:
...         Chem.AddRecursiveQuery(prod,Chem.MolFromSmarts('*'),molId)

# test that the SMARTS for those are ok:
>>> print(Chem.MolToSmarts(mol))
# output is: [#6]1:[#6]:[#6]:[#6]:[#6]:[#6&$(*):2]:1-[#6&$(*):1]
>>> print(Chem.MolToSmarts(prod))
# output is: [#6&$(*):1](-[#35])-[#6&$(*):2]1:[#6]:[#6]:[#6]:[#6]:[#6]:1

# now build a new reaction:
>>> nrxn = AllChem.ChemicalReaction()
>>> nrxn.AddReactantTemplate(mol)
>>> nrxn.AddProductTemplate(prod)

# we can now get the SMARTS for the reaction:
>>> print(AllChem.ReactionToSmarts(nrxn))
# output is: [#6]1:[#6]:[#6]:[#6]:[#6]:[#6&$(*):2]:1-[#6&$(*):1]>>[#6&$(*):1](-[#35])-[#6&$(*):2]1:[#6]:[#6]:[#6]:[#6]:[#6]:1

# let's test the reaction to make sure it works.
# due to an (already reported) bug in the way atom properties are
# handled, nrxn cannot be directly used,
# so we use a hack and reparse it:
>>> nrxn = AllChem.ReactionFromSmarts(AllChem.ReactionToSmarts(nrxn))
>>> nrxn.Validate()
# now we can run a molecule through to make sure it works:
>>> nmol = Chem.MolFromSmiles('c1ccccc1C')
>>> nps = nrxn.RunReactants((nmol,))
>>> print(Chem.MolToSmiles(nps[0][0]))
# output is: BrCc1ccccc1

Is that what you're looking for?
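
To generalize the steps above, the copy-the-map-numbers loop can be wrapped in a small helper. This is only a sketch under a few assumptions: it needs an RDKit version that provides Atom.GetAtomMapNum() and Atom.SetAtomMapNum(), the helper name copy_map_numbers is hypothetical, and it uses the first substructure match only, so symmetric or repeated matches are not disambiguated. It writes the map numbers onto copies of the molecules and skips the recursive-query hack.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def copy_map_numbers(template, mol):
    """Return a copy of mol carrying the atom-map numbers of template.

    Hypothetical helper: only the first substructure match is used.
    """
    match = mol.GetSubstructMatch(template)
    if not match:
        raise ValueError("template does not match the molecule")
    mapped = Chem.Mol(mol)  # work on a copy, leave the input untouched
    for tmpl_idx, mol_idx in enumerate(match):
        map_num = template.GetAtomWithIdx(tmpl_idx).GetAtomMapNum()
        if map_num:  # 0 means the template atom is unmapped
            mapped.GetAtomWithIdx(mol_idx).SetAtomMapNum(map_num)
    return mapped


rxn = AllChem.ReactionFromSmarts("[C;X4;!H0:1][a:2]>>[*:1](Br)[*:2]")
mol = Chem.MolFromSmiles("c1ccccc1C")
prod = rxn.RunReactants((mol,))[0][0]
Chem.SanitizeMol(prod)  # products come back unsanitized

mapped_reactant = copy_map_numbers(rxn.GetReactantTemplate(0), mol)
mapped_product = copy_map_numbers(rxn.GetProductTemplate(0), prod)

# Mapped SMILES for the specific reaction (reactant>>product)
print(Chem.MolToSmiles(mapped_reactant) + ">>" + Chem.MolToSmiles(mapped_product))
```

Matching the product template back onto the product is the weak point: for templates that can match a product in more than one way, the first match may not be the atoms the reaction actually touched.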

> PS  I should say that I may have made a mistake with this example(?) because
> I kept getting an error when trying to get the MolBlock of the product that
> was rectified if I returned the Smiles first:
>
 prod = prods[0][0]
 print 

Re: [Rdkit-discuss] Sample RD Files?

2011-06-08 Thread James Davidson
Hi Greg,

Thanks for the python-full reply!

> # let's test the reaction to make sure it works.
> # due to an (already reported) bug in the way atom properties
> # are handled, nrxn cannot be directly used,
> # so we use a hack and reparse it:
> >>> nrxn = AllChem.ReactionFromSmarts(AllChem.ReactionToSmarts(nrxn))
> >>> nrxn.Validate()
> # now we can run a molecule through to make sure it works:
> >>> nmol = Chem.MolFromSmiles('c1ccccc1C')
> >>> nps = nrxn.RunReactants((nmol,))
> >>> print(Chem.MolToSmiles(nps[0][0]))
> # output is: BrCc1ccccc1
> 
> Is that what you're looking for?

It certainly allows me to do what I want - which is get a mapped RXN
out.  And this can even be done with coordinates - which I have added
below as a reminder to anyone (which included me until about 10 mins
ago!) who had forgotten:

AllChem.Compute2DCoordsForReaction(nrxn)
rxnBlock = AllChem.ReactionToRxnBlock(nrxn)


So Thanks very much!  : )

 - and thanks for the reminder about sanitizing products from
reactions...

> The molecules that come back from reactions have not been 
> sanitized, so all you need to do is add a call to 
> Chem.SanitizeMol first:
> 
> >>> Chem.SanitizeMol(prod)
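
Since a reaction can return several product sets, the sanitization reminder can be folded into a small wrapper. This is a sketch only: the function name run_and_sanitize is hypothetical, and silently dropping product sets that fail sanitization is one possible policy, not necessarily the right one for every use.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def run_and_sanitize(rxn, reactants):
    """Run rxn and keep only product sets whose molecules all sanitize."""
    clean = []
    for product_set in rxn.RunReactants(reactants):
        try:
            for product in product_set:
                Chem.SanitizeMol(product)  # modifies the molecule in place
        except Exception:
            # chemically unreasonable product: skip the whole set
            continue
        clean.append(product_set)
    return clean


rxn = AllChem.ReactionFromSmarts("[C;X4;!H0:1][a:2]>>[*:1](Br)[*:2]")
toluene = Chem.MolFromSmiles("c1ccccc1C")
products = run_and_sanitize(rxn, (toluene,))
for product_set in products:
    print([Chem.MolToSmiles(p) for p in product_set])
```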

Kind regards

James

