Hi all,
I have added some extra parts to the mmpa contrib code, which has recently been
approved for open sourcing by GSK. I have also made some minor changes to the
existing code, but it should all work in the same way as before.
The extra parts are:
mol_transform.py
This program takes one or more transforms (generated by the MMP generation
program) and applies them to a user-supplied set of SMILES. This final piece
completes the circle: you can now find MMPs, pick the transforms of interest
and then apply these to a new set of compounds.
Another, bigger piece I have added to the contrib directory is some code to
build and search an MMP SQLite db. In the Python program indexing.py, which is
used to find all the MMPs in a given set, a data structure (which I call a
"pair index") is built in memory. What I have done is write code that stores
this pair index in a relational database (SQLite). This opens up a number of
searching possibilities. A db searching program has been created with which
the following searches can be performed:
1) Find all MMPs of an input/query compound to the compounds in the db
2) Find all MMPs in the db where the LHS of the transform matches an input
substructure
3) Find all MMPs that match the input transform/SMIRKS
4) Find all MMPs in the db where the LHS of the transform matches an input
SMARTS
5) Find all MMPs that match the LHS and RHS SMARTS of the input transform
The SMARTS searching utilises the excellent DbCLI tools
(http://code.google.com/p/rdkit/wiki/UsingTheDbCLI) that are part of the RDKit
distribution.
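
To give a feel for why putting the pair index into a relational database makes
these searches straightforward, here is a minimal sketch using Python's
built-in sqlite3 module. It is not the code or schema from the contrib
directory: the table name pair_index, its columns, the compound ids and the
helper functions build_db and mmps_for are all made up for illustration. The
sketch assumes each compound has already been fragmented into a constant part
(the context) and a variable part (the core), and that two compounds sharing
a context form an MMP. It also looks up the query compound by id and matches
contexts as exact strings, which is much simpler than the substructure and
SMARTS searches listed above.

```python
import sqlite3

# Hypothetical pre-fragmented input: (compound id, context, core).
# The context is the part held constant; the core is the part that changes.
FRAGMENTS = [
    ("cmpd1", "[*:1]c1ccccc1", "[*:1]C"),
    ("cmpd2", "[*:1]c1ccccc1", "[*:1]N"),
    ("cmpd3", "[*:1]c1ccccc1", "[*:1]O"),
    ("cmpd4", "[*:1]C1CCCCC1", "[*:1]C"),
]


def build_db(fragments):
    """Store the pair index in an in-memory SQLite db (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pair_index (cmpd_id TEXT, context TEXT, core TEXT)"
    )
    # Index the context column, since pairs are found by matching contexts.
    conn.execute("CREATE INDEX idx_context ON pair_index (context)")
    conn.executemany("INSERT INTO pair_index VALUES (?, ?, ?)", fragments)
    conn.commit()
    return conn


def mmps_for(conn, query_id):
    """Return (query id, partner id, transform) for every pair of the query."""
    sql = """
        SELECT a.cmpd_id, b.cmpd_id, a.core || '>>' || b.core
        FROM pair_index AS a
        JOIN pair_index AS b
          ON a.context = b.context AND a.cmpd_id != b.cmpd_id
        WHERE a.cmpd_id = ?
        ORDER BY b.cmpd_id
    """
    return conn.execute(sql, (query_id,)).fetchall()


if __name__ == "__main__":
    db = build_db(FRAGMENTS)
    for row in mmps_for(db, "cmpd1"):
        print(row)
```

The self-join on the context column is what replaces walking the in-memory
pair index: every row that shares a context with the query row is a partner,
and joining the two cores with ">>" gives the transform for that pair.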
You can imagine this db is more suited to problems where you want to ask
specific questions of a given compound set. Also, the SMARTS searching
capability now gives complete control over the MMPs you want to identify. This
new searching ability means that the mmpa suite of tools can do things that are
not possible with anything else out there (that I have seen, commercial or
otherwise). I hope the community finds it useful.
Greg has kindly added the new code to the RDKit GitHub page and the mmpa
directory contains an extensive readme file explaining how to run all the
programs (as well as some sample data). I will be available at the upcoming
RDKit user group meeting where I'll give a tutorial on how to use the mmpa
code. I look forward to seeing you all there.
Cheers
Jameed