Dear James,
On Thu, Sep 16, 2010 at 8:01 PM, James Davidson <[email protected]> wrote:
>
> I have attached the python-script that I have at the moment (a) in case it
> is of some use to anybody else, (b) in the hope that I can improve my python
> and rdkit abilities through any suggested alterations (I'm sure there are
> many!), and (c) to form the basis of a couple of questions. At the moment,
> the script is just running through each compound; checking if the molecule
> is valid; and if so, noting how many components, and whether any of the
> atoms are outside of the desired list. These two results are then written
> out to a new SDF. I am then using this to make sure my data-set contains
> only compounds that I would say are 'reasonable' to build a melting-point
> model with. Now for the questions:
Thanks for sending along the script. I haven't been through it yet, but
I will try to find some time for that later.
> 1. In RDKit, has the 'cleaning / washing / salt-stripping' of molecules
> already been formalised based on a set of rules, etc?
Not that I'm aware of on the open-source side of things. All of the
functionality required to do this is, I believe, present in the RDKit
though.
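As a rough illustration only (not a formalised rule set), one common
simplification is to treat the largest fragment as the parent compound
and discard the rest. The helper name largest_fragment and the choice of
"most heavy atoms wins" are assumptions made for this sketch; a real
washing procedure would need more rules (neutralisation, solvent lists,
ties between fragments, and so on).

```python
from rdkit import Chem


def largest_fragment(smiles):
    """Return the fragment with the most heavy atoms (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    # GetMolFrags(asMols=True) returns each disconnected component as a Mol.
    frags = Chem.GetMolFrags(mol, asMols=True)
    return max(frags, key=lambda frag: frag.GetNumHeavyAtoms())


# Sodium acetate: two components, the acetate anion is the larger one.
salt = 'CC(=O)[O-].[Na+]'
n_components = len(Chem.GetMolFrags(Chem.MolFromSmiles(salt)))
parent = largest_fragment(salt)
parent_atom_count = parent.GetNumAtoms()
```

Note that the stripped parent here is still charged; whether to
neutralise it is exactly the kind of rule that would need to be decided.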
> 2. When identifying compounds that contain a non-allowed atom-type, why do
> I find the SMARTS def [!H;!C;!N;!O;!F;!S;!Cl;!Br;!I] gives unexpected
> results, but [!#1;!#6;!#7;!#8;!#9;!#16;!#17;!#35;!#53] works as I would
> expect?
This is a fairly common SMARTS "gotcha": in SMARTS the query "[C]"
means "aliphatic C", so "[!C]" means "anything that is not an aliphatic
C" and therefore matches aromatic carbons. This leads to the following
behavior:
[3]>>> Chem.MolFromSmiles('c1ccccc1').GetSubstructMatches(Chem.MolFromSmarts('[!C]'))
Out[3] ((0,), (1,), (2,), (3,), (4,), (5,))
If you want to be sure that your SMARTS will capture aliphatic or
aromatic atoms, you need to provide the atomic numbers, as in your
second query:
[4]>>> Chem.MolFromSmiles('c1ccccc1').GetSubstructMatches(Chem.MolFromSmarts('[!#6]'))
Out[4] ()
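Putting the two queries side by side, a minimal sketch of the "allowed
atoms" check might look like the following. The function name
has_disallowed_atoms and the test molecules are made up for
illustration; the SMARTS pattern is the atomic-number version from your
message.

```python
from rdkit import Chem

# Matches any atom that is NOT H, C, N, O, F, S, Cl, Br or I, regardless
# of whether the atom is aliphatic or aromatic.
DISALLOWED = Chem.MolFromSmarts('[!#1;!#6;!#7;!#8;!#9;!#16;!#17;!#35;!#53]')


def has_disallowed_atoms(smiles):
    """True if the molecule contains an atom outside the allowed list."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    return mol.HasSubstructMatch(DISALLOWED)


benzene_flagged = has_disallowed_atoms('c1ccccc1')      # only aromatic C
silane_flagged = has_disallowed_atoms('C[Si](C)(C)C')   # contains Si

# The symbol-based query wrongly hits every aromatic carbon in benzene.
benzene = Chem.MolFromSmiles('c1ccccc1')
symbol_hits = len(benzene.GetSubstructMatches(Chem.MolFromSmarts('[!C]')))
number_hits = len(benzene.GetSubstructMatches(Chem.MolFromSmarts('[!#6]')))
```

With the atomic-number pattern benzene is not flagged, while the
silicon-containing molecule is.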
Best Regards,
-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss