Dear James,

On Thu, Sep 16, 2010 at 8:01 PM, James Davidson <j.david...@vernalis.com> wrote:
>
> I have attached the python-script that I have at the moment (a) in case it
> is of some use to anybody else, (b) in the hope that I can improve my python
> and rdkit abilities through any suggested alterations (I'm sure there are
> many!), and (c) to form the basis of a couple of questions.  At the moment,
> the script is just running through each compound; checking if the molecule
> is valid; and if so, noting how many components, and whether any of the
> atoms are outside of the desired list.  These two results are then written
> out to a new SDF.  I am then using this to make sure my data-set contains
> only compounds that I would say are 'reasonable' to build a melting-point
> model with.  Now for the questions:

Thanks for sending along the script. I haven't been through it yet, but
I will try to find some time for that later.

> 1.  In RDKit, has the 'cleaning / washing / salt-stripping' of molecules
> already been formalised based on a set of rules, etc?

Not that I'm aware of on the open-source side of things. All of the
functionality required to do this is, I believe, present in the RDKit
though.
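
As a rough illustration of the kind of building blocks I mean, here is a
minimal sketch that counts the components of a molecule and keeps the
one with the most heavy atoms. The helper name largest_fragment and the
"biggest fragment is the parent" rule are my assumptions for the sake of
the example, not an established RDKit cleaning protocol; a real washing
procedure would need more rules than this.

```python
from rdkit import Chem


def largest_fragment(mol):
    """Return the disconnected component with the most heavy atoms.

    Hypothetical helper: "largest fragment wins" is only one possible
    salt-stripping rule and will not suit every data set.
    """
    frags = Chem.GetMolFrags(mol, asMols=True)
    return max(frags, key=lambda m: m.GetNumHeavyAtoms())


# Sodium acetate: two components, acetate and the sodium counter-ion.
mol = Chem.MolFromSmiles('CC(=O)[O-].[Na+]')

# Without asMols, GetMolFrags returns one tuple of atom indices per component.
n_components = len(Chem.GetMolFrags(mol))

parent = largest_fragment(mol)
print(n_components, Chem.MolToSmiles(parent))
```

Note that this keeps the charge on the acetate; neutralising the parent
would be a separate step.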

> 2.  When identifying compounds that contain a non-allowed atom-type, why do
> I find the SMARTS def [!H;!C;!N;!O;!F;!S;!Cl;!Br;!I] gives unexpected
> results, but [!#1;!#6;!#7;!#8;!#9;!#16;!#17;!#35;!#53] works as I would
> expect?

This is a fairly common SMARTS "gotcha": in SMARTS the query "[C]"
means "aliphatic C", so "[!C]" matches every atom that is not an
aliphatic carbon, aromatic carbons included. This leads to the
following behavior:
[3]>>> Chem.MolFromSmiles('c1ccccc1').GetSubstructMatches(Chem.MolFromSmarts('[!C]'))
Out[3] ((0,), (1,), (2,), (3,), (4,), (5,))
If you want to be sure that your SMARTS will capture aliphatic or
aromatic atoms, you need to provide the atomic numbers, as in your
second query:
[4]>>> Chem.MolFromSmiles('c1ccccc1').GetSubstructMatches(Chem.MolFromSmarts('[!#6]'))
Out[4] ()
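
Putting that together, a small sketch of an "allowed elements" check
built on your second query might look like the following. The function
name has_disallowed_atoms and the test molecules are mine, chosen only
for illustration; the SMARTS is the atomic-number version from your
message.

```python
from rdkit import Chem

# Atomic-number primitives (#6, #7, ...) match an element whether it is
# aliphatic or aromatic, unlike the element symbols C, N, O, S.
DISALLOWED = Chem.MolFromSmarts('[!#1;!#6;!#7;!#8;!#9;!#16;!#17;!#35;!#53]')


def has_disallowed_atoms(smiles):
    """Return True if any atom is outside H, C, N, O, F, S, Cl, Br, I.

    Hypothetical helper name, used here only for illustration.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError('could not parse SMILES: %r' % smiles)
    return mol.HasSubstructMatch(DISALLOWED)


benzene_flagged = has_disallowed_atoms('c1ccccc1')           # aromatic C only
thiophene_flagged = has_disallowed_atoms('c1ccsc1')          # aromatic S
silane_flagged = has_disallowed_atoms('C[Si](C)(C)c1ccccc1') # contains Si
salt_flagged = has_disallowed_atoms('CC(=O)[O-].[Na+]')      # contains Na

print(benzene_flagged, thiophene_flagged, silane_flagged, salt_flagged)
```

An equivalent check without SMARTS would be to loop over mol.GetAtoms()
and compare atom.GetAtomicNum() against a set of allowed atomic numbers.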

Best Regards,
-greg
