Hi Greg
Well, I managed to have a go at this earlier than I expected. So first some
apologies, provisos, and caveats to warn you, and other readers, that your eyes
will soon experience things inelegant and unpythonic, but it's the best I could
come up with, with my limited faculties and experience!
On the plus side, I think it is doing what I wanted, i.e. giving a count of the
number of aromatic systems (if you always want to count a fused aromatic as 1
aromatic system). The downside is that the way I have done this now makes your
script output, e.g., (6,1) for anthracene, where the 1 is the count of aromatic
systems (fused or otherwise). It would be more generic if it returned
(6,3,1) as (all unique aromatic substructures, unique mono-cyclic
substructures, aromatic systems). I'm sure this is fairly straightforward, but
that is for another day!
So what I added was:
def GetOuterSet(rings):
    # Initialise a counter for parent aromatic 'super' rings
    result = 0
    # Set up a dictionary so that items can be referenced and deleted
    ring_set = dict(enumerate(rings))
    # While there is something to process
    while ring_set:
        # Set the ring to be checked as the last in the list - should be the
        # biggest
        reference = sorted(ring_set)[-1]
        # items() is copied into a sorted list, so popping inside the loop is safe
        for k, v in sorted(ring_set.items()):
            # If the current item shares any members with the reference item,
            # remove the current item from the dictionary
            if v & ring_set[reference]:
                ring_set.pop(k)
            # If we are at the reference, then we have found our 'super' ring
            if k == reference:
                result += 1
                break
    return result
and I passed in the aromaticRings list from your script, then returned both the
length of the aromaticRings list (as before) plus the output of GetOuterSet(),
i.e.:
    superRings = GetOuterSet(aromaticRings)
    return len(aromaticRings), superRings
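One possible way to get the (6,3,1)-style triple without relying on the order
of the list is sketched below. It is only a sketch under assumptions: it
assumes each entry of aromaticRings is a collection of atom (or bond) indices,
and the names merge_ring_systems and ring_counts are made up for illustration
(they are not part of Greg's script or of RDKit). Sets that share any index are
merged into one system, and a ring is treated as mono-cyclic when no other set
is a proper subset of it.

```python
def merge_ring_systems(rings):
    """Merge index sets that share any member into disjoint ring systems."""
    systems = []
    for ring in rings:
        merged = set(ring)
        remaining = []
        for system in systems:
            if system & merged:
                # Overlaps the new ring: absorb it into the merged system
                merged |= system
            else:
                remaining.append(system)
        remaining.append(merged)
        systems = remaining
    return systems


def ring_counts(rings):
    """Return (unique substructures, mono-cyclic rings, ring systems)."""
    unique = {frozenset(r) for r in rings}
    # Assumption: a mono-cyclic ring has no other ring as a proper subset
    monocyclic = [r for r in unique if not any(o < r for o in unique)]
    return len(unique), len(monocyclic), len(merge_ring_systems(unique))


# Hand-written atom-index sets standing in for anthracene (not RDKit output)
a = {0, 1, 2, 3, 4, 5}
b = {4, 5, 6, 7, 8, 9}
c = {8, 9, 10, 11, 12, 13}
anthracene = [a, b, c, a | b, b | c, a | b | c]
print(ring_counts(anthracene))  # (6, 3, 1)
```

Because the merge is transitive, the result should not depend on the largest
fused system being the last entry in the list.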
So once again, thanks for the help, and I would welcome any pointers from
anyone on tidying-up and improving this modification! (or corrections if
anyone spots them - I have only briefly tested this)
Kind regards
James
-----Original Message-----
From: Greg Landrum [mailto:greg.land...@gmail.com]
Sent: 11 June 2010 06:02
To: James Davidson
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Number of Aromatic Rings
Dear James,
On Thu, Jun 10, 2010 at 2:35 PM, James Davidson wrote:
>
> I have been trying to figure out how to return the count of aromatic
> rings for molecules (in Python), and am going to have to admit defeat!
> I saw in an earlier message
> (http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg00153.html)
> a similar query, but I'm afraid it didn't help me very much.
> I also read the section on Aromaticity in the rdkit book, and realised
> that maybe this isn't a trivial exercise!
Correct. Counting the number of non-fused rings that are aromatic, like the
post you reference does, is pretty easy; including the fused rings that are
aromatic is more challenging.
> I would like the count to count aromatic ring-systems such that
> bicyclic (eg indole or naphthalene) would only count as 1. For
> reference, this appears to be the behaviour of the OpenEye
> OEDetermineAromaticRingSystems function - where the molecule derived
> from the smiles "C(O)(=O)c12c1[nH]c(C3CCCc4c34)c2" (which
> contains an indole and a
> tetrahydronaphthalene) gives a count of 2.
>
> Any help would be greatly appreciated.
I've attached a script that's not quite what you want, but it gets you almost
there: it finds all aromatic ring systems, including fused ones. Anthracene,
for example, gives 6 rings. The modifications to this to get what you're
looking for aren't a straightforward post-processing step, but shouldn't be too
bad. If there's not enough here, let me know and I will take a look at adding
the extra code.
This code isn't perfectly polished and could certainly be faster, but it does
seem mostly functional.
-greg