On Mon, Jun 21, 2010 at 6:43 PM, Craig A. James <cja...@emolecules.com> wrote:
> The recent discussion of SSSR bugs prompted me to dig back through my emails 
> to one I wrote on 20 November 2007 to the BlueObelisk mailing list.  Here it 
> is in its entirety.
>
> Craig
>
> --------------------------------
>
>
> Andrew suggested an algorithmic description of aromaticity.  I think this is 
> a really good idea.
>
> I propose, for argument's sake, that we should be using a LSSR, "Largest Set 
> of Smallest Rings."  It would go like this: A breadth-first search for rings, 
> which terminates when all ring atoms have been included in at least one ring, 
> but only after ALL cyclic paths of a given size have been enumerated.  Since 
> the LSSR is computed specifically for aromaticity detection, it would stop at 
> some reasonable size, say 8-atom rings.
>
> For example, cubane's LSSR would have six (not five or four) 4-atom rings, 
> because when you get the the four-atom-ring step of the algorithm, you don't 
> stop until you've added all four-atom cyclic paths.  Similarly, the LSSR for 
> a prism (three square faces joined in a triangle) would have five rings (two 
> 3-atom rings and three 4-atom rings, whereas the SSSR for that structure 
> would only have four rings).

Implementing this shouldn't be to hard. Changing the last loop in
OBRingSearch should be all. It currently checks if a ring is contained
in the union of all other rings. Checking with a single ring should
give the behavior you describe (assuming we don't check frj).  The
SSSR should still be available though. Is there a reference for the
LSSR or is it a suggestion for a name? If there is no reference there
might be some unforeseen cases. For example, in cubane there are 3
6-atom-rings not fully overlapping with a single 4-atom-ring. How do
we decide if the ring should not go in the set? (note: if you check
all 4-rings, the 6th 4-ring wouldn't be found) This may not be
important for aromaticity though.

> Unlike SSSR, this is a very easy algorithm to code, and easy to explain. It's 
> computation time is O(R) where R is the number of rings, and there is no 
> arbitrariness to the LSSR; no matter how you compute it, you always get the 
> same answer.
>
> For most "ordinary" molecules, the LSSR is the same as the SSSR.  The LSSR 
> and SSSR only differ with "cage" structures, where the rings themselves form 
> rings, i.e. when there exists a set of R rings that "covers" all atoms, but R 
> is fewer than the number of bonds that must be broken to make the structure 
> acyclic.  And in these cases, the choice of rings for the SSSR is arbitrary, 
> whereas the LSSR is not.
>
> For aromaticity detection, this gives you an algorithmic description of rings 
> that is very suitable for the aromaticity definition.  It would go something 
> like this:
>
>  Compute the LSSR
>  Sort the LSSR list smallest-to-largest
>  for each atom in the molecule that isn't marked "aromatic" yet {
>    for each ring in the LSSR (ordered smallest to biggest) that atom is in {
>      if that ring is aromatic {
>        mark all of its atoms "aromatic"
>      }
>      else {
>       for each fused neighbor ring in the LSSR {
>         if the two rings together are aromatic {
>           mark all atoms in the two rings aromatic
>         }
>         else {
>           for each ring of the LSSR fused to either of the pair {
>             if the three rings together are aromatic {
>               mark all atoms in all three rings aromatic
>             }
>           }
>         }
>       }
>     }
>   }
>
>
> I think that's about it.  If I understand the chemistry, you never have to go 
> to four rings (more than 14 atoms) to discover aromaticity.  Or we could just 
> decide that was the OpenSMILES rule, even if there are obscure, rare cases 
> for which it isn't true.
>
> In extreme cases (fullerenes), this algorithm is still fast, O(nrings), 
> because you stop at three rings.  Something like 760 steps per ring in a 
> fullerene, but it's still linear, not polynomial or exponential.  And it's 
> only something like 760 steps if the thing turns out to be completely 
> non-aromatic.  In a real fullerene, it would decide all the atoms were 
> aromatic in the first pass of the outer loop, and be finished.
>
> Craig
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> lucky parental unit.  See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> OpenBabel-Devel mailing list
> OpenBabel-Devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
>

------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate 
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the 
lucky parental unit.  See the prize list and enter to win: 
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel

Reply via email to