BTW, I sent Geoff some new kekulize.cpp code this morning.  If anyone's 
interested, here are the comments that describe how it works.  Comments 
welcome...  Note that "SSSR" could/should be replaced by "LSSR" since LSSR is 
easier to compute for these very large molecules.  We've been using the new 
kekulize.cpp code for a while now and it seems to work.

Craig


----------------

Very large aromatic ring systems can be very computationally
difficult.  In theory, there are 2^N ways to try assigning
single/double to assign bonds (where N is the number of bonds in
the aromatic system).  For large molecules such as fullerenes,
this number (e.g. 2^60) exceeds all computing power in the world.

To avoid this, three strategies are used.

1. The algorithm proceeds by assigning bonds to each complete ring
    from the SSSR, rather than treating the whole aromatic system.
    That reduces the complexity from O(2^N) down to roughly O(2^R)
    where R is the number of rings in the SSSR.

2. Each time a ring is completed (a trial assignment of
    single/double), the whole tentative assignment is checked for
    sensibility.  If any atom has all of its bonds assigned, but
    still has a leftover electron, then the assignment fails before a
    lot more work is invested.  (Atoms connected to bonds that haven't
    been examined yet are ignored during this test.)

3. When one ring is finished, the next ring is selected by finding the
    one "most connected" to previously-considered rings.  Say, for
    example, we're working on a fullerene and have just finished one
    ring.  The second ring will be one that's adjacent to the first
    ring (one of its bonds is already assigned).  Now for the third
    ring, we can pick one that's in a "corner" of the first two, that
    is, a ring that has TWO bonds already assigned.

Strategy #3 means that the completed area of the ring system (the
bonds we've already assigned) will tend to have a "minimal perimeter",
and any atom that has one bond assigned will quickly have all of its
bonds assigned.  That means that strategy #2 can be used as early as
possible to detect bond assignments that can never work, long before
the entire aromatic system has been examined.  Strategy #1 means that
we can typically assign alternating single/double bonds correctly to
any particular ring on the first try.

Together, these three strategies together mean that even very large
aromatic systems are typically solved in milliseconds.



------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate 
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the 
lucky parental unit.  See the prize list and enter to win: 
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel

Reply via email to