Hi Kevin, On Thu, Mar 15, 2012 at 7:12 PM, <[email protected]> wrote: > Conclusions: > > 1. I think the modified algorithm I originally attached will generate the > same results as the original CDK 1.4.2 algorithm but much faster for > molecules containing multiple, but mostly unconnected rings (unless I have > mucked up somewhere)
Agreed. I spot no new regressions, but then again, it was not well tested so far... I added a unit test for the SMILES you reported, but need a few more test cases to make sure things work... I created a git branch: https://github.com/egonw/cdk/tree/372-142-dbst > 2. The present algorithm's aromaticity test is not sufficient to demonstrate > a molecule has all the double bonds layed out correctly I have added in that git branch a unit test, but I am only getting one double bond... Kevin, what am I doing wrong? + @Test + public void testLargeRingSystem() throws Exception { + String smiles = "Oc1c(C2CC(Cc3ccccc23)c4ccc(cc4)c5ccccc5)c(=O)oc6ccccc16"; + SmilesParser smilesParser = new SmilesParser(DefaultChemObjectBuilder.getInstance()); + IMolecule molecule = smilesParser.parseSmiles(smiles); + + DeduceBondSystemTool dbst = new DeduceBondSystemTool(new AllRingsFinder()); + molecule = dbst.fixAromaticBondOrders(molecule); + Assert.assertNotNull(molecule); + + molecule = (IMolecule) AtomContainerManipulator.removeHydrogens(molecule); + Assert.assertEquals(34, molecule.getAtomCount()); + + // we should have 14 double bonds + int doubleBondCount = 0; + for (int i = 0; i < molecule.getBondCount(); i++) { + IBond bond = molecule.getBond(i); + if (bond.getOrder() == Order.DOUBLE) doubleBondCount++; + } + Assert.assertEquals(14, doubleBondCount); + } > 3. There's almost certainly a better, faster, more complete way of doing this > using all the extra atom type and bonding info available.. Yes, and we can still do that later... things come in steps :) Just wanted to bounce my ideas with you... (Thanx for your work on this!) > 4. ..but the idea of splitting up into groups of interconnected rings may be > a useful first step anyway Yes, I think it is... and not one we can not push to cdk-1.4.x... it would make the tool better... Egon -- Dr E.L. Willighagen Postdoctoral Researcher Department of Bioinformatics - BiGCaT Maastricht University (http://www.bigcat.unimaas.nl/) Homepage: http://egonw.github.com/ LinkedIn: http://se.linkedin.com/in/egonw Blog: http://chem-bla-ics.blogspot.com/ PubList: http://www.citeulike.org/user/egonw/tag/papers ------------------------------------------------------------------------------ This SF email is sponsosred by: Try Windows Azure free for 90 days Click Here http://p.sf.net/sfu/sfd2d-msazure _______________________________________________ Cdk-user mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/cdk-user

