Using the latest code from this morning (-r4157), I ran four processes all day 
long for a total of about 280,000 SMILES.  I found 12 more interesting cases 
where canonicalization failed.

Let me say again that this is very impressive, a huge reduction in problem 
cases!

These molecules seem to fall into new categories of problems, not necessarily 
in the canonicalizer itself. Some look like problems with aromaticity, bond 
order, or H count.

These were all discovered using my "shuffle" script, which generates 20 random 
SMILES from each input SMILES using "babel -i smi -o smi -xC", then runs them 
through "babel -i smi -o can | sort -u". A molecule is reported when more than 
one distinct canonical SMILES comes out.
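
For anyone who wants to reproduce the test, here is a rough Python sketch of the
idea. It is not the actual shuffle script. The function names are made up for
illustration, and it assumes that babel reads stdin and writes stdout when no
files are given (as in the pipelines above) and that each "-xC" call returns a
new random ordering.

```python
import subprocess
from typing import Callable, List, Sequence


def first_smiles_token(babel_output: str) -> str:
    """Return the SMILES column from the first non-empty output line."""
    for line in babel_output.splitlines():
        fields = line.split()
        if fields:
            return fields[0]
    return ""


def run_babel(smiles: str, out_format: str, extra: Sequence[str] = ()) -> str:
    """Pipe one SMILES through babel (assumes stdin/stdout when no files given)."""
    cmd = ["babel", "-i", "smi", "-o", out_format, *extra]
    done = subprocess.run(cmd, input=smiles + "\n", capture_output=True,
                          text=True, check=True)
    return first_smiles_token(done.stdout)


def babel_shuffle(smiles: str) -> str:
    # Same flags as quoted above: random ("anti-canonical") atom order.
    return run_babel(smiles, "smi", ["-xC"])


def babel_canonical(smiles: str) -> str:
    return run_babel(smiles, "can")


def distinct_canonical_forms(smiles: str,
                             shuffle: Callable[[str], str],
                             canonicalize: Callable[[str], str],
                             n_shuffles: int = 20) -> List[str]:
    """Shuffle n times, canonicalize each result, return the sorted unique set.

    This mirrors "... | sort -u": more than one entry means the
    canonicalizer did not agree with itself for this molecule.
    """
    shuffled = [shuffle(smiles) for _ in range(n_shuffles)]
    return sorted({canonicalize(s) for s in shuffled})
```

Calling distinct_canonical_forms(smi, babel_shuffle, babel_canonical) on each
input and reporting those where the result has more than one entry gives output
in the same shape as the lists below. Passing the shuffle and canonicalize steps
in as functions makes it easy to swap in a different toolkit or a stub.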

Craig


http://www.emolecules.com/image?db=549&id=4842449&width=500&height=500
c12c(C(=O)C(=C(C1=O)s...@h]1c([...@h]([C@@H](CO1)OC(=O)C)OC(=O)C)OC(=O)C)s...@h]1[c@@H]([...@h]([C@@H](CO1)OC(=O)C)OC(=O)C)OC(=O)C)cccc2    4842449
c12c(C(=O)C(=C(C1=O)s...@h]1[c@@H]([...@h]([C@@H](CO1)OC(=O)C)OC(=O)C)OC(=O)C)s...@h]1c([...@h]([C@@H](CO1)OC(=O)C)OC(=O)C)OC(=O)C)cccc2    4842449

http://www.emolecules.com/image?db=549&id=4782286&width=500&height=500
[...@]12(CC(CN(C1)Cc1ccccc1)(CNC2)C)C   4782286
C12(c...@](CN(C1)Cc1ccccc1)(CNC2)C)C    4782286

http://www.emolecules.com/image?db=549&id=4785090&width=500&height=500
n12c(c3c(n4c(c5c1CCCC5)nc1c(c4=O)cccc1)cccc3)nc1c(c2=O)cccc1    4785090
n12c(=O)c3c(nc1c1c(n4c(c5c2cccc5)nc2c(c4=O)cccc2)cccc1)cccc3    4785090
n12c(=O)c3c(nc1c1c(n4c(c5c2cccc5)nc2c(c4=O)CCCC2)cccc1)cccc3    4785090

http://www.emolecules.com/image?db=549&id=5860502&width=500&height=500
c12=c3c(c4c(c5c(c1csc2)csc5)csc4)csc3   5860502
C12C(C3C(C4C(C5C1CSC5)CSC4)CSC3)CSC2    5860502

http://www.emolecules.com/image?db=549&id=5860663&width=500&height=500
c12c3c4c5c(ccc4c4c(c3ccc1cccc2)nc1c(n4)cc2c(c1)cccc2)cccc5      5860663
c12c3c4c(ccc3c3c(c2ccc2c1cccc2)nc1c(n3)cc2c(c1)CCCC2)cccc4      5860663
c12c3c(ccc2c2c(c4c1c1c(cc4)CCCC1)nc1c(n2)cc2c(c1)cccc2)cccc3    5860663

http://www.emolecules.com/image?db=549&id=5860665&width=500&height=500
c12c3c4c5c(ccc4c4c(c3ccc1cccc2)nc1c(n4)c2c(c3c1CCc1c3cccc1)c1c(CC2)cccc1)cccc5    5860665
c12c3c4c(c5c(c3ccc1cccc2)nc1c(n5)c2c(c3c1ccc1c3cccc1)c1c(cc2)CCCC1)ccc1c4CCCC1    5860665
c12c3c4c(ccc3c3c(c2ccc2c1cccc2)nc1c(n3)c2c(c3c1ccc1c3cccc1)c1c(cc2)cccc1)cccc4    5860665
c12c3c4c(ccc3c3c(c2ccc2c1cccc2)nc1c(n3)c2c(c3c1ccc1c3CCCC1)c1c(cc2)cccc1)cccc4    5860665
c12c3c4c(ccc3c3c(c2ccc2c1cccc2)nc1c(n3)c2c(c3c1ccc1c3CCCC1)c1c(cc2)CCCC1)cccc4    5860665
c12c3c4c(ccc3c3c(c2ccc2c1cccc2)nc1c(n3)c2c(c3c1CCc1c3cccc1)c1c(cc2)cccc1)cccc4    5860665
c12c3c(c4c(c5c3c3c(CC5)cccc3)nc3c(n4)c4c(c5c3ccc3c5cccc3)c3c(CC4)cccc3)ccc1cccc5860665
c12c3c(c4c(c5c3c3c(CC5)cccc3)nc3c(n4)c4c(c5c3CCc3c5cccc3)c3c(cc4)cccc3)ccc1cccc5860665
c12c3c(ccc1c1c(c4c2c2c(CC4)cccc2)nc2c(n1)c1c(c4c2ccc2c4CCCC2)c2c(cc1)CCCC2)cccc5860665
c12c3c(ccc2c2c(c4c1c1c(cc4)CCCC1)nc1c(n2)c2c(c4c1CCc1c4cccc1)c1c(cc2)cccc1)cccc5860665

http://www.emolecules.com/image?db=549&id=6137697&width=500&height=500
c12=c(nn2)ssnc1S        6137697
c12c(nn2)ssnc1S 6137697

http://www.emolecules.com/image?db=549&id=5863122&width=500&height=500
C1([N+](=O)[O-])C2C[C@@h]3...@h]1c[c@H](C2)C3   5863122
C1([N+](=O)[O-])[C@@H]2C[C@@h]3cc1...@h](C2)C3  5863122

http://www.emolecules.com/image?db=549&id=5865030&width=500&height=500
c12c3c4c5c6c7c8c(ccc7ccc6ccc5ccc4ccc3ccc1cccc2)cccc8    5865030
c12c3c(ccc2ccc2c1c1c4c5c6c(ccc5ccc4ccc1cc2)CCCC6)cccc3  5865030

http://www.emolecules.com/image?db=549&id=5865292&width=500&height=500
c12c3c4c5c6c7c8c9c(ccc8ccc7ccc6ccc5ccc4ccc3ccc1cccc2)cccc9      5865292
c12c3c4c5c6c7c8c9c(CCc8ccc7ccc6ccc5ccc4ccc3ccc1cccc2)cccc9      5865292
c12c3c4c(ccc3ccc2ccc2c1c1c3c5c6c(ccc5ccc3ccc1cc2)CCCC6)CCCC4    5865292
c12c3c(ccc2ccc2c1c1c4c5c6c7c(ccc6ccc5ccc4ccc1cc2)CCCC7)cccc3    5865292

http://www.emolecules.com/image?db=549&id=5865338&width=500&height=500
C(C(N(C)C)C)(c1ccccc1)o...@h]([C@@H](N(C)C)C)(c1ccccc1)O        5865338
[...@h]([C@@H](N(C)C)C)(c1ccccc1)O.C(C(N(C)C)C)(c1ccccc1)O      5865338

http://www.emolecules.com/image?db=549&id=5865516&width=500&height=500
c12c3c4c5c6c(c7c8c9c%10c(CCc9cc(c8ccc7cc6)Br)cccc%10)ccc5ccc4c(cc3ccc1cccc2)Br    5865516
c12c3c4c5c(c6c7c8c9c(CCc8cc(c7ccc6cc5)Br)cccc9)ccc4ccc3c(cc2ccc2c1CCCC2)Br    5865516
c12c3c(ccc2cc(c2c1c1c4c(c5c6C7c8c(CCC7CC(c6ccc5cc4)Br)cccc8)ccc1cc2)Br)cccc3    5865516

