Hi,
Here are the results from the shuffle (10x) test for the 5 million
compounds in the eMolecules database. In general the results are good:
only 33 canonicalization errors remain, and these should be easy to
fix.
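
For readers unfamiliar with this kind of test, here is a minimal
sketch of the idea, not the actual test program: renumber the atoms of
a molecule at random several times and check that the canonicalizer
always returns the same result. The molecule representation (a list of
atom labels plus a list of index pairs), the function names and both
toy canonicalizers are assumptions made for illustration only.

```python
import random

def shuffle_atoms(atoms, bonds, rng):
    """Return the same molecule with its atoms renumbered in random order."""
    order = list(range(len(atoms)))
    rng.shuffle(order)                       # order[new_index] = old_index
    new_index = {old: new for new, old in enumerate(order)}
    new_atoms = [atoms[old] for old in order]
    new_bonds = [(new_index[a], new_index[b]) for a, b in bonds]
    rng.shuffle(new_bonds)                   # bond order should not matter either
    return new_atoms, new_bonds

def shuffle_test(atoms, bonds, canonicalize, rounds=10, seed=0):
    """Canonicalize `rounds` shuffled copies; return the distinct results."""
    rng = random.Random(seed)
    results = {canonicalize(atoms, bonds)}
    for _ in range(rounds):
        results.add(canonicalize(*shuffle_atoms(atoms, bonds, rng)))
    return results                           # more than one entry = an error

# Toy stand-ins for a canonicalizer (hypothetical, not canonical SMILES).
def order_independent(atoms, bonds):
    # Sorted multiset of bonds, each described by its two atom labels.
    return tuple(sorted(tuple(sorted((atoms[a], atoms[b]))) for a, b in bonds))

def order_dependent(atoms, bonds):
    # Deliberately broken: the result depends on the input atom order.
    return tuple(atoms)
```

A molecule passes when shuffle_test() returns a set with exactly one
element; every molecule listed under "Canonicalization errors" below
produced more than one distinct output.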
Process stops: 3429680, 3429701, 3429702, 3429717, 3429742, 3429767,
3429887, ... (these are 1-based line numbers in the
eMolecules-2010-03-01.smi file)
3429680: [Li+251] 24639246
3429701: [ClH+276] 24639289
3429702: CCCC[n+251]1cccc(C)c1 24639291
...
I continued testing from 3500000. Any ideas on how to handle this?
Segfaults: 1278211, 1278212
S=C1NCCCCCCNC(=S)S[Fe]2SC(=S)NCCCCCCNC(=S)S[Ni]SC(=S)NCCCCCCNC(=S)S[Fe](SC(=S)NCCCCCCNC(=S)S[Ni]S1)SC(=S)NCCCCCCNC(=S)S[Ni]SC(=S)NCCCCCCNC(=S)S2
4315482
S=C1NCCCCCCNC(=S)S[Cr]2SC(=S)NCCCCCCNC(=S)S[Ni]SC(=S)NCCCCCCNC(=S)S[Cr](SC(=S)NCCCCCCNC(=S)S[Ni]S1)SC(=S)NCCCCCCNC(=S)S[Ni]SC(=S)NCCCCCCNC(=S)S2
4315484
I think these have large rings that are not found. We should still be
able to detect ring membership correctly, though, since that is done
with a spanning tree before the SSSR/LSSR analysis. I'll take a look at
this.
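
As an illustration of why ring size should not matter for ring
membership, here is a small sketch of the spanning-tree approach. It
is not the Open Babel code: the function name and the graph
representation (atom count plus a list of index pairs) are assumptions.
Every bond that is not part of the depth-first spanning tree closes a
ring, and all atoms on the tree path between its two ends are ring
atoms.

```python
from collections import defaultdict

def ring_atoms(num_atoms, bonds):
    """Return the set of atoms that lie on at least one cycle."""
    neighbors = defaultdict(list)
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    parent = {}      # atom -> parent in the spanning tree (root -> None)
    depth = {}       # atom -> distance from its root in the tree
    in_ring = set()

    for root in range(num_atoms):
        if root in parent:
            continue
        parent[root] = None
        depth[root] = 0
        # Iterative DFS, so very large rings cannot hit a recursion limit.
        stack = [(root, iter(neighbors[root]))]
        while stack:
            atom, remaining = stack[-1]
            for nbr in remaining:
                if nbr not in parent:                    # tree edge
                    parent[nbr] = atom
                    depth[nbr] = depth[atom] + 1
                    stack.append((nbr, iter(neighbors[nbr])))
                    break
                if nbr != parent[atom] and depth[nbr] < depth[atom]:
                    # Ring-closure bond to an ancestor: mark the tree path.
                    walk = atom
                    while walk != nbr:
                        in_ring.add(walk)
                        walk = parent[walk]
                    in_ring.add(nbr)
            else:
                stack.pop()                              # all neighbors done
    return in_ring
```

For a single 40-membered macrocycle this marks all 40 atoms, without
any ring perception step and regardless of any ring size limit that a
later SSSR/LSSR search might apply.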
Canonicalization errors: 33
All errors are the same problem AFAIK. The canonical code does not
consider the H atoms that are added when writing out the SMILES. I can
add this to the canonical code, but I'll probably copy some code for
this from the SMILES format.
Cc1cccc(c1)C(=O)Nc1nnc[nH]1.Cc1cccc(c1)C(=O)Nc1n[nH]cn1 8622926
Cc1cccc(c1)C(=O)Nc1n[nH]cn1.Cc1cccc(c1)C(=O)Nc1nnc[nH]1 8622926
This is not an aromaticity error: the two fragments have identical
canonical codes, since the code makes no distinction between n and [nH].
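
A toy model of this tie, assuming each fragment is reduced to a
sequence of (element, hydrogen count) pairs: the names and the
representation are hypothetical and far simpler than the real
canonical code. The two lists are the ring atoms of the triazole
tautomers above, read from the substituted carbon. A code that drops
the hydrogen counts cannot tell them apart, so the output order follows
the input order.

```python
# Ring atoms of ...c1nnc[nH]1 and ...c1n[nH]cn1 as (element, H count).
TAUTOMER_A = [("c", 0), ("n", 0), ("n", 0), ("c", 1), ("n", 1)]
TAUTOMER_B = [("c", 0), ("n", 0), ("n", 1), ("c", 1), ("n", 0)]

def code_without_h(fragment):
    """Comparison code that ignores hydrogens: n and [nH] look the same."""
    return tuple(element for element, _ in fragment)

def code_with_h(fragment):
    """Comparison code that also includes the hydrogen count per atom."""
    return tuple(fragment)

def order_fragments(fragments, code):
    # sorted() is stable: fragments whose codes tie keep their input order,
    # which is how a shuffled input leaks into the output.
    return sorted(fragments, key=code)

# Without H counts the two tautomers tie, so the input order decides.
tie_ab = order_fragments([TAUTOMER_A, TAUTOMER_B], code_without_h)
tie_ba = order_fragments([TAUTOMER_B, TAUTOMER_A], code_without_h)

# With H counts the order is the same for both input orders.
fixed_ab = order_fragments([TAUTOMER_A, TAUTOMER_B], code_with_h)
fixed_ba = order_fragments([TAUTOMER_B, TAUTOMER_A], code_with_h)
```

Here tie_ab and tie_ba differ while fixed_ab and fixed_ba are
identical, which matches the pattern in the pairs listed below.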
CC1=CC(=O)c2c(C1=O)c(O)ccc2O.CCC=C(C)C.OC.[CH].C 19231703
CC1=CC(=O)c2c(C1=O)c(O)ccc2O.CCC=C(C)C.OC.C.[CH] 19231703
C[CH] 23745856
[CH]C 23745856
C[CH2] 23745858
[CH2]C 23745858
O[O] 23903986
[O]O 23903986
C1CC[CH][CH]CCC1.C1CCCCC[CH][CH]1.C1[CH][CH]CCCCC1.C1CC[CH][CH]CCC1.[Ir]Cl.[Ir]Cl
23904497
[CH]1[CH]CCCCCC1.C1C[CH][CH]CCCC1.[CH]1CCCCCC[CH]1.[CH]1[CH]CCCCCC1.[Ir]Cl.[Ir]Cl
23904497
[CH]1[CH]CCCCCC1.C1C[CH][CH]CCCC1.C1CCC[CH][CH]CC1.[CH]1[CH]CCCCCC1.[Ir]Cl.[Ir]Cl
23904497
[CH]1CCCCCC[CH]1.C1CCC[CH][CH]CC1.[CH]1CCCCCC[CH]1.C1CCCC[CH][CH]C1.[Ir]Cl.[Ir]Cl
23904497
C[C]([CH2])[CH2].[CH2][C]([CH2])C.[Pd]Cl.[Pd]Cl 23906874
C[C]([CH2])[CH2].C[C]([CH2])[CH2].[Pd]Cl.[Pd]Cl 23906874
[CH2][C]([CH2])C.C[C]([CH2])[CH2].[Pd]Cl.[Pd]Cl 23906874
C[C]([CH2])[CH2].[CH2][C]([CH2])C.[Pd]Cl.[Pd]Cl 23906874
[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.ClCCl.[Rh]
24631596
[CH]1[CH]CC[CH][CH]CC1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.ClCCl.[Rh]
24631596
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.ClCCl.[Rh]
24631596
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.c1ccc(cc1)P(c1ccccc1)c1ccccc1.ClCCl.[Rh]
24631596
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC 26965008
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC 26965008
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC 26965008
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC 26965008
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)COc2ccccc2)cc(c1OC)OC 26965008
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC 26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC 26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC 26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC 26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC 26965122
[CH]Oc1cc(/C=C/2\C(=O)N=c3n(C2=N)c(cs3)c2ccccc2)cc(c1OC)OC 26965122
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)C(C)C)cc(c1OC)OC 26965176
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC 26965734
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC 26965734
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC 26965734
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC 26965734
[CH]Oc1cc(/C=C\2/C(=O)N=c3n(C2=N)nc(s3)c2ccccc2C)cc(c1OC)OC 26965734
*.FC1(F)Oc2c(O1)cc(c(c2)[N])N 27518948
*.FC1(F)Oc2c(O1)cc(c(c2)N)[N] 27518948
*.FC1(F)Oc2c(O1)cc(c(c2)N)[N] 27518948
CN1CCCC1c1cccnc1.OOOOOO.[CH2]C#CC 27522714
CN1CCCC1c1cccnc1.OOOOOO.CC#C[CH2] 27522714
CCCCC[CH] 29331055
[CH]CCCCC 29331055
CO[C]1[CH]C[C]([CH][CH]1)C.[C-]#[OH2+].[C-]#[OH2+].[C-]#[OH2+].[Fe] 29370482
CO[C]1[CH]C[C]([CH][CH]1)C.[C-]#[OH2+].[C-]#[OH2+].[C-]#[OH2+].[Fe] 29370482
CO[C]1[CH]C[C]([CH][CH]1)C.[C-]#[OH2+].[C-]#[OH2+].[C-]#[OH2+].[Fe] 29370482
CO[C]1[CH]C[C]([CH][CH]1)C.[C-]#[OH2+].[C-]#[OH2+].[C-]#[OH2+].[Fe] 29370482
[CH]1C[CH][CH][CH][CH][CH]1.[OH2+]#[C-].[OH2+]#[C-].[OH2+]#[C-].[Cr] 29371034
[CH]1[CH]C[CH][CH][CH][CH]1.[OH2+]#[C-].[OH2+]#[C-].[OH2+]#[C-].[Cr] 29371034
C1[CH][CH][CH][CH][CH][CH]1.[OH2+]#[C-].[OH2+]#[C-].[OH2+]#[C-].[Cr] 29371034
[CH]1[CH][CH][CH]C[CH][CH]1.[OH2+]#[C-].[OH2+]#[C-].[OH2+]#[C-].[Cr] 29371034
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br 29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br 29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br 29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br 29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br 29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br 29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br 29450609
O=C(Nc1nc[nH]n1)COc1ccc(c(c1)C)Br.O=C(Nc1[nH]cnn1)COc1ccc(c(c1)C)Br 29450609
[CH]1[CH]CC[CH][CH]CC1.C[C@@h]1c...@h](p1c1ccccc1p...@h](C)c...@h]1c)C.[Rh]
29491188
C1C[CH][CH]CC[CH][CH]1.C[C@@h]1c...@h](p1c1ccccc1p...@h](C)c...@h]1c)C.[Rh]
29491188
C1[CH][CH]CC[CH][CH]C1.C[C@@h]1c...@h](p1c1ccccc1p...@h](C)c...@h]1c)C.[Rh]
29491188
C1[CH][CH]CC[CH][CH]C1.C[C@@h]1c...@h](p1c1ccccc1p...@h](C)c...@h]1c)C.[Rh]
29491188
C1[CH][CH]CC[CH][CH]C1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
29491195
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
29491195
[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
29491195
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
29491195
C1[CH][CH]CC[CH][CH]C1.C[C@@h]1c...@h](P1C1=C(C(=O)OC1=O)p...@h](C)c...@h]1c)C.[Rh]
29491197
[CH]1CC[CH][CH]CC[CH]1.C[C@@h]1c...@h](P1C1=C(C(=O)OC1=O)p...@h](C)c...@h]1c)C.[Rh]
29491197
C1[CH][CH]CC[CH][CH]C1.C[C@@h]1c...@h](P1C1=C(C(=O)OC1=O)p...@h](C)c...@h]1c)C.[Rh]
29491197
[CH]1[CH]CC[CH][CH]CC1.C[C@@h]1c...@h](P1C1=C(C(=O)OC1=O)p...@h](C)c...@h]1c)C.[Rh]
29491197
[CH]CCCCCCCCCCCCCCC 29536355
CCCCCCCCCCCCCCC[CH] 29536355
C1CCC[CH]1 29538372
[CH]1CCCC1 29538372
[CH]=C 29538463
C=[CH] 29538463
C1[CH]CCCC1 29538482
C1CCC[CH]C1 29538482
[CH]1CCCCC1 29538482
C1C[CH]CCC1 29538482
C/C(=C(/[CH2])\C)/[CH2] 29550750
C/C(=C(/[CH2])\C)/[CH2] 29550750
C/C(=C(/[CH2])\C)/[CH2] 29550750
C/C(=C(/[CH2])\C)/[CH2] 29550750
*.CCN(CCOC(=O)C1(CCCCC1)C1CCCCC1)[C]C 29934806
*.CCN(CCOC(=O)C1(CCCCC1)C1CCCCC1)[C]C 29934806
*.CCN(CCOC(=O)C1(CCCCC1)C1CCCCC1)[C]C 29934806
*.CCN(CCOC(=O)C1(CCCCC1)C1CCCCC1)[C]C 29934806
*.CCN(N(N=O)O)CC.CCC.CC[CH2] 29934822
*.CCN(N(N=O)O)CC.CC[CH2].CCC 29934822
*.CCN(N(N=O)O)CC.CCC.[CH2]CC 29934822
*.CCN(N(N=O)O)CC.CCC.CC[CH2] 29934822
[CH2][C]([CH][C]([CH2])C)C.C[C]([CH][C]([CH2])C)[CH2].[Ru] 30155022
C[C]([CH][C]([CH2])C)[CH2].[CH2][C]([CH][C]([CH2])C)C.[Ru] 30155022
C[C]([CH]CC[CH][C](C)[CH2])[CH2].[CH2][C]([CH]CC[CH][C](C)[CH2])C.Cl[Ru]Cl.Cl[Ru]Cl
30155024
[CH2][C]([CH]CC[CH][C](C)[CH2])C.C[C]([CH]CC[CH][C](C)[CH2])[CH2].Cl[Ru]Cl.Cl[Ru]Cl
30155024
O=CNc1c(C)cccc1C.CCCCN1[CH]CCCC1.CC 30155687
O=CNc1c(C)cccc1C.CCCCN1CCCC[CH]1.CC 30155687
[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
30177469
[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
30177469
C1C[CH][CH]CC[CH][CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
30177469
[CH]1CC[CH][CH]CC[CH]1.c1ccc(cc1)cn1...@h]([C@@H](C1)P(c1ccccc1)c1ccccc1)P(c1ccccc1)c1ccccc1.[Rh]
30177469
C1[CH][CH]CC[CH][CH]C1.Cl[Ru]Cl 30424431
[CH]1CC[CH][CH]CC[CH]1.Cl[Ru]Cl 30424431
C1[CH][CH]CC[CH][CH]C1.Cl[Ru]Cl 30424431
[CH]1[CH]CC[CH][CH]CC1.Cl[Ru]Cl 30424431
Tim