Bugs item #3310783, was opened at 2011-06-02 19:42
Message generated for change (Tracker Item Submitted) made by baoilleach
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=428740&aid=3310783&group_id=40728
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Noel O'Boyle (baoilleach)
Assigned to: Nobody/Anonymous (nobody)
Summary: Aromatic P not recognised in SMILES
Initial Comment:
From Andrew Dalke on list:
Perhaps I'm missing something after staring at fingerprint SMARTS definitions
for the last few days. I'm validating the MACCS substructure keys from RDKit,
which are also used in OpenBabel and CDK.
I'm writing a test suite, which will be public when done. (Actually, it is
public now, if you know where the version control repository is.)
I'm having a very difficult time generating an aromatic ring with a "P" in it
in OpenBabel.
>>> import pybel
>>> pybel.readstring("smi", "c1cccp1").write()
'C1CCCP1\t\n'
>>> pybel.readstring("smi", "c1ccccp1").write()
'C1=CC=CC=P1\t\n'
Since P is in the same group and has the same valence levels as N, I expected
the first of these to return "c1cccp1", similar to
>>> pybel.readstring("smi", "c1cccn1").write()
'c1ccc[nH]1\t\n'
Both RDKit and OEChem have no problem dealing with "c1cccp1" and interpreting it
as an aromatic ring.
I processed about 50K structures from PubChem to find a number with aromatic
"p" in them. Since PubChem doesn't have aromaticity information, what I did was
use another program to perceive the aromaticity. Below I show the RDKit SMILES
for a structure and the OpenBabel equivalent for it.
You can see that of the 53 structures where RDKit has no problems with a "p" in
an aromatic ring, 51 of them are converted into aliphatic form by OpenBabel.
Is there a chemical reason or a design reason why OpenBabel does this?
Perhaps it's something subtle about aromaticity perception (which I sadly
admit I still don't have a good grasp on).
This is with OpenBabel OBReleaseVersion() '2.3.0', which I built a couple of
days ago.
Andrew
[email protected]
Columns are
column 1: True if OpenBabel's SMILES contains an aromatic "p", otherwise False
column 2: the SMILES string from RDKit
column 3: the SMILES string from OpenBabel
(long records wrap onto two or three lines)
False 'CCc1c(CC)p(-c2ccccc2)c(-c2ccccc2)c1-c1ccccc1'
'CCC1C(CC)P(C2CCCCC2)C(C2CCCCC2)C1C1CCCCC1\t\n'
True
'[W].Cc1np(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1.[O+]#[C-].[C-]#[O+].[O+]#[C-].[C-]#[O+].[C-]#[O+]'
'[W].Cc1[nH]p(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1.[O+]#[C-].[C-]#[O+].[O+]#[C-].[C-]#[O+].[C-]#[O+]\t\n'
True 'Cc1np(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1'
'Cc1[nH]p(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1\t\n'
False
'c1ccc2c(c1)ccc1op(OC(C)CC(C)Op3oc4ccc5ccccc5c4c4c5ccccc5ccc4o3)oc3ccc4ccccc4c3c21'
'C1CCC2C(C1)CCC1OP(OC(C)CC(C)OP3OC4CCC5CCCCC5C4C4C5CCCCC5CCC4O3)OC3CCC4CCCCC4C3C21\t\n'
False 'Cc1cp(-c2ccccc2)c(Br)c1C' 'CC1CP(C2CCCCC2)C(Br)C1C\t\n'
False 'CCC(C)(C)c1c2c(pc(C(OC)=O)c1C(OC)=O)CCCCCC2'
'CCC(C)(C)C1=C2C(=PC(=C1C(=O)OC)C(=O)OC)CCCCCC2\t\n'
False
'[Zr+2].CCC(C)(C)[c-]1p2[c-](C(CC)(C)C)p12.[CH]1[CH][CH][CH][CH]1.[CH]1[CH][CH][CH][CH]1'
'[Zr+2].CCC(C)(C)[C-]1P2=P1[C-]2C(CC)(C)C.[CH]1[CH][CH][CH][CH]1.[CH]1[CH][CH][CH][CH]1\t\n'
False 'Cc1cccc2c1op(OC1COC3C(Op4oc5c(C)cccc5c5c(c(C)ccc5)o4)COC13)oc1c2cccc1C'
'CC1CCCC2C1OP(OC1COC3C(OP4OC5C(C)CCCC5C5C(C(C)CCC5)O4)COC13)OC1C2CCCC1C\t\n'
False 'c1cc2c(cc1)c(=O)o[p+](=O)o2' 'c1cc2c(cc1)C(=O)O[P+](=O)O2\t\n'
False 'c1ccc(-c2cc(-c3ccccn3)cpc2)nc1' 'c1ccc(C2=CC(=CP=C2)c2ccccn2)nc1\t\n'
False 'c1csc(-c2psc(-c3ccccc3)c2)c1' 'c1csc(C2=PSC(=C2)c2ccccc2)c1\t\n'
False 'CC(Np1oc2ccc3c(cccc3)c2c2c(o1)ccc1c2cccc1)c1ccccc1'
'CC(NP1OC2CCC3C(CCCC3)C2C2C(O1)CCC1C2CCCC1)C1CCCCC1\t\n'
False
'[Zr+2].[CH]1[CH][CH][CH][CH]1.[CH]1[CH][CH][CH][CH]1.C1C2CC3CC1CC([c-]1p4[c-](C56CC7CC(CC(C7)C5)C6)p14)(C2)C3'
'[Zr+2].[CH]1[CH][CH][CH][CH]1.[CH]1[CH][CH][CH][CH]1.C1C2CC3CC1CC([C-]1P4=P1[C-]4C14CC5CC(CC(C5)C1)C4)(C2)C3\t\n'
False 'c1ccc(P(C2C(Op3oc4ccc5c(cccc5)c4c4c(o3)ccc3c4cccc3)COC2)c2ccccc2)cc1'
'c1ccc(P(C2C(OP3OC4CCC5C(CCCC5)C4C4C(O3)CCC3C4CCCC3)COC2)C2CCCCC2)cc1\t\n'
False 'Cc1c(C)c(C)p(Cc2ccccc2Cp2c(C)c(C)c(C)c2C)c1C'
'CC1C(C)C(C)P(CC2CCCCC2CP2C(C)C(C)C(C)C2C)C1C\t\n'
False 'CCCN(C)p1oc2ccc3c(c2c2c(ccc4c2CCCC4)o1)CCCC3'
'CCCN(C)P1OC2CCC3C(C2C2C(CCC4C2CCCC4)O1)CCCC3\t\n'
False 'c1ccc2c(c1)cc(C)c1op(NN3CCCCC3)oc3c(C)cc4ccccc4c3c21'
'C1CCC2C(C1)CC(C)C1OP(NN3CCCCC3)OC3C(C)CC4CCCCC4C3C21\t\n'
False 'CCOC(=O)C=C(C)Np1oc2ccc3c(c2c2c(ccc4c2CCCC4)o1)CCCC3'
'CCOC(=O)C=C(C)NP1OC2CCC3C(C2C2C(CCC4C2CCCC4)O1)CCCC3\t\n'
False 'CCCCN(p1oc2ccc3c(c2c2c(o1)ccc1c2CCCC1)CCCC3)CCCC'
'CCCCN(P1OC2CCC3C(C2C2C(O1)CCC1C2CCCC1)CCCC3)CCCC\t\n'
False 'c1ccc2c(c1)cccc2CNp1oc2ccc3c(c2c2c(o1)ccc1c2CCCC1)CCCC3'
'c1ccc2c(c1)cccc2CNP1OC2CCC3C(C2C2C(O1)CCC1C2CCCC1)CCCC3\t\n'
False 'Cc1cc(C)c2op(N(C(C)c3ccccc3)C(C)c3ccccc3)oc3c(C)cc(C)cc3c2c1'
'CC1CC(C)C2OP(N(C(C)C3CCCCC3)C(C)C3CCCCC3)OC3C(C)CC(C)CC3C2C1\t\n'
False 'COc1cc(C)cc2c1op(N(C(C)c1ccccc1)C(C)c1ccccc1)oc1c(OC)cc(C)cc12'
'COC1CC(C)CC2C1OP(N(C(C)C1CCCCC1)C(C)C1CCCCC1)OC1C(OC)CC(C)CC21\t\n'
False
'Cc1cc(C)cc(P(CCOp2oc3c(C(C)(C)C)cc(C)c(C)c3c3c(C)c(C)cc(C(C)(C)C)c3o2)c2cc(C)cc(C)c2)c1'
'Cc1cc(C)cc(P(CCOP2OC3C(C(C)(C)C)CC(C)C(C)C3C3C(C)C(C)CC(C(C)(C)C)C3O2)C2CC(C)CC(C)C2)c1\t\n'
False
'CCN(CC)[p+]1c(P(=S)(c2ccccc2)c2ccccc2)c(-c2ccccc2)cc(-c2ccccc2)c1P(=S)(c1ccccc1)c1ccccc1'
'CCN(CC)[P+]1=C(P(=S)(c2ccccc2)c2ccccc2)C(=CC(=C1P(=S)(c1ccccc1)c1ccccc1)c1ccccc1)c1ccccc1\t\n'
False 'c1ccc(CCNp2oc3c(C)cc4ccccc4c3c3c(o2)c(C)cc2ccccc23)nc1'
'c1ccc(CCNP2OC3C(C)CC4CCCCC4C3C3C(O2)C(C)CC2CCCCC32)nc1\t\n'
False 'CN(C)p1n(S(C)(=O)=O)c2ccc3ccccc3c2c2c(ccc3ccccc23)n1S(C)(=O)=O'
'CN(C)P1N(S(=O)(=O)C)C2CCC3CCCCC3C2C2C(CCC3CCCCC23)N1S(=O)(=O)C\t\n'
False 'Cc1cc(C)c2op(N(C(C)c3ccccc3)C(C)c3ccccc3)oc3c(C)cc(C)c(C)c3c2c1C'
'CC1CC(C)C2OP(N(C(C)C3CCCCC3)C(C)C3CCCCC3)OC3C(C)CC(C)C(C)C3C2C1C\t\n'
False 'CC(=C)Cc1cccc2c1op(N(C(C)c1ccccc1)C(C)c1ccccc1)oc1c(CC(C)=C)cccc12'
'CC(=C)CC1CCCC2C1OP(N(C(C)C1CCCCC1)C(C)C1CCCCC1)OC1C(CC(=C)C)CCCC21\t\n'
False
'COC1COC(c2ccccc2)OC1C1OC(c2ccccc2)OCC1Op1oc2ccc3ccccc3c2c2c3ccccc3ccc2o1'
'COC1COC(c2ccccc2)OC1C1OC(c2ccccc2)OCC1OP1OC2CCC3CCCCC3C2C2C3CCCCC3CCC2O1\t\n'
False
'CC(N(p1n(S(C)(=O)=O)c2ccc3ccccc3c2c2c(ccc3ccccc23)n1S(C)(=O)=O)C(C)c1ccccc1)c1ccccc1'
'CC(N(P1N(S(=O)(=O)C)C2CCC3CCCCC3C2C2C(CCC3CCCCC23)N1S(=O)(=O)C)C(C)c1ccccc1)c1ccccc1\t\n'
False
'CC(C)N(C(C)C)p1n(S(c2ccc(C)cc2)(=O)=O)c2ccc3ccccc3c2c2c(ccc3ccccc23)n1S(c1ccc(C)cc1)(=O)=O'
'CC(C)N(C(C)C)P1N(S(=O)(=O)c2ccc(C)cc2)C2CCC3CCCCC3C2C2C(CCC3CCCCC23)N1S(=O)(=O)c1ccc(C)cc1\t\n'
False
'[Pd+2].[CH2][CH][CH2].FC(F)(F)S([O-])(=O)=O.c1ccc(P(COp2oc3ccc4c(cccc4)c3c3c(o2)ccc2c3cccc2)c2ccccc2)cc1'
'[Pd+2].[CH2][CH][CH2].FC(F)(F)S(=O)(=O)[O-].c1ccc(P(COP2OC3CCC4C(CCCC4)C3C3C(O2)CCC2C3CCCC2)C2CCCCC2)cc1\t\n'
False 'c1ccc(P(COp2oc3ccc4c(cccc4)c3c3c(o2)ccc2c3cccc2)c2ccccc2)cc1'
'c1ccc(P(COP2OC3CCC4C(CCCC4)C3C3C(O2)CCC2C3CCCC2)C2CCCCC2)cc1\t\n'
False 'c1ccc(C2C(Op3oc4ccccc4c4ccccc4o3)CCCC2)cc1'
'c1ccc(C2C(OP3OC4CCCCC4C4CCCCC4O3)CCCC2)cc1\t\n'
False 'CC(C)(C)Np1oc2ccc3c(c2c2c(ccc4c2CCCC4)o1)CCCC3'
'CC(C)(C)NP1OC2CCC3C(C2C2C(CCC4C2CCCC4)O1)CCCC3\t\n'
False 'COCCNp1oc2c(C)cc3ccccc3c2c2c(c(C)cc3ccccc32)o1'
'COCCNP1OC2C(C)CC3CCCCC3C2C2C(C(C)CC3CCCCC23)O1\t\n'
False
'[Li+].[W].Cc1c[p-]cc1C.[C-]#[O+].[O+]#[C-].[C-]#[O+].[O+]#[C-].[O+]#[C-]'
'[Li+].[W].CC1C[PH-]CC1C.[C-]#[O+].[O+]#[C-].[C-]#[O+].[O+]#[C-].[O+]#[C-]\t\n'
False 'COCC1N(p2oc3c(C)cc4ccccc4c3c3c(c(C)cc4ccccc43)o2)CCC1'
'COCC1N(P2OC3C(C)CC4CCCCC4C3C3C(C(C)CC4CCCCC34)O2)CCC1\t\n'
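For anyone who wants to recompute column 1 without a toolkit installed, here
is a minimal sketch. It is not the script used to build the table above, and
has_aromatic_p is a hypothetical helper name. It assumes only SMILES syntax:
outside brackets a lowercase "p" can only be aromatic phosphorus, and inside
brackets the element symbol follows an optional isotope number.

```python
def has_aromatic_p(smiles):
    """Return True if `smiles` contains an aromatic phosphorus atom.

    Outside brackets, a lowercase "p" can only be aromatic phosphorus.
    Inside brackets, the element symbol follows an optional isotope
    number, so "[p+]" counts but "[Pd+2]" and "[Np]" do not.
    """
    i = 0
    while i < len(smiles):
        if smiles[i] == "[":
            end = smiles.find("]", i)
            if end == -1:
                raise ValueError("unclosed bracket atom in %r" % smiles)
            # Drop any leading isotope digits to reach the element symbol.
            symbol = smiles[i + 1:end].lstrip("0123456789")
            if symbol.startswith("p"):
                return True
            i = end + 1
        elif smiles[i] == "p":
            return True
        else:
            i += 1
    return False


# Two records from the table: (RDKit SMILES, OpenBabel SMILES).
records = [
    ("Cc1cp(-c2ccccc2)c(Br)c1C", "CC1CP(C2CCCCC2)C(Br)C1C\t\n"),
    ("Cc1np(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1",
     "Cc1[nH]p(C([Si](C)(C)C)[Si](C)(C)C)nc1N1CCCCC1\t\n"),
]
flags = [has_aromatic_p(ob) for _rdkit, ob in records]
print(flags)  # [False, True]
```

A plain substring test for "p" happens to work on the rows shown here, but
the bracket handling guards against symbols such as "[Np]" in other data.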
----------------------------------------------------------------------
_______________________________________________
OpenBabel-Devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openbabel-devel