On 08/16/2017 03:36 PM, Greg Landrum wrote:
Hi Shuai,

The RDKit Mol2 parser is really only validated for the atom types generated by Corina. I'm not surprised that the output from Open Babel is not understood. This is documented:
http://rdkit.org/docs/api/rdkit.Chem.rdmolfiles-module.html#MolFromMol2File

It would be really nice if Open Babel's MOL2 output could be read
directly by the RDKit.

I often find myself running
$ obabel in.mol2 -O out.sdf
just for that purpose.

An aside: If you have an SDF file you can read that directly into the RDKit. It seems like you shouldn't need the openbabel translation step at all.
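
A minimal sketch of reading an SDF file directly, as suggested above. The helper name load_sdf and its remove_hs parameter are made up for illustration; Chem.SDMolSupplier is the RDKit reader for SDF files, and it yields None for any record it cannot parse or sanitize.

```python
from rdkit import Chem


def load_sdf(path, remove_hs=True):
    """Return the molecules from an SDF file that RDKit could parse."""
    supplier = Chem.SDMolSupplier(path, removeHs=remove_hs)
    # Records that fail to parse or sanitize come back as None, so skip them.
    return [mol for mol in supplier if mol is not None]
```

Something like load_sdf("out.sdf") would then replace the Mol2 reading step, with "out.sdf" standing in for whatever SDF file the Mol2 file was generated from.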

-greg


On Wed, Aug 16, 2017 at 12:13 AM, David Liu <sdhsm1...@gmail.com> wrote:

    Dear all,

    I am having trouble kekulizing a molecule with RDKit. Here is an example.

    The example.mol2 file looks like this:

    @<TRIPOS>MOLECULE
    example
    46 49 0 0 0
    SMALL
    GASTEIGER

    @<TRIPOS>ATOM
    1 C -4.5556 -0.2844 1.1718 C.3 1 LIG1 -0.0109
    2 C -6.0291 -0.7271 1.2334 C.3 1 LIG1 0.0493
    3 C -6.4413 -0.5958 -1.0493 C.3 1 LIG1 0.0493
    4 C -5.1977 0.3130 -1.1927 C.3 1 LIG1 -0.0109
    5 C 5.5992 -2.5640 -0.8780 C.ar 1 LIG1 -0.0253
    6 O -6.3822 -1.4588 0.0764 O.3 1 LIG1 -0.3796
    7 C 2.8943 1.6722 0.9911 C.ar 1 LIG1 0.2664
    8 C 5.1745 -2.0407 0.3480 C.ar 1 LIG1 0.1371
    9 C -1.6179 0.4017 0.1577 C.ar 1 LIG1 0.2173
    10 C -4.0573 -0.1702 -0.2838 C.3 1 LIG1 0.0275
    11 C 0.8767 -0.2307 1.1489 C.ar 1 LIG1 0.0370
    12 C 2.1438 -0.5325 1.6439 C.ar 1 LIG1 -0.0306
    13 C 6.1958 -1.7294 -1.8279 C.ar 1 LIG1 -0.0590
    14 C 6.3717 -0.3702 -1.5525 C.ar 1 LIG1 -0.0605
    15 C 5.9487 0.1564 -0.3282 C.ar 1 LIG1 -0.0452
    16 C 0.6358 1.0320 0.5744 C.ar 1 LIG1 0.1483
    17 C -0.1716 -1.1537 1.2042 C.ar 1 LIG1 0.0418
    18 C 3.1618 0.4153 1.5592 C.ar 1 LIG1 0.0780
    19 C 5.3424 -0.6749 0.6231 C.ar 1 LIG1 0.0480
    20 C 1.3530 3.2786 -0.1013 C.3 1 LIG1 0.0167
    21 F 4.6032 -2.8623 1.2640 F 1 LIG1 -0.2043
    22 S 4.7969 0.0115 2.1898 S.3 1 LIG1 -0.0812
    23 N -1.3906 -0.8211 0.7091 N.ar 1 LIG1 -0.2222
    24 O 3.8206 2.5277 0.9363 O.2 1 LIG1 -0.2664
    25 N 1.6412 1.9659 0.5033 N.ar 1 LIG1 -0.2949
    26 N -0.6088 1.3106 0.0937 N.ar 1 LIG1 -0.1964
    27 N -2.9091 0.7394 -0.3655 N.pl3 1 LIG1 -0.3104
    28 H -3.9262 -1.0225 1.7144 H 1 LIG1 0.0305
    29 H -4.4544 0.6942 1.6907 H 1 LIG1 0.0305
    30 H -6.1785 -1.3738 2.1237 H 1 LIG1 0.0560
    31 H -6.6965 0.1565 1.3647 H 1 LIG1 0.0560
    32 H -7.3658 0.0220 -1.0063 H 1 LIG1 0.0560
    33 H -6.5227 -1.2302 -1.9574 H 1 LIG1 0.0560
    34 H -4.8575 0.3261 -2.2513 H 1 LIG1 0.0305
    35 H -5.4753 1.3532 -0.9112 H 1 LIG1 0.0305
    36 H 5.4676 -3.6168 -1.0922 H 1 LIG1 0.0646
    37 H -3.7461 -1.1771 -0.6436 H 1 LIG1 0.0500
    38 H 2.3428 -1.4998 2.0895 H 1 LIG1 0.0638
    39 H 6.5237 -2.1362 -2.7758 H 1 LIG1 0.0618
    40 H 6.8363 0.2748 -2.2870 H 1 LIG1 0.0618
    41 H 6.0904 1.2094 -0.1219 H 1 LIG1 0.0630
    42 H -0.0243 -2.1352 1.6372 H 1 LIG1 0.0838
    43 H 2.2342 3.9528 -0.1073 H 1 LIG1 0.0457
    44 H 0.5450 3.7853 0.4685 H 1 LIG1 0.0457
    45 H 1.0258 3.1432 -1.1544 H 1 LIG1 0.0457
    46 H -3.0166 1.6655 -0.8392 H 1 LIG1 0.1492
    @<TRIPOS>BOND
    1 1 2 1
    2 1 10 1
    3 2 6 1
    4 3 4 1
    5 3 6 1
    6 4 10 1
    7 5 8 ar
    8 5 13 ar
    9 7 18 ar
    10 7 24 2
    11 7 25 ar
    12 8 19 ar
    13 8 21 1
    14 9 23 ar
    15 9 26 ar
    16 9 27 1
    17 10 27 1
    18 11 12 ar
    19 11 16 ar
    20 11 17 ar
    21 12 18 ar
    22 13 14 ar
    23 14 15 ar
    24 15 19 ar
    25 16 25 ar
    26 16 26 ar
    27 17 23 ar
    28 18 22 1
    29 19 22 1
    30 20 25 1
    31 1 28 1
    32 1 29 1
    33 2 30 1
    34 2 31 1
    35 3 32 1
    36 3 33 1
    37 4 34 1
    38 4 35 1
    39 5 36 1
    40 10 37 1
    41 12 38 1
    42 13 39 1
    43 14 40 1
    44 15 41 1
    45 17 42 1
    46 20 43 1
    47 20 44 1
    48 20 45 1
    49 27 46 1

    And the example.py code looks like this:

    from rdkit.Chem import AllChem
    from rdkit import Chem

    rdkit_mol = Chem.MolFromMol2File("example.mol2", sanitize=False, removeHs=False)
    mol = AllChem.RemoveHs(rdkit_mol)

    Running example.py raises the following error:

    ValueError: Sanitization error: Can't kekulize mol. Unkekulized atoms: 8 10 11 15 16 17 22 24 25
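
To see which atoms the error refers to, the reported indices can be looked up in the Mol2 atom records. The sketch below is plain Python and does not use RDKit. The function name unkekulized_atom_records is made up for illustration, and the lookup assumes that the indices in the message are 0-based and follow the file's atom order, while Mol2 atom ids are 1-based.

```python
import re


def unkekulized_atom_records(error_message, mol2_text):
    """Return (atom_id, atom_name, atom_type) for each index in the error.

    Assumption: indices in the message are 0-based and follow file order.
    """
    # Pull the integers listed after "Unkekulized atoms:" (may be line-wrapped).
    match = re.search(r"Unkekulized\s+atoms:\s*([\d\s]+)", error_message)
    if match is None:
        return []
    indices = [int(token) for token in match.group(1).split()]

    # Collect the atom records from the ATOM section of the Mol2 text.
    atoms = []
    in_atoms = False
    for raw in mol2_text.splitlines():
        line = raw.strip()
        if line.startswith("@"):
            # A section header; only the ATOM section is of interest here.
            in_atoms = line.upper().endswith("ATOM")
            continue
        if in_atoms and line:
            fields = line.split()
            # Columns: id, name, x, y, z, type, ...
            atoms.append((int(fields[0]), fields[1], fields[5]))

    return [atoms[i] for i in indices if i < len(atoms)]
```

Under that assumption, the indices in the error above correspond to Mol2 atoms 9, 11, 12, 16, 17, 18, 23, 25 and 26 in example.mol2, all of which are typed C.ar or N.ar.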

    It seems RDKit cannot interpret the molecule when it tries to remove
    the hydrogens, which is probably related to the format of the mol2
    file I used. I generated the mol2 file from an SDF file with
    Open Babel. Is there a plan to support mol2 files like this one, or
    do I need to process the mol2 file further? I would appreciate any
    advice!


    Thanks,

    Shuai


    
------------------------------------------------------------------------------
    Check out the vibrant tech community on one of the world's most
    engaging tech sites, Slashdot.org! http://sdm.link/slashdot
    _______________________________________________
    Rdkit-discuss mailing list
    Rdkit-discuss@lists.sourceforge.net
    <mailto:Rdkit-discuss@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
    <https://lists.sourceforge.net/lists/listinfo/rdkit-discuss>



