Hi all,


I am trying to run some substructure queries using the RDKit PostgreSQL
cartridge. My substructure query inputs are CTABs (not SMARTS), so I would
like to use qmol_from_ctab. However, I am having trouble making valid query
molecules from a few of the CTABs.



In this example, I try to use a CTAB to build a query that searches for aryl
boronate acids/esters. I can make an equivalent query using SMARTS, but the
CTAB is not valid.


As far as I am aware, the SQL functions give no warning message, so I ran
MolFromMolBlock from Python and got "non-ring atom 0 marked aromatic". If I
change the aromatic bond type to a double bond, the CTAB can be read in, but
that is not the query I want. I am guessing that there may be additional
validity checks / sanitization steps when executing qmol_from_ctab compared
with qmol_from_smarts? As far as I can see, qmol_from_ctab has no flag for
this.



I describe the general problem below and also attach the ipynb (in case it
is useful), which uses psycopg2 to run the SQL. I have left out the database
connection credentials.



Many thanks,



Susan

__________________________________



For example, I want to match an aromatic boronic acid:

sm1 = 'OB(O)c1ccccc1'



But the following CTAB isn’t valid. MolFromMolBlock reports a "non-ring atom
marked aromatic" error, so I suspect that is the cause. Changing the bond
marked aromatic (‘4’) to a double bond (‘2’) makes the CTAB valid.

ctab_og = """Boronate acid/ester(aryl)
  SciTegic12012112112D

  5  4  0  0  0  0            999 V2000
    1.7243   -2.7324    0.0000 A   0  0
    2.7559   -2.1456    0.0000 C   0  0
    3.7808   -2.7324    0.0000 B   0  0
    4.8057   -2.1456    0.0000 O   0  0
    3.7808   -3.9190    0.0000 O   0  0
  1  2  4  0  0  1  0
  2  3  1  0
  3  4  1  0
  3  5  1  0
M  END
> <Name>
Boronate acid/ester(aryl)

"""

ctab_fixed = """Boronate acid/ester(aryl)
  SciTegic12012112112D

  5  4  0  0  0  0            999 V2000
    1.7243   -2.7324    0.0000 A   0  0
    2.7559   -2.1456    0.0000 C   0  0
    3.7808   -2.7324    0.0000 B   0  0
    4.8057   -2.1456    0.0000 O   0  0
    3.7808   -3.9190    0.0000 O   0  0
  1  2  2  0  0  1  0
  2  3  1  0
  3  4  1  0
  3  5  1  0
M  END
> <Name>
Boronate acid/ester(aryl)

"""
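
The two CTABs differ only in the bond-type field of the first bond line. As a quick check that needs no RDKit install, here is a minimal sketch that reads the bond block of a V2000 CTAB with plain Python and lists the bond types. The helper name v2000_bonds and the cut-down example molblock are made up for illustration; the fixed-width column positions and the bond-type codes follow the V2000 CTfile layout.

```python
# V2000 bond-type codes (4 = aromatic; 5-8 are query-only types).
BOND_TYPES = {1: "single", 2: "double", 3: "triple", 4: "aromatic",
              5: "single or double", 6: "single or aromatic",
              7: "double or aromatic", 8: "any"}


def v2000_bonds(molblock):
    """Return (begin_atom, end_atom, bond_type) for each bond in a V2000 CTAB."""
    # Dropping blank lines lets the sketch cope with double-spaced text too.
    lines = [ln for ln in molblock.splitlines() if ln.strip()]
    # Locate the counts line by its trailing V2000 tag.
    idx = next(i for i, ln in enumerate(lines) if ln.rstrip().endswith("V2000"))
    n_atoms = int(lines[idx][0:3])
    n_bonds = int(lines[idx][3:6])
    first_bond = idx + 1 + n_atoms
    # Bond lines use three-character fields: atom 1, atom 2, bond type.
    return [(int(ln[0:3]), int(ln[3:6]), int(ln[6:9]))
            for ln in lines[first_bond:first_bond + n_bonds]]


# Hypothetical example: same atoms and bonds as ctab_og, header shortened.
example = "\n".join([
    "example",
    "  sketch",
    "",
    "  5  4  0  0  0  0            999 V2000",
    "    1.7243   -2.7324    0.0000 A   0  0",
    "    2.7559   -2.1456    0.0000 C   0  0",
    "    3.7808   -2.7324    0.0000 B   0  0",
    "    4.8057   -2.1456    0.0000 O   0  0",
    "    3.7808   -3.9190    0.0000 O   0  0",
    "  1  2  4  0  0  1  0",
    "  2  3  1  0",
    "  3  4  1  0",
    "  3  5  1  0",
    "M  END",
])

bonds = v2000_bonds(example)
aromatic = [(a, b) for a, b, t in bonds if t == 4]
for a, b, t in bonds:
    print(a, b, BOND_TYPES.get(t, "unknown"))
```

Here only the bond between atoms 1 and 2 carries type 4, which is the bond the parser error points at.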

select is_valid_ctab('{ctab_og}')



Returns False

select is_valid_ctab('{ctab_fixed}')

Returns True



However, I can make a qmol using SMARTS that matches sm1. Is there a way of
making the query CTAB valid so that we don’t have to use SMARTS?

select mol_from_smiles('{sm1}') @> qmol_from_ctab('{ctab_fixed}')



Returns False



select mol_from_smiles('{sm1}') @> qmol_from_smarts('{alt_smarts}')



Returns True
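
The queries above paste the CTAB text straight into the SQL string, which breaks if the text ever contains a single quote. A possible alternative, sketched below, is to pass the CTAB and SMILES as bound parameters through the cursor. This assumes psycopg2's usual %s placeholder style and that the substituted literal is accepted by the cartridge functions the same way the hand-written one is. The helper names and the RecordingCursor class are hypothetical; the stand-in cursor exists only so the sketch runs without a database.

```python
def is_valid_ctab(cur, ctab):
    """Ask the cartridge whether it accepts the CTAB (sent as a bound parameter)."""
    cur.execute("select is_valid_ctab(%s)", (ctab,))
    return cur.fetchone()[0]


def smiles_matches_ctab(cur, smiles, ctab):
    """Substructure test: does the molecule from `smiles` contain the CTAB query?"""
    cur.execute("select mol_from_smiles(%s) @> qmol_from_ctab(%s)", (smiles, ctab))
    return cur.fetchone()[0]


class RecordingCursor:
    """Hypothetical stand-in for a database cursor; it records calls and
    returns a canned answer instead of talking to PostgreSQL."""

    def __init__(self, answer):
        self.answer = answer
        self.calls = []

    def execute(self, sql, params=None):
        self.calls.append((sql, params))

    def fetchone(self):
        return (self.answer,)


# Dry run: the SQL text and the values travel separately, so a quote
# character inside the CTAB never ends up spliced into the statement.
cur = RecordingCursor(answer=False)
result = smiles_matches_ctab(cur, "OB(O)c1ccccc1", "ctab text with a ' quote")
```

With a real connection, the same helpers would be called with the cursor obtained from the psycopg2 connection in the notebook.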

Attachment: 211209_problematic_ctab_example1_for_rdkit_mailing_list.ipynb
Description: application/ipynb

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
