Hi Webster,
That's a really good question.
At the moment there isn't any way to do SMARTS normalization. The
assumption throughout the code is that if you've gone to the trouble to
create a SMARTS then you captured the aromaticity that you intend to search
for there. I think your use case makes sense though, so this would be an
interesting thing for us to take a look at for a future release.
What you might be able to do in the meantime, and what I usually suggest
when coming from a chemical sketcher, is to get an MDL molfile from the
sketcher and then use that to do your queries. You can use mol_from_ctab()
in the cartridge along with mol_adjust_query_properties:
chembl_25=# select * from rdk.mols where
m@>mol_adjust_query_properties(mol_from_ctab('
  Mrv1810 11021905152D

  9  9  0  0  0  0            999 V2000
   -2.2782   -0.0547    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9927   -0.4672    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9927   -1.2922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2782   -1.7047    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5637   -1.2922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5637   -0.4672    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2782    0.7703    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8493   -0.0547    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8493   -1.7047    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  2  0  0  0  0
  3  4  1  0  0  0  0
  4  5  2  0  0  0  0
  5  6  1  0  0  0  0
  1  6  2  0  0  0  0
  1  7  1  0  0  0  0
  6  8  1  0  0  0  0
  5  9  1  0  0  0  0
M  END
')) limit 5;
The chemical sketchers that I have tried tend to do a better job of
generating queries as molfiles, and the RDKit takes care of converting
from Kekule to aromatic form for you when it reads them.
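For a web application, the molfile coming back from the sketcher would
normally be passed as a bound parameter rather than pasted into the SQL
text. Here is a minimal sketch of that in Python. It assumes a driver that
uses psycopg2-style %s placeholders; the helper name
build_substructure_query is made up for illustration, and the ::cstring
casts are an assumption about how to make the parameter types explicit for
the cartridge functions. The table rdk.mols and column m are taken from the
query above.

```python
import json


def build_substructure_query(molblock, adjust_params=None, limit=5):
    """Return (sql, params) for a substructure search from a sketcher molfile.

    Hypothetical helper, not part of the RDKit or the cartridge.
    The %s placeholders assume a psycopg2-style driver; the ::cstring
    casts are an assumption about the cartridge function signatures.
    """
    if adjust_params is None:
        # Default query adjustment, as in the query above.
        query_mol = "mol_adjust_query_properties(mol_from_ctab(%s::cstring))"
        params = [molblock]
    else:
        # Pass adjustment options as JSON, e.g. {"adjustDegree": False}.
        query_mol = (
            "mol_adjust_query_properties("
            "mol_from_ctab(%s::cstring), %s::cstring)"
        )
        params = [molblock, json.dumps(adjust_params)]

    # No string formatting is applied to the molfile itself; it only ever
    # travels in the params tuple.
    sql = f"select * from rdk.mols where m@>{query_mol} limit %s"
    params.append(int(limit))
    return sql, tuple(params)
```

With a psycopg2 cursor this would be run as cur.execute(sql, params), where
molblock is the molfile text exactly as the sketcher produced it, including
the three header lines.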
Does that help?
-greg
On Thu, Oct 31, 2019 at 5:42 PM Webster Homer <
[email protected]> wrote:
> I am working on evaluating the RD Kit Postgresql data cartridge for use as
> the back end of a Web Application. The app will use a JavaScript sketcher
> to allow the user to input a SMILES or SMARTS that will be sent to the RD
> Kit cartridge. In evaluating RD Kit I found that it doesn’t support
> aromatic normalization on SMARTS. As a test case I used Marvin JS to
> generate a SMARTS: C(=CN=C1)C(=C1N2)N=C2
>
>
>
> Used it as a query:
>
> select structure_id from rdk.mols where
> m@>mol_adjust_query_properties(mol_from_smarts('C(=CN=C1)C(=C1N2)N=C2'));
>
> structure_id
>
> --------------
>
> (0 rows)
>
> Not surprisingly it had no hits. Looked at the mol_adjust_query_properties
> function:
>
> select
> mol_adjust_query_properties(mol_from_smarts('C(=CN=C1)C(=C1N2)N=C2'));
>
> mol_adjust_query_properties
>
> -----------------------------
>
> c1cc2ncnc2cn1
>
>
>
> That looked good.
>
> select structure_id from rdk.mols where
> m@>mol_adjust_query_properties(mol_from_smarts('c1cc2ncnc2cn1'));
>
> structure_id
>
> --------------
>
> 30183725
>
> (1 row)
>
> But wait there should be more hits!
>
> select count(*) from rdk.mols where m@>'c1cc2ncnc2cn1'::qmol;
>
> count
>
> -------
>
> 27
>
> Then I tried this:
>
> select structure_id from rdk.mols where
> m@>mol_adjust_query_properties(mol_from_smarts('c1cc2ncnc2cn1'),'{"adjustDegree":false}');
>
> (27 rows)
>
> OK, but what I really need to have work is this:
>
> select structure_id from rdk.mols where
> m@>mol_adjust_query_properties(mol_from_smarts('C(=CN=C1)C(=C1N2)N=C2'),'{"adjustDegree":false}');
>
> structure_id
>
> --------------
>
> (0 rows)
>
> Which it does not. Is mol_adjust_query_properties misnamed? It doesn’t
> really seem to want a query. Am I missing an option? Unless I can make this
> work I don’t see how I can use RD Kit in my application.
>
>
>
> What am I missing? Or does RD Kit just not allow for normalizing SMARTS?
>
>
>
> Thanks
>
> Webster Homer
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>