Hi Greg

Thanks for your quick answer. What I am doing is essentially the following:

from rdkit.Chem import MolStandardize
my_standardizer = MolStandardize.standardize.Standardizer()
standard_tautomer = my_standardizer.tautomer_parent(input_mol)


I assume that when I construct my_standardizer there is some opportunity 
to slip in an alternative configuration.

By the way, I also think that one of the two cases of vanishing 
stereochemistry reported in https://github.com/rdkit/rdkit/issues/2363 
is caused by an overly eager keto/enol tautomerizer.

Best regards

Ansgar


Ansgar Schuffenhauer
Senior Investigator I
T +41 79 608 9063
ansgar.schuffenha...@novartis.com

Novartis Pharma AG
NIBR

From: Greg Landrum <greg.land...@gmail.com>
Sent: Monday, 22 July 2019 17:42
To: Schuffenhauer, Ansgar <ansgar.schuffenha...@novartis.com>
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Rdkit-discuss Digest, Vol 141, Issue 16

Hi Ansgar,

It is possible to specify the tautomer parameter file that is used, but in 
order for me to explain how, I need to know how you are currently using the 
code to enumerate tautomers (i.e. which function you are calling).

As for the format: it's tab-delimited and the first entry is the name. The 
"r/f" flag indicates the direction of the transform and is only there to 
make the name unique.
In the SMARTS the first atom is the one with the mobile H and the last atom is 
where it should be moved to.
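
The format described above can be read with a few lines of plain Python. This is only a minimal sketch and not RDKit's own loader: the names `TautomerTransform` and `parse_transform_lines` are hypothetical, and it assumes that lines starting with `//` are comments and that the Bonds and Charges columns may be missing.

```python
from typing import Iterable, List, NamedTuple


class TautomerTransform(NamedTuple):
    """One row of a tautomerTransforms-style file (hypothetical helper type)."""
    name: str     # includes the f/r flag, which only keeps the name unique
    smarts: str   # first atom carries the mobile H, last atom receives it
    bonds: str    # optional column, empty string if absent
    charges: str  # optional column, empty string if absent


def parse_transform_lines(lines: Iterable[str]) -> List[TautomerTransform]:
    """Parse tab-delimited transform lines, skipping blanks and // comments."""
    transforms = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.lstrip().startswith("//"):
            continue  # assumption: '//' marks a comment or header line
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 2:
            raise ValueError(f"expected at least name and SMARTS: {line!r}")
        bonds = fields[2] if len(fields) > 2 else ""
        charges = fields[3] if len(fields) > 3 else ""
        transforms.append(TautomerTransform(fields[0], fields[1], bonds, charges))
    return transforms


# Two of the default rules quoted in this thread, written with real tabs.
sample = [
    "//\tName\tSMARTS\tBonds\tCharges",
    "1,3 (thio)keto/enol f\t[CX4!H0]-[C]=[O,S,Se,Te;X1]",
    "1,3 (thio)keto/enol r\t[O,S,Se,Te;X2!H0]-[C]=[C]",
]
parsed = parse_transform_lines(sample)
for transform in parsed:
    print(transform.name, "->", transform.smarts)
```

A parser like this is mainly useful for inspecting or filtering the default rules before writing a modified copy of the file.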

-greg



On Mon, Jul 22, 2019 at 3:08 PM Schuffenhauer, Ansgar 
<ansgar.schuffenha...@novartis.com> wrote:
Dear all

For the standardizer module (Chem.MolStandardize), what is the best way to 
change some of the tautomerizer rules?
There is a data file in share/RDKit/Data/MolStandardize/tautomerTransforms.in 
which I assume defines the defaults:

//      Name    SMARTS  Bonds   Charges
1,3 (thio)keto/enol f   [CX4!H0]-[C]=[O,S,Se,Te;X1]
1,3 (thio)keto/enol r   [O,S,Se,Te;X2!H0]-[C]=[C]
1,5 (thio)keto/enol f   [CX4,NX3;!H0]-[C]=[C][CH0]=[O,S,Se,Te;X1]
1,5 (thio)keto/enol r   [O,S,Se,Te;X2!H0]-[CH0]=[C]-[C]=[C,N]
...

Now my questions are
1. What is the syntax of this file? What do the "f" and the "r" stand for? Do 
the SMARTS have to start with the atom carrying the mobile H?
2. How can I instruct RDKit not to use this default file, but one supplied 
by the user?

The background for this question is that the SMARTS for keto/enol seem to be a 
bit too generic, as they also catch the alpha C-atoms of carboxylic acids and 
amides. Generating tautomers here leads to an epimerization of stereocenters 
in the alpha positions of carboxylic acids and amides. That appears odd to me, 
as such stereocenters are quite stable (in contrast to those of "real" ketones 
and aldehydes).
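
The point can be checked directly with a substructure search. The following is a small sketch that assumes RDKit is installed; the example molecules are my own choice for illustration and do not come from the issue linked above. It uses only the forward 1,3 keto/enol SMARTS quoted from the default file.

```python
from rdkit import Chem

# Forward 1,3 (thio)keto/enol pattern as quoted from tautomerTransforms.in
keto_enol_f = Chem.MolFromSmarts("[CX4!H0]-[C]=[O,S,Se,Te;X1]")

# Illustrative molecules (assumed examples, not taken from the thread)
examples = {
    "butanone (ketone)": "CCC(C)=O",
    "L-alanine (carboxylic acid)": "C[C@H](N)C(=O)O",
    "L-alaninamide (amide)": "C[C@H](N)C(N)=O",
}

matches = {}
for label, smiles in examples.items():
    mol = Chem.MolFromSmiles(smiles)
    # True if the pattern finds an H-bearing sp3 carbon next to a C=O
    matches[label] = mol.HasSubstructMatch(keto_enol_f)
    print(label, matches[label])
```

The pattern matches all three molecules, because it only requires an H-bearing sp3 carbon next to a carbonyl and does not look at the other substituent on the carbonyl carbon.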


Best regards

Ansgar




_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
