Hi Katrina,

I'm slightly unsure what "deprotection" you are trying to represent, but I 
think there are a couple of problems with the reaction SMARTS:

      reaction_smarts = "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"

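This is looking for an aromatic carbon that has one hydrogen AND is bonded to a 
non-ring boron. An aromatic carbon normally has only one position outside the 
ring, so if that position holds a hydrogen it cannot also hold the boron. This 
pattern will never be found!
Also, you have a mapped atom on the reactant side, but no mapped atoms on the 
product side.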
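
A quick way to check this is to run the reactant side as a plain substructure query against your example SMILES. This is only a minimal sketch, assuming a standard RDKit install; the `relaxed` pattern (the same query with the H1 constraint dropped) is my own illustration, not something from your script:

```python
from rdkit import Chem

# Example molecule taken from the original question.
mol = Chem.MolFromSmiles("Cc1cc(B(O)O)ccc1OC(C)C")

# Reactant side of the original reaction SMARTS.
original = Chem.MolFromSmarts("[c;H1]([B;R0](O)[O;R0:1])")

# Illustrative variant without the H1 constraint on the aromatic carbon.
relaxed = Chem.MolFromSmarts("c[B;R0](O)O")

has_original = mol.HasSubstructMatch(original)
has_relaxed = mol.HasSubstructMatch(relaxed)
print(has_original, has_relaxed)  # False True
```

The carbon that carries the boron has no hydrogens, so the H1 constraint is what stops the match.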

If your reaction is aiming to hydrolyse non-cyclic boronic esters (and return 
the alcohols), then you should map the oxygen atom on the product side as well 
- something like:

      reaction_smarts = "c[B;R0](O)[O:1]>>[*:1]"
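
As a rough sketch of what that reaction does, here it is run through `ReactionFromSmarts` on dimethyl phenylboronate. The test molecule is my own choice for illustration (it is not from your data set), and I am assuming the products only need a plain `SanitizeMol` before writing SMILES:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Hydrolysis-style reaction: keep only the mapped oxygen and whatever hangs off it.
ester_rxn = AllChem.ReactionFromSmarts("c[B;R0](O)[O:1]>>[*:1]")

# Hypothetical test molecule: dimethyl phenylboronate (a non-cyclic ester).
ester = Chem.MolFromSmiles("c1ccccc1B(OC)OC")

alcohols = set()
for products in ester_rxn.RunReactants((ester,)):
    for product in products:
        # Reaction products come back unsanitized; sanitizing fills in the new O-H.
        Chem.SanitizeMol(product)
        alcohols.add(Chem.MolToSmiles(product))

print(alcohols)
```

Here both oxygens carry a methyl group, so every match gives methanol and the set collapses to a single entry.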

If, instead, you are interested in the virtual reaction that removes boronates 
from aryl R-groups (perhaps to calculate R-group fingerprints, etc) - then you 
should map the aryl carbon on both sides instead:

      reaction_smarts = "[c:1][B;R0](O)O>>[*:1]"

In either case you probably want to deduplicate products (the boronic acids and 
esters will match the pattern twice, once for each ordering of the two oxygens).
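
For the aryl version, a minimal sketch of running the reaction and deduplicating by canonical SMILES could look like the following. It assumes a standard RDKit install and uses your example SMILES; collecting canonical SMILES in a set is just one simple way to deduplicate, not the only one:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Virtual deborylation: keep the mapped aryl carbon, drop the B(O)O group.
aryl_rxn = AllChem.ReactionFromSmarts("[c:1][B;R0](O)O>>[*:1]")

# Example molecule from the original question.
mol = Chem.MolFromSmiles("Cc1cc(B(O)O)ccc1OC(C)C")

product_sets = aryl_rxn.RunReactants((mol,))

unique_products = set()
for products in product_sets:
    for product in products:
        # Products are returned unsanitized; this also restores the aryl C-H.
        Chem.SanitizeMol(product)
        unique_products.add(Chem.MolToSmiles(product))

# Canonical SMILES of the deborylated arene, for comparison.
expected = Chem.MolToSmiles(Chem.MolFromSmiles("Cc1ccccc1OC(C)C"))
print(len(product_sets), unique_products)
```

The symmetric B(O)O match is what produces the repeated product sets, and the set reduces them to one structure.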

Kind regards

James
________________________________
From: Katrina Lexa <kl...@umich.edu>
Sent: 21 August 2023 06:03
To: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: [Rdkit-discuss] rdDeprotect & DeprotectData

Hi All,

I don't know why I'm struggling so much with this, as it seems like it should 
be pretty straightforward. I'm trying to add some additional deprotection 
SMIRKS to a data-cleaning Python script and I'm not having success with the new 
reactions actually transforming my reactants to deprotected SMILES. I have 
about 10 I'd like to add, so I know I could do it with simple reactions, but 
I'd rather figure out where I'm going wrong here.

My definition of deprotect data:
#deborylation
deprotection_class = "boron"
reaction_smarts =  "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"
abbreviation = "BOO"
full_name = "deboron"
bdata = rdDeprotect.DeprotectData(deprotection_class, reaction_smarts,
                                  abbreviation, full_name)
assert bdata.isValid()

I tried adding this line:
newDeprotect = rdDeprotect.DeprotectDataVect().append(bdata)

but it seems to make no difference:
try:
    #result = rdDeprotect.Deprotect(dep_m,deprotections=[bdata])
    result = rdDeprotect.Deprotect(dep_m,newDeprotect)


As an example, this is one of the SMILES strings in the file I'm reading in 
that I would expect to deprotect:
Cc1cc(B(O)O)ccc1OC(C)C

Maybe I'm just awful at writing SMIRKS?


Thanks for the help here,

Katrina
