Hi Katrina, I'm slightly unsure what "deprotection" you are trying to represent, but I think there are a couple of problems with the rsmarts...

    reaction_smarts = "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"

This is looking for an aromatic carbon that has one hydrogen AND is connected to a non-ring boron. This pattern will never be found: an aromatic carbon that carries the boron already uses its three connections (two ring neighbours plus the boron), so it cannot also carry a hydrogen. Also, you have a mapped atom on the reactant side, but no mapped atoms on the product side.

If your reaction is aiming to hydrolyse non-cyclic boronic esters (and return the alcohols), then you should map the oxygen atom on the product side as well, something like:

    reaction_smarts = "c[B;R0](O)[O:1]>>[*:1]"

If, instead, you are interested in the virtual reaction that removes boronates from aryl R-groups (perhaps to calculate R-group fingerprints, etc.), then you should map the aryl carbon on both sides instead:

    reaction_smarts = "[c:1][B;R0](O)O>>[*:1]"

In either case you probably want to deduplicate products (the boronic acids and esters will match the pattern twice, because either oxygen can match either O in the pattern).

Kind regards
James

________________________________
From: Katrina Lexa <kl...@umich.edu>
Sent: 21 August 2023 06:03
To: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: [Rdkit-discuss] rdDeprotect & DeprotectData

Hi All,

I don't know why I'm struggling so much with this, as it seems like it should be pretty straightforward. I'm trying to add some additional deprotection SMIRKS to a data-cleaning Python script, and the new reactions are not actually transforming my reactants to deprotected SMILES. I have about 10 I'd like to add, so I know I could do it with simple reactions, but I'd rather figure out where I'm going wrong here.

My definition of deprotect data:

    #deborylation
    deprotection_class = "boron"
    reaction_smarts = "[c;H1]([B;R0](O)[O;R0:1])>>[c;H1]"
    abbreviation = "BOO"
    full_name = "deboron"
    bdata = rdDeprotect.DeprotectData(deprotection_class, reaction_smarts, abbreviation, full_name)
    assert bdata.isValid()

I tried adding this line:

    newDeprotect = rdDeprotect.DeprotectDataVect().append(bdata)

but it seems to make no difference:

    try:
        #result = rdDeprotect.Deprotect(dep_m,deprotections=[bdata])
        result = rdDeprotect.Deprotect(dep_m,newDeprotect)

As an example, this is one of the SMILES strings in the file I'm reading in that I would expect to deprotect:

    Cc1cc(B(O)O)ccc1OC(C)C

Maybe I'm just awful at writing SMIRKS?

Thanks for the help here,
Katrina
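
For anyone following the thread, here is a minimal sketch of the points made in the reply, checked against the example SMILES from the question. It uses a plain RDKit reaction (AllChem.ReactionFromSmarts) rather than rdDeprotect, so it says nothing about how DeprotectData treats the same SMARTS. The helper name run_unique is hypothetical and only there for illustration.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Example molecule from the original question.
mol = Chem.MolFromSmiles("Cc1cc(B(O)O)ccc1OC(C)C")

# Reactant side of the original reaction SMARTS: it demands one hydrogen on
# the same aromatic carbon that carries the boron.
original = Chem.MolFromSmarts("[c;H1]([B;R0](O)[O;R0:1])")
# The same pattern without the hydrogen-count constraint.
relaxed = Chem.MolFromSmarts("c[B;R0](O)O")

original_matches = mol.HasSubstructMatch(original)
relaxed_matches = mol.HasSubstructMatch(relaxed)


def run_unique(reaction_smarts, reactant):
    """Hypothetical helper: run a one-reactant reaction, sanitize each
    product, and return the set of canonical product SMILES."""
    rxn = AllChem.ReactionFromSmarts(reaction_smarts)
    unique = set()
    for product_set in rxn.RunReactants((reactant,)):
        for product in product_set:
            Chem.SanitizeMol(product)  # products come back unsanitized
            unique.add(Chem.MolToSmiles(product))
    return unique


# Aryl carbon mapped on both sides, as suggested in the reply.
deborylated = run_unique("[c:1][B;R0](O)O>>[*:1]", mol)

print("original pattern matches:", original_matches)
print("relaxed pattern matches:", relaxed_matches)
print("unique products:", deborylated)
```

Collecting canonical SMILES in a set is one simple way to do the deduplication mentioned in the reply; if you need the molecules themselves, a dict keyed on the SMILES would work just as well.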
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss