Thank you for your input and clarification!

Thanks,
Steven Pak

On Sun, Nov 15, 2020 at 2:18 PM Peter Gedeck <peter.ged...@gmail.com> wrote:

> The paper is pretty vague on implementation details. However, note that
> the code is copyright Novartis Institutes for BioMedical Research Inc. It
> was released in the public domain and at that point (2013) it was the
> implementation that was used internally at Novartis. You can therefore use
> the Python implementation in RDKit as the reference for this method. I
> would not spend any more time on finding the discrepancy.
>
> Best,
>
> Peter
>
>
> On Nov 15, 2020, at 11:01 AM, Gustavo Seabra <gustavo.sea...@gmail.com>
> wrote:
>
> So, basically, your code perfectly reproduces RDKit's Python
> implementation. However, those results (both yours and RDKit's) *do not*
> match the original paper.
>
> It does look like a constant shift, but it is not: Some molecules have a
> different shift than others.
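
One way to check whether an offset between two sets of scores is really constant is to look at the spread of the per-molecule differences. The following is a minimal sketch, not part of the original thread: the function name `shift_summary`, the tolerance `tol`, and all score values are hypothetical and chosen only for illustration; they are not the scores from the paper or from RDKit.

```python
from statistics import mean, pstdev


def shift_summary(reference, candidate, tol=0.05):
    """Summarize per-molecule offsets (candidate - reference).

    `tol` is an arbitrary, assumed threshold on the standard deviation
    of the offsets below which the shift is treated as constant.
    """
    if len(reference) != len(candidate):
        raise ValueError("score lists must have the same length")
    if not reference:
        raise ValueError("score lists must not be empty")

    # One offset per molecule, in the same order as the inputs.
    offsets = [c - r for r, c in zip(reference, candidate)]
    spread = pstdev(offsets)  # population standard deviation

    return {
        "mean_offset": mean(offsets),
        "spread": spread,
        "min_offset": min(offsets),
        "max_offset": max(offsets),
        "is_constant": spread <= tol,
    }


# Hypothetical values, invented for illustration only.
paper_scores = [2.0, 3.0, 4.0, 5.0]
uniform_scores = [2.5, 3.5, 4.5, 5.5]   # every molecule shifted by the same amount
varying_scores = [2.5, 3.2, 4.9, 5.1]   # shift differs from molecule to molecule

print(shift_summary(paper_scores, uniform_scores))
print(shift_summary(paper_scores, varying_scores))
```

A small spread relative to the mean offset would suggest a uniform shift; a large spread, as described above, would suggest that the two implementations differ in a term that depends on the molecule.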
>
> Questions:
>
> 1. Are those the same molecules as in the original paper?
> 2. How well defined are the equations in the original paper?
>
> I'm guessing RDKit's implementation is *not* 100% the same as in the
> original paper, as is stated on the GitHub page (
> https://github.com/rdkit/rdkit/blob/master/Contrib/SA_Score/sascorer.py)
>
> # several small modifications to the original paper are included
> # particularly slightly different formula for marocyclic penalty
> # and taking into account also molecule symmetry (fingerprint density)
>
>
> --
> Gustavo Seabra
> ------------------------------
> *From:* Steven Pak <steven....@stonybrook.edu>
> *Sent:* Saturday, November 14, 2020 12:20:47 PM
> *To:* Greg Landrum <greg.land...@gmail.com>
> *Cc:* rdkit-discuss@lists.sourceforge.net <
> rdkit-discuss@lists.sourceforge.net>
> *Subject:* Re: [Rdkit-discuss] Hello questions about the Synthetic
> Accessibility score
>
> The blue dots compare the RDKit-based Python code with my CPP
> implementation. The orange dots compare my CPP implementation with the
> scores extracted from the original paper (Estimation of synthetic
> accessibility score of drug-like molecules based on molecular complexity
> and fragment contributions). My CPP implementation of the SA_score is based
> on the Python version in RDKit. I am trying to match the values of the
> RDKit version exactly (which appears to be working). That is why I am a bit
> confused about why the orange dots appear to be shifted by a constant
> value, and I am wondering what causes that shift.
>
> As for the open source comment, I will let you know. I also did the same
> thing for the QED scoring functions, and I have a couple of questions about
> that too, which I will send in an email soon. I must talk to my team about
> this before we can move forward.
>
> Thanks!
>
> On Sat, Nov 14, 2020 at 2:29 AM Greg Landrum <greg.land...@gmail.com>
> wrote:
>
> Steven,
>
> Wow cool! Any thoughts about making that implementation open source?
>
> Did you recalculate the Python SA score with the same version of the RDKit
> you used for the CPP version? Did you do your implementation based on the
> Python code (hopefully) or the algorithm description in the paper?
>
> If the answer to both those questions is “yes”, then I’m going to
> guess we’d need to see the code to diagnose the problem.
>
> Best,
> -greg
>
> On Sat, 14 Nov 2020 at 00:06, Steven Pak <steven....@stonybrook.edu>
> wrote:
>
> Hello.
>
> I have been working on a CPP version of SA score. Results are fantastic!
> <image.png>
> As you can see in the image, the blue dots represent the SA_scores from
> Python vs scores from my CPP version. The scores are perfectly in line with
> each other, which is great! However, the orange dots are the values from
> RDKit vs the original paper's. These are the original 40 compounds
> that I found in the original paper. I was just wondering why the orange
> dots seem to have a constant shift throughout the graph. What part of the
> code was changed to have caused this? I am just curious.
>
> Thank you,
> --
> Steven Pak Pharm.D
> Ph.D Student | Rizzo Lab
> Stony Brook University (SUNY)
> Department of Pharmacological Sciences
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

