Hmm, when I was in the drug discovery canal, the "descriptors" that you
could calculate from a SMILES string were legion.

Here's the list for RDKit:
https://www.rdkit.org/docs/GettingStartedInPython.html#list-of-available-descriptors
There is one bunch that depends entirely on the formula and molecular
structure.  Then there's a whole other bunch you can compute if you
generate 3D structures for the molecules, possibly multiplied by the number
of low-energy structures the molecule can adopt.
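
To make the two bunches concrete, here is a rough sketch, assuming a
reasonably recent RDKit install.  The function names are mine, the
handful of descriptors is an arbitrary pick from that list, and the
conformers are just embedded ones, not minimized low-energy structures.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, Descriptors, Descriptors3D


def basic_descriptors(smiles):
    """Descriptors that need only the formula and connectivity.

    Returns None when RDKit cannot turn the SMILES into a molecule.
    """
    mol = Chem.MolFromSmiles(smiles)  # None on parse or sanitization failure
    if mol is None:
        return None
    return {
        "MolWt": Descriptors.MolWt(mol),
        "MolLogP": Descriptors.MolLogP(mol),
        "TPSA": Descriptors.TPSA(mol),
        "NumHDonors": Descriptors.NumHDonors(mol),
        "NumHAcceptors": Descriptors.NumHAcceptors(mol),
    }


def radius_of_gyration_per_conformer(smiles, num_confs=5, seed=42):
    """One 3D descriptor, computed once per embedded conformer."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return []
    mol = Chem.AddHs(mol)  # explicit hydrogens give more sensible 3D geometry
    conf_ids = AllChem.EmbedMultipleConfs(mol, numConfs=num_confs, randomSeed=seed)
    return [Descriptors3D.RadiusOfGyration(mol, confId=cid) for cid in conf_ids]


if __name__ == "__main__":
    print(basic_descriptors("CCO"))                  # ethanol
    print(radius_of_gyration_per_conformer("CCCCO"))  # 1-butanol
```

The second function is where the multiplication comes in: every 3D
descriptor gets one value per conformer, so the table grows with the
number of structures you keep.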

What kind of plausibility were you looking for?  Does the SMILES string
specify a real molecule?  That's hard.  There are syntax errors in SMILES:
failures to close rings, valency errors, charge errors.  But there are lots
of syntactically valid SMILES that won't match any known molecule, either
because they're impossible or because they're as yet undetermined.  The
pharmas all have their own lists of molecules of interest, but those are
proprietary.  Looks like there are various online databases, none that I'm
familiar with.  If the SMILES parses, you can try generating a 2D depiction
and a 3D structure.  Those will throw exceptions if things get too weird.
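
A sketch of that sequence of checks, assuming RDKit; the function name
and the status labels are made up for illustration.  One caveat as far
as I know: the 3D embedding call reports failure by returning -1 rather
than raising, so the code checks both.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def check_smiles(smiles, seed=42):
    """Classify a SMILES string by the first stage at which it fails."""
    # Stage 1: syntax only (unclosed rings, unbalanced parentheses, ...).
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return "syntax_error"

    # Stage 2: chemistry checks (valence, kekulization, ...).
    try:
        Chem.SanitizeMol(mol)
    except Exception:  # RDKit raises sanitization exceptions here
        return "chemistry_error"

    # Stage 3: 2D depiction coordinates.
    try:
        AllChem.Compute2DCoords(mol)
    except Exception:
        return "depiction_failed"

    # Stage 4: 3D embedding; a negative conformer id means no structure found.
    try:
        mol_h = Chem.AddHs(mol)
        conf_id = AllChem.EmbedMolecule(mol_h, randomSeed=seed)
    except Exception:
        return "embed_failed"
    if conf_id < 0:
        return "embed_failed"

    return "ok"


if __name__ == "__main__":
    for smi in ["CCO", "C1CC", "C(C)(C)(C)(C)C"]:
        print(smi, check_smiles(smi))
```

This still only gives a ladder of pass/fail stages, not a scalar, and
passing all four says nothing about whether anyone has ever made the
molecule.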

-- rec --

On Tue, Oct 12, 2021 at 3:22 PM Marcus Daniels <mar...@snoutfarm.com> wrote:

> I was playing with RDKit the other day, and it wasn’t obvious how to get a
> scalar quantity of plausibility of a molecule.   It seems a SMILES string
> is right or wrong, and then maybe there are some warnings that can be
> trapped.   However, the benefits for search or fair sampling are different
> than the needs of correctness checks, which is a deeper property.   That
> isn’t quite a fit to the music example where aesthetic considerations are
> subjective.
>
>
>
> *From:* Friam <friam-boun...@redfish.com> *On Behalf Of *Jon Zingale
> *Sent:* Tuesday, October 12, 2021 12:11 PM
> *To:* friam@redfish.com
> *Subject:* Re: [FRIAM] Schwill Rock?
>
>
>
> "I mean from the perspective of aesthetics. Understanding why Pandora is
> messing it up means sampling the deep wells."
>
>
>
> Yes, but not more than one has to. This is why I am advocating for methods
> like a weighted ensemble. The working analogy for me comes from drug
> discovery. It doesn't make a lot of sense to probe the same old sites and
> the same old conformations.
>
> .-- .- -. - / .- -.-. - .. --- -. ..--.. / -.-. --- -. .--- ..- --. .- - .
> FRIAM Applied Complexity Group listserv
> Zoom Fridays 9:30a-12p Mtn UTC-6  bit.ly/virtualfriam
> un/subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
> FRIAM-COMIC http://friam-comic.blogspot.com/
> archives:
>  5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
>  1/2003 thru 6/2021  http://friam.383.s1.nabble.com/
>
