One approach could be to assign scoring functions for atom and bond matches, similar to what OpenEye (OE) supports in OEChem: <https://docs.eyesopen.com/toolkits/python/oechemtk/patternmatch.html#mcs-scoring-functions>
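As a rough illustration of the scoring idea, here is a minimal sketch in plain Python. It assumes you already have the atom and bond counts of the common substructure, for example from the `numAtoms` and `numBonds` fields of the result returned by RDKit's `rdFMCS.FindMCS`. The names `MatchCounts` and `weighted_match_score` are hypothetical, not part of RDKit or OEChem, and the weighting scheme is only one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCounts:
    """Atom and bond counts for the common substructure and for the full query.

    Hypothetical container: the numbers are assumed to come from an MCS run.
    """
    matched_atoms: int
    matched_bonds: int
    query_atoms: int
    query_bonds: int


def weighted_match_score(counts, atom_weight=1.0, bond_weight=1.0):
    """Return the weighted fraction of the query that was matched.

    The result lies in [0, 1]; 1.0 means the whole query was found.
    """
    total = atom_weight * counts.query_atoms + bond_weight * counts.query_bonds
    if total <= 0:
        raise ValueError("query must have a positive weighted size")
    matched = (atom_weight * counts.matched_atoms
               + bond_weight * counts.matched_bonds)
    return matched / total


# Naphthalene query (10 atoms, 11 bonds) against benzene,
# where the common substructure is one ring (6 atoms, 6 bonds).
naphthalene_vs_benzene = MatchCounts(
    matched_atoms=6, matched_bonds=6, query_atoms=10, query_bonds=11
)

# Counting atoms only reproduces the "60% match" from the original question.
atom_only = weighted_match_score(
    naphthalene_vs_benzene, atom_weight=1.0, bond_weight=0.0
)
print(f"atom-only coverage: {atom_only:.0%}")
```

Setting `bond_weight` above zero penalizes matches that cover the atoms but miss ring-closing bonds, which may or may not be what you want for a scaffold search.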

On Fri, Nov 20, 2020 at 9:58 AM Gustavo Seabra <gustavo.sea...@gmail.com> wrote:

> Hi Adelene,
>
> Doesn't the substructure match only works for the whole substructure, as
> an all-or-nothing?
>
> I suppose I could use the MCSS and count the number of matching atoms,
> then calculate the percentage match myself.
>
> Is it possible to get a partial match with substructure search?
>
> Gustavo.
>
> --
> Gustavo Seabra
>
> ------------------------------
> *From:* Adelene LAI <adelene....@uni.lu>
> *Sent:* Friday, November 20, 2020 9:13:15 AM
> *To:* Dan Nealschneider <dan.nealschnei...@schrodinger.com>; Gustavo Seabra <gustavo.sea...@gmail.com>
> *Cc:* RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
> *Subject:* Re: [Rdkit-discuss] Partial substructure match?
>
> Hi Dan and Gustavo,
>
> MCSS sounds good, but depends on the goal.
>
> From the way Gustavo wrote, it sounds like a Query-Target substructure
> search - he has a list of targets and one specific query, and he wants to
> compare matching rate amongst the members of the list.
>
> If so, I would try query SMARTS.
>
> https://www.rdkit.org/docs/GettingStartedInPython.html#substructure-searching
>
> Regarding the % substructure match, interesting question. How would you
> quantify that? Not sure such a thing exists in RDKit right now.
>
> Adelene
>
> Doctoral Researcher
> Environmental Cheminformatics
> UNIVERSITÉ DU LUXEMBOURG
>
> Campus Belval | Luxembourg Centre for Systems Biomedicine
> 6, avenue du Swing, L-4367 Belvaux
> T +356 46 66 44 67 18
> GitHub: adelenelai
>
> ------------------------------
> *From:* Dan Nealschneider <dan.nealschnei...@schrodinger.com>
> *Sent:* Thursday, November 19, 2020 6:01:37 PM
> *To:* Gustavo Seabra
> *Cc:* RDKit Discuss
> *Subject:* Re: [Rdkit-discuss] Partial substructure match?
>
> Gustavo -
> That sounds like the "maximum common substructure" problem. Here's the
> relevant section in RDKit's "Getting started in Python"
>
> https://www.rdkit.org/docs/GettingStartedInPython.html#maximum-common-substructure
>
> *dan nealschneider* | lead developer
> <https://www.schrodinger.com/>
>
> On Thu, Nov 19, 2020 at 8:50 AM Gustavo Seabra <gustavo.sea...@gmail.com>
> wrote:
>
> Hi all,
>
> Is it possible to search for *partial* substructure matches using RDKit?
>
> I'm aware of "HasSubstructMatch/ GetSubstructMatch", but my impression is
> that it only returns full matches (100%) of the required pattern in a
> structure.
>
> However, what I'd like to do is a bit different: Imagine I have one
> specific substructure (scaffold), and I'd like to search for molecules
> that have the full substructure *or part of it*, and maybe get the
> percentage of the substructure match? (100% = the full substructure is
> contained in the molecule). For example, if the pattern is a naphthalene
> and the molecule to search has a benzene, that would count as a 60% match.
>
> Is there a way to do that in RDKit?
>
> Thanks a lot!
> --
> Gustavo Seabra
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

--
Rajarshi Guha | http://blog.rguha.net | @rguha <https://twitter.com/rguha>
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss