It would be interesting to take a set of compounds and look at the correlation matrix of their fingerprint bits. Maybe one can identify a set of "generally" discriminating bits that can be used for screening? Probably not, but it could be worth a try... then memory use would go down and discriminating power would go up?
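A minimal sketch of that experiment, in plain Python and with no RDKit calls: it assumes each fingerprint is already available as a list of 0/1 values, computes the pairwise (phi) correlation between bit positions, and greedily keeps bits that are not strongly correlated with a bit already kept. The function names and the 0.8 cutoff are illustrative assumptions, not something proposed in the thread.

```python
from itertools import combinations
from math import sqrt


def bit_correlations(fps):
    """Return {(i, j): phi} for each pair of bit positions i < j that both vary.

    `fps` is a list of equal-length 0/1 lists, one per compound (assumed input format).
    """
    n = len(fps)
    nbits = len(fps[0])
    # Fraction of compounds that set each bit.
    freq = [sum(fp[i] for fp in fps) / n for i in range(nbits)]
    corr = {}
    for i, j in combinations(range(nbits), 2):
        var_i = freq[i] * (1 - freq[i])
        var_j = freq[j] * (1 - freq[j])
        if var_i == 0 or var_j == 0:
            continue  # a constant bit has no defined correlation and cannot discriminate
        both = sum(fp[i] & fp[j] for fp in fps) / n
        corr[(i, j)] = (both - freq[i] * freq[j]) / sqrt(var_i * var_j)
    return corr


def pick_bits(fps, max_abs_corr=0.8):
    """Greedily keep varying bits whose |phi| with every kept bit is <= max_abs_corr.

    The 0.8 default is an arbitrary illustrative threshold.
    """
    n = len(fps)
    corr = bit_correlations(fps)
    varying = [i for i in range(len(fps[0])) if 0 < sum(fp[i] for fp in fps) < n]
    kept = []
    for i in varying:
        # `kept` only holds smaller indices, so (k, i) is always a valid key.
        if all(abs(corr[(k, i)]) <= max_abs_corr for k in kept):
            kept.append(i)
    return kept


# Toy data: bits 0 and 1 always agree, bit 3 is always set.
toy_fps = [
    [1, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
selected = pick_bits(toy_fps)
```

On real data the pairwise loop is quadratic in the number of bits, so for 2048-bit fingerprints a vectorized correlation would probably be preferable; the greedy selection is only one of several possible ways to thin out redundant bits.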
Nik

Andrew Dalke <da...@dalkescientific.com>
12.02.2009 14:56
To: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
cc:
Subject: Re: [Rdkit-discuss] Optimizing SSS in the RDKit

On Feb 12, 2009, at 8:46 AM, Greg Landrum wrote:
> I'm either not understanding completely or I disagree. The queries
> were constructed by fragmenting the molecules I searched through, so
> I'd expect lots of substructure hits (and a lower screen-out rate than
> arbitrary queries against arbitrary molecules).

Ahh, of course. But I don't think fingerprint screens give, say, 0.001% false positive rates. I think they are more in line with what you found. But if the bit distributions were really uncorrelated for molecules where one is not a substructure of the other, then I would expect extremely low false positive rates. 2048 bits should give a lot of discrimination power if the bits weren't correlated.

> That's a good idea to add to the list of things to look into. It's
> also relatively easy to do because it probably just involves
> increasing the minimum path length included in fingerprints (at least
> as a first step).

Again, I don't have experience with that, but it means that there's less ability to handle unlikely atom types. Yes, the larger subgraphs will include them. Don't know.

> Looking at MACCS is a good idea. I'll also put that on the list.

Is this list on a wiki? ;)

Andrew
da...@dalkescientific.com
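To put a rough number on the "uncorrelated bits" argument above, here is a small sketch under a deliberately simple model that is an assumption, not something measured in this thread: a target that does not contain the query has each bit set independently with probability `bit_density`, and it passes the screen only if every bit set in the query is also set in the target. The function names and the example figures (30% density, 20 query bits) are hypothetical.

```python
def passes_screen(query_fp, target_fp):
    """Fingerprint screen on integer bitsets: all query bits must be set in the target."""
    return (query_fp & target_fp) == query_fp


def independent_pass_rate(bit_density, query_bits_set):
    """Chance that a random target passes the screen if its bits were independent.

    Each of the `query_bits_set` query bits must be on in the target, and under
    the independence assumption each is on with probability `bit_density`.
    """
    return bit_density ** query_bits_set


# Hypothetical figures: 30% of target bits set, query sets 20 bits.
rate = independent_pass_rate(0.3, 20)
```

Under this model the pass rate for unrelated targets falls off geometrically with the number of query bits, so screen-out rates far short of that are consistent with the bits being correlated in practice.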