On Tue, Mar 8, 2011 at 12:15 PM, Stiefl, Nikolaus <nikolaus.sti...@novartis.com> wrote:
> For similarity searches (i.e., I have a molecule, give me similar molecules
> in my database I can look through or test) I would suggest using the Morgan
> FPs. This is more or less what those FPs are designed for.

Nik has it completely right: the Morgan FPs were designed for similarity searching, and I would choose them over the RDKit fingerprint for that application. An added bonus is that the Morgan FPs are *much* faster to calculate than the RDKit fingerprint.

The next question is which radius to use. The general answer is either 2 or 3 (corresponding to ECFP4 and ECFP6, respectively). A radius of 3 gives FPs that are pretty specific; a radius of 2 is a bit fuzzier (but compounds with high similarity will still typically look alike).

Best,
-greg

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
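
For anyone who wants to try the radius 2 versus radius 3 comparison, here is a minimal sketch rather than a definitive recipe. It assumes a reasonably recent RDKit that provides `rdFingerprintGenerator`, and it uses Tanimoto similarity, which is my assumption since the thread does not name a metric. The SMILES strings, the 2048-bit size, and the helper name `rank_by_similarity` are made up for illustration.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

# Hypothetical example data: one query and a tiny stand-in "database".
query_smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin
database_smiles = [
    "OC(=O)c1ccccc1O",              # salicylic acid
    "CC(=O)Nc1ccc(O)cc1",           # paracetamol
    "Cn1cnc2c1c(=O)n(C)c(=O)n2C",   # caffeine
]


def rank_by_similarity(query, database, radius, n_bits=2048):
    """Rank database SMILES by similarity to the query, best first.

    Tanimoto is an assumed choice of metric; n_bits=2048 is an assumed size.
    """
    # One generator per radius: 2 corresponds to ECFP4, 3 to ECFP6.
    gen = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=n_bits)
    query_fp = gen.GetFingerprint(Chem.MolFromSmiles(query))
    db_fps = [gen.GetFingerprint(Chem.MolFromSmiles(smi)) for smi in database]
    scores = DataStructs.BulkTanimotoSimilarity(query_fp, db_fps)
    return sorted(zip(database, scores), key=lambda pair: pair[1], reverse=True)


# Print the ranking at both commonly used radii for a side-by-side look.
for radius in (2, 3):
    print(f"radius={radius}")
    for smi, score in rank_by_similarity(query_smiles, database_smiles, radius):
        print(f"  {score:.3f}  {smi}")
```

Running both radii on your own compounds is probably the quickest way to see how much fuzzier radius 2 is for your chemistry.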