Hi JeanPaul,

The difference between featmorganbv and morganbv is that the first one uses 
pharmacophore features to describe atoms, whereas the second uses atom types 
(it essentially corresponds to the ECFP descriptors). I would suggest using 
featmorganbv_fp only if you want to do fuzzier similarity searching 
(scaffold-hopping and the like).
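
To make the "fuzzier" point concrete, here is a toy sketch in plain Python (it 
does not call RDKit). It only models radius-0 atom identifiers, whereas real 
Morgan fingerprints also hash each atom's neighbourhood out to the chosen 
radius. The FEATURE_CLASS table and all function names are made up for 
illustration and are not RDKit's feature definitions.

```python
# Toy illustration only: real Morgan fingerprints also hash each atom's
# neighbourhood out to the chosen radius; here only radius 0 is modelled.

# Hypothetical, heavily simplified feature classes (not RDKit's definitions).
FEATURE_CLASS = {"F": "halogen", "Cl": "halogen", "Br": "halogen", "I": "halogen"}


def atom_type_invariant(element, aromatic):
    """ECFP-like: distinguish atoms by element and aromaticity."""
    return (element, aromatic)


def feature_invariant(element, aromatic):
    """FCFP-like: distinguish atoms only by a coarse feature class."""
    return (FEATURE_CLASS.get(element, "other"), aromatic)


def fingerprint(atoms, invariant):
    """Set of distinct radius-0 atom identifiers for a molecule."""
    return {invariant(element, aromatic) for element, aromatic in atoms}


def tanimoto(a, b):
    """Tanimoto similarity of two sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Chlorobenzene and bromobenzene as (element, is_aromatic) atom lists.
chlorobenzene = [("C", True)] * 6 + [("Cl", False)]
bromobenzene = [("C", True)] * 6 + [("Br", False)]

sim_atom_types = tanimoto(
    fingerprint(chlorobenzene, atom_type_invariant),
    fingerprint(bromobenzene, atom_type_invariant),
)
sim_features = tanimoto(
    fingerprint(chlorobenzene, feature_invariant),
    fingerprint(bromobenzene, feature_invariant),
)
print(sim_atom_types, sim_features)
```

With atom types, Cl and Br count as different atoms, so the two molecules only 
share the aromatic carbon identifier. With feature classes both halogens 
collapse into one identifier, so the molecules look identical at this level; 
that is the kind of generalization that helps scaffold-hopping and hurts when 
you need to tell close analogues apart.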

The RDKit fingerprint is (as stated below) a Daylight-like fingerprint that 
uses hashed molecular subgraphs. Whether it is OK depends on what you want to 
do with it; you may have a better in-house descriptor that is optimized for 
substructure searching, though.
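
As a rough sketch of why a hashed-subgraph fingerprint is useful for 
substructure searching, the plain-Python toy below hashes linear paths only 
(the RDKit fingerprint covers more general subgraphs and also encodes bond 
information). N_BITS, MAX_ATOMS and all function names are assumptions chosen 
for the example, not RDKit parameters.

```python
import zlib

N_BITS = 1024   # hypothetical fingerprint length
MAX_ATOMS = 4   # hypothetical maximum path length, in atoms


def linear_paths(atoms, bonds, max_atoms=MAX_ATOMS):
    """Enumerate element strings along all simple paths of up to max_atoms atoms."""
    neighbours = {i: set() for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    paths = set()

    def extend(path):
        forward = "-".join(atoms[i] for i in path)
        backward = "-".join(atoms[i] for i in reversed(path))
        # A path read in either direction must give the same key.
        paths.add(min(forward, backward))
        if len(path) < max_atoms:
            for nxt in neighbours[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return paths


def hashed_fp(atoms, bonds):
    """Fold each path into one of N_BITS bit positions; return the set of on bits."""
    return {zlib.crc32(p.encode()) % N_BITS for p in linear_paths(atoms, bonds)}


def may_contain(mol_fp, query_fp):
    """Screen: a substructure match is only possible if every query bit is set."""
    return query_fp <= mol_fp


# Ethanol (C-C-O) as the molecule, C-O and C-N as queries.
ethanol = hashed_fp(["C", "C", "O"], [(0, 1), (1, 2)])
c_o = hashed_fp(["C", "O"], [(0, 1)])
c_n = hashed_fp(["C", "N"], [(0, 1)])

print(may_contain(ethanol, c_o), may_contain(ethanol, c_n))
```

Every path in a substructure also occurs in the parent molecule, so the screen 
never rejects a true match. Because of hash collisions it can let through 
molecules that do not match, so the hits still need a real substructure check 
afterwards.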

Hope that helps
Nik


From: JP [mailto:jeanpaul.ebe...@inhibox.com]
Sent: Tuesday, March 08, 2011 11:05 AM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] Best practice: which (database) fingerprints to use ?


Hi there,

I am storing a ton of molecules (~8M - it would be a ton if you print them all 
out, and hence use all of the trees in Regent's Park) in a database and using 
fingerprints for substructure and similarity searches.  The fingerprints I am 
currently using are the ones I took blindly from the wikipages documentation 
(when in doubt, copy) - specifically torsionbv_fp, morganbv_fp and 
atompairbv_fp (from http://code.google.com/p/rdkit/wiki/DatabaseCreation2).

Now I look at the database cartridge documentation - 
http://code.google.com/p/rdkit/wiki/ReferenceDocumentation and I see there are 
others - some of which I have actually heard about:

featmorganbv_fp(mol,int) : returns a bfp which is the bit vector Morgan 
fingerprint for a molecule using chemical-feature invariants. The second 
argument provides the radius. This is an FCFP-like fingerprint.
rdkit_fp(mol) : returns a bfp which is the RDKit fingerprint for a molecule. 
This is a daylight-fingerprint using hashed molecular subgraphs.

What is the best practice here?  Is it to use rdkit_fp ? (I assume this was 
added later - and possibly the original documentation is out of date)
What is the difference between featmorganbv and the one I am using (i.e. 
morganbv_fp) ?
What do you suggest in your experience?
Any ideas will be highly appreciated - as right now I am quite without any 
myself.

Many Thanks
JP