Hi there,
I am storing a ton of molecules (~8M; it would be a ton if you printed them
all out, and hence used up all of the trees in Regent's Park) in a database
and using fingerprints for substructure and similarity searches. The
fingerprints I am currently using are the ones I took blindly from the
wiki documentation (when in doubt, copy), specifically torsionbv_fp,
morganbv_fp and atompairbv_fp (from
http://code.google.com/p/rdkit/wiki/DatabaseCreation2).
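
To make the similarity side of this concrete, here is a minimal pure-Python
sketch of how two bit-vector fingerprints are typically compared. It assumes
Tanimoto is the similarity metric and represents each fingerprint as a set of
on-bit positions. The function name and the bit positions are made up for
illustration; this is not the cartridge's implementation.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    shared = len(fp_a & fp_b)                  # bits set in both fingerprints
    union = len(fp_a) + len(fp_b) - shared     # bits set in either fingerprint
    return shared / union


# Hypothetical on-bit positions, not fingerprints of real molecules.
query = {1, 5, 9, 42}
candidate = {1, 5, 10, 42, 77}
score = tanimoto(query, candidate)
print(score)
```

Whichever fingerprint type is stored, the comparison has this shape; what
changes between the fingerprint functions is which bits get set for a given
molecule.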
Now that I look at the database cartridge documentation
(http://code.google.com/p/rdkit/wiki/ReferenceDocumentation), I see there
are others, some of which I have actually heard about:
> *featmorganbv_fp(mol,int)*: returns a bfp which is the bit vector Morgan
> fingerprint for a molecule using chemical-feature invariants. The second
> argument provides the radius. This is an FCFP-like fingerprint.
> *rdkit_fp(mol)*: returns a bfp which is the RDKit fingerprint for a
> molecule. This is a daylight-fingerprint using hashed molecular subgraphs.
What is the best practice here? Is it to use rdkit_fp? (I assume this was
added later, and possibly the original documentation is out of date.)
What is the difference between featmorganbv_fp and the one I am using (i.e.
morganbv_fp)?
What do you suggest, in your experience?
Any ideas will be highly appreciated, as right now I am quite without any
myself.
Many thanks,
JP
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss