On Dec 11, 2011, at 6:36 AM, Greg Landrum wrote:
> Assuming I implemented your FPs correctly (code attached), this is what I get:
...
> pubchem pieces:
> [05:30:38] INFO: FINISHED 50001 (41150823 total, 2948920 searched,
> 964883 found) in 80.68
> [05:30:38] INFO: screenout: 0.07, accuracy: 0.33
> #--------------------

I got your code working against my copy of Zinc (110907 compounds), with the fragments you pointed me to. On my computer I get:

[18:14:37] INFO: FINISHED 50001 (41150823 total, 2647252 searched, 880194 found) in 100.49
[18:14:37] INFO: screenout: 0.06, accuracy: 0.33

(I'm testing with a slightly older version of RDKit, which might explain the differences in the searched/found numbers.)

I'm confused about how to compare this to the other numbers you reported. You wrote about the new work you've been doing:

> New fingerprint:
> - Zinc fragments: 50 million pairs, 4413 hits, 0.9 seconds
> - Zinc leads: 50 million pairs, 1875 hits, 0.7 seconds
> - pubchem pieces: 82.3 million pairs, 2.5 million hits, 166 seconds

Does this mean the screenout for your new fingerprints is 2.5/82.3 = 0.03 for the pubchem pieces? That means it's twice as good as a screenout of 0.06, right?

But then how did your new fingerprints give 166 seconds while the fragments I sent yesterday run in 81 seconds on what I presume is the same machine?

Andrew
da...@dalkescientific.com
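
For reference, the ratios discussed in this message can be recomputed with a few lines of Python. This is only a sketch under stated assumptions: it takes screenout to be searched/total and accuracy to be found/searched, which reproduces both logged lines but is not confirmed by the benchmark code itself, and it treats the "hits" in the new-fingerprint summary as the count that survives the screen, which is exactly the open question above. The function and variable names are hypothetical.

```python
def screenout(searched, total):
    """Fraction of pairs that pass the fingerprint screen (assumed definition)."""
    return searched / total


def accuracy(found, searched):
    """Fraction of screened pairs that are real matches (assumed definition)."""
    return found / searched


# Counts copied from the two "FINISHED" log lines quoted in this message.
runs = {
    "Greg": {"total": 41150823, "searched": 2948920, "found": 964883},
    "Andrew": {"total": 41150823, "searched": 2647252, "found": 880194},
}

for name, c in runs.items():
    s = screenout(c["searched"], c["total"])
    a = accuracy(c["found"], c["searched"])
    print(f"{name}: screenout {s:.2f}, accuracy {a:.2f}")

# New-fingerprint summary for the pubchem pieces: 2.5 million hits out of
# 82.3 million pairs. Treating hits as "searched" is an assumption.
new_fp = screenout(2.5e6, 82.3e6)
print(f"new fingerprint (pubchem pieces): screenout {new_fp:.2f}")
```

Under those assumed definitions the first two lines of output match the logged 0.07/0.33 and 0.06/0.33, and the last ratio rounds to the 0.03 mentioned above.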