On Sun, Dec 11, 2011 at 6:42 PM, Andrew Dalke <da...@dalkescientific.com> wrote:
> On Dec 11, 2011, at 6:36 AM, Greg Landrum wrote:
>> Assuming I implemented your FPs correctly (code attached), this is what I
>> get:
> ...
>> pubchem pieces:
>> [05:30:38] INFO: FINISHED 50001 (41150823 total, 2948920 searched,
>> 964883 found) in 80.68
>> [05:30:38] INFO: screenout: 0.07, accuracy: 0.33
>> #--------------------
>
> I got your code working against my copy of Zinc (110907 compounds), with the
> fragments you pointed me to. On my computer I get:
>
> [18:14:37] INFO: FINISHED 50001 (41150823 total, 2647252 searched, 880194
> found) in 100.49
> [18:14:37] INFO: screenout: 0.06, accuracy: 0.33
>
> (I'm testing with a slightly older version of RDKit, which might explain the
> differences in the searched/found numbers.)
>
More likely it's the different version of ZINC.

> I'm confused about how to compare this to the other numbers you reported. You
> wrote about the new work you've been doing:
>
>> New fingerprint:
>> - Zinc fragments: 50 million pairs, 4413 hits, 0.9 seconds
>> - Zinc leads: 50 million pairs, 1875 hits, 0.7 seconds
>> - pubchem pieces: 82.3 million pairs, 2.5 million hits, 166 seconds
>
> Does this mean the screenout for your new fingerprints is 2.5/82.3 = 0.03 for
> the pubchem pieces? That means it's twice as good as a screenout of 0.06,
> right? But then how did your new fingerprints give 166 seconds while the
> fragments I sent yesterday run in 81 seconds on what I presume is the same
> machine?

It's not comparing the same thing. The 166 second number is from a query in
PostgreSQL across the full 100K compounds, and the 81 second number is from
the Python script I sent, which uses a 50K subset. Sorry I wasn't clear about
that.

The 2.5/82.3 number above is the hit rate of the query (the number of results
found relative to the number of pairs), not a screenout. I don't know of any
convenient way to directly measure the screenout performance in the cartridge,
so I can only report the number of hits and the run time.

-greg

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
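
For anyone following the arithmetic in this thread, here is a minimal sketch of how the quoted numbers relate. It assumes that screenout = searched / total and accuracy = found / searched. Those definitions are inferred only from the fact that they reproduce the logged values; they are not taken from the script itself. The helper name `screening_stats` is hypothetical.

```python
def screening_stats(total, searched, found):
    """Return (screenout, accuracy) under the assumed definitions:
    screenout = searched / total  (fraction of pairs that pass the fingerprint screen)
    accuracy  = found / searched  (fraction of screened pairs that are real matches)
    """
    return searched / total, found / searched


# Counts copied from the two "FINISHED 50001 (...)" log lines quoted in this thread.
greg = screening_stats(total=41150823, searched=2948920, found=964883)
andrew = screening_stats(total=41150823, searched=2647252, found=880194)

# The cartridge query only reports final hits, so this is a hit rate
# (hits / pairs), not a screenout.
cartridge_hit_rate = 2.5e6 / 82.3e6

# Assumed closest counterpart from the Python script: final matches per pair.
# It comes from the 50K subset, so it is still not directly comparable.
script_hit_rate = 964883 / 41150823

print("greg:   screenout %.2f, accuracy %.2f" % greg)
print("andrew: screenout %.2f, accuracy %.2f" % andrew)
print("cartridge hit rate: %.2f" % cartridge_hit_rate)
print("script hit rate:    %.3f" % script_hit_rate)
```

The point of the sketch is that 0.03 and 0.06 measure different things: the first counts final hits per pair, while the second counts pairs that survive the fingerprint screen before the full substructure match.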