On Sun, Dec 11, 2011 at 6:42 PM, Andrew Dalke <da...@dalkescientific.com> wrote:
> On Dec 11, 2011, at 6:36 AM, Greg Landrum wrote:
>> Assuming I implemented your FPs correctly (code attached), this is what I 
>> get:
>  ...
>> pubchem pieces:
>> [05:30:38] INFO: FINISHED 50001 (41150823 total, 2948920 searched,
>> 964883 found) in 80.68
>> [05:30:38] INFO:   screenout: 0.07, accuracy: 0.33
>> #--------------------
>
> I got your code working against my copy of Zinc (110907 compounds), with the 
> fragments you pointed me to. On my computer I get:
>
> [18:14:37] INFO: FINISHED 50001 (41150823 total, 2647252 searched, 880194 
> found) in 100.49
> [18:14:37] INFO:   screenout: 0.06, accuracy: 0.33
>
> (I'm testing with a slightly older version of RDKit, which might explain the 
> differences in the searched/found numbers.)
>

More likely it's the different version of ZINC.

> I'm confused about how to compare this to the other numbers you reported. You 
> wrote about the new work you've been doing:
>
>> New fingerprint:
>> - Zinc fragments: 50 million pairs, 4413 hits, 0.9 seconds
>> - Zinc leads: 50 million pairs, 1875 hits, 0.7 seconds
>> - pubchem pieces: 82.3 million pairs, 2.5 million hits, 166 seconds
>
>
> Does this mean the screenout for your new fingerprints is 2.5/82.3 = 0.03 for 
> the pubchem pieces? That means it's twice as good as a screenout of 0.06, 
> right? But then how did your new fingerprints give 166 seconds while the 
> fragments I sent yesterday run in 81 seconds on what I presume is the same 
> machine?

Those two numbers aren't measuring the same thing. The 166 second
number is from a query in PostgreSQL across the full 100K compounds;
the 81 second number is from the Python script I sent, which uses a
50K subset. Sorry I wasn't clear about that.

The 2.5/82.3 number above is the hit rate of the query (the number of
results found divided by the number of pairs), not the screenout. I
don't know of any convenient way to directly measure the screenout
performance in the cartridge, so I can only report the number of hits
and the run time.

-greg

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
