Just for your information, here are some timings for the BCUT descriptor.

On my machine (Windows XP, Intel 2.4 GHz, 1 GB RAM, JDK 1.6):

50 structures:
  was 1 min 15 sec
  now 12 sec

5059 structures:
  was 1 h 32 min 44 sec
  now 16 min 38 sec

So a big improvement.
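
To put a number on it, here is a small sketch that converts the timings above to seconds and divides old by new. The class and method names are made up for illustration and are not part of CDK; only the timings themselves come from my runs.

```java
// Hypothetical helper, not part of CDK: converts the timings reported above
// to seconds and computes the speedup factor (old time / new time).
final class BcutSpeedup {

    // Convert hours, minutes and seconds to a total number of seconds.
    static long toSeconds(long hours, long minutes, long seconds) {
        return hours * 3600 + minutes * 60 + seconds;
    }

    // Speedup factor: how many times faster the new run is than the old one.
    static double speedup(long oldSeconds, long newSeconds) {
        return (double) oldSeconds / newSeconds;
    }

    public static void main(String[] args) {
        long old50 = toSeconds(0, 1, 15);     // 50 structures, before
        long new50 = toSeconds(0, 0, 12);     // 50 structures, after
        long old5059 = toSeconds(1, 32, 44);  // 5059 structures, before
        long new5059 = toSeconds(0, 16, 38);  // 5059 structures, after

        System.out.printf("50 structures:   %.2fx faster%n", speedup(old50, new50));
        System.out.printf("5059 structures: %.2fx faster%n", speedup(old5059, new5059));
        System.out.printf("time per structure now: %.3f s%n", (double) new5059 / 5059);
    }
}
```

That works out to roughly 6.3x for the small set and 5.6x for the large one, or about 0.2 sec per structure now.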

Thanks again.

> -----Original Message-----
> From: Rajarshi Guha [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, February 19, 2008 19:45
> To: Peter Maas
> Cc: CDK users list
> Subject: Re: [Cdk-user] a very strange problem wrt argument 
> passing and memory locations
> 
> 
> On Feb 19, 2008, at 12:03 PM, Peter Maas wrote:
> 
> > Well I copied most of your examples.
> > Please find it enclosed.
> > I'm running it against our  10 mg stock (>250k structures).
> 
> Hmm, I ran it on OS X (2.2 GHz, 1GB RAM, JDK 1.5) and it processed
> 277 relatively small molecules in 7 sec.
> 
> I did some rough testing of my own and I used a molecule from 
> PubChem (CID = 52) which has 55 heavy atoms. It turns out 
> that the polarizability code takes nearly a minute to run.
> 
> I tweaked the polarizability calculation so that it now takes 
> 4 s to run, bringing the processing time for this molecule 
> down to 4.9 s.
> Also, the 277-molecule SDF file I mentioned above now takes 3.1 s.
> 
> However, it does still slow down for some large molecules 
> (such as PubChem CID 182), and I suspect that the path length 
> calculation could be improved. I'll look at that in a few 
> days. Are your molecules very large?
> 
> In any case, the latest improvements are in SVN, so you 
> should sync and recompile. Things should go faster.
> 
> > I like to give R a shot clustering it but I'm afraid R also will not
> > be up to it.
> 
> Well, creating a 250K x 250K distance matrix will bring most 
> machines to their knees, unless you have a very large amount 
> of RAM. But you could look at methods like spectral 
> clustering, which can be more efficient for larger datasets.
> 
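
As a rough back-of-the-envelope check on the matrix size mentioned above, the sketch below estimates the raw storage needed. It assumes one 8-byte double per distance and ignores any JVM or R overhead; the class and method names are made up for illustration.

```java
// Hypothetical estimate, not CDK or R code: raw storage for an n x n
// distance matrix, assuming a fixed number of bytes per stored distance.
final class DistanceMatrixMemory {

    // Full square matrix: n * n entries.
    static long fullMatrixBytes(long n, int bytesPerEntry) {
        return n * n * bytesPerEntry;
    }

    // Condensed form: only the n * (n - 1) / 2 pairs above the diagonal,
    // since distances are symmetric and the diagonal is zero.
    static long condensedBytes(long n, int bytesPerEntry) {
        return n * (n - 1) / 2 * bytesPerEntry;
    }

    public static void main(String[] args) {
        long n = 250_000L;          // number of structures
        int bytesPerDouble = 8;     // assumption: double precision

        double gb = 1e9;            // decimal gigabytes
        System.out.printf("full matrix:      %.1f GB%n",
                fullMatrixBytes(n, bytesPerDouble) / gb);
        System.out.printf("condensed matrix: %.1f GB%n",
                condensedBytes(n, bytesPerDouble) / gb);
    }
}
```

Under those assumptions the full matrix needs about 500 GB and even the condensed upper triangle needs about 250 GB, so anything that avoids materialising all pairwise distances is worth a look.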
> -------------------------------------------------------------------
> Rajarshi Guha  <[EMAIL PROTECTED]>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
> -------------------------------------------------------------------
> After an instrument has been assembled, extra components will 
> be found on the bench.
> 
> 
> 
> 


