Just for your information, on the BCUT descriptor (on my Windows XP machine: Intel 2.4 GHz, 1 GB RAM, JDK 1.6):

50 structures: was 1 min 15 sec, now 12 sec
5059 structures: was 1h 32m 44s, now 16m 38s

So big improvement. Thanks again.

> -----Original Message-----
> From: Rajarshi Guha [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, February 19, 2008 19:45
> To: Peter Maas
> Cc: CDK users list
> Subject: Re: [Cdk-user] a very strange problem wrt argument
> passing and memory locations
>
>
> On Feb 19, 2008, at 12:03 PM, Peter Maas wrote:
>
> > Well I copied most of your examples.
> > Please find it enclosed.
> > I'm running it against our 10 mg stock (>250k structures).
>
> Hmm, I ran it on OS X (2.2 GHz, 1GB RAM, JDK 1.5) and it processed
> 277 relatively small molecules in 7 sec.
>
> I did some rough testing of my own and I used a molecule from
> PubChem (CID = 52) which has 55 heavy atoms. It turns out
> that the polarizability code takes nearly a minute to run.
>
> I tweaked the polarizability calculation so that now it takes
> 4sec to run, bringing the processing time for this molecule
> down to 4.9s.
> Also, the 277 SDF file I mentioned above now takes 3.1s
>
> However it does still slow down for some large molecules
> (such as Pubchem CID 182) and I suspect that path length
> calculation could be improved. I'll look at that in a few
> days. Are your molecules very large?
>
> In any case, the latest improvements are in SVN, so you
> should sync and recompile. Things should go faster.
>
> > I like to give R a shot clustering it but I'm afraid R also
> > will not be up to it.
>
> Well creating a 250K x 250K distance matrix will bring most
> machines to their knees, unless you have a very large amount
> of RAM. But you could look at methods like spectral
> clustering etc which can be more efficient for larger datasets
>
> -------------------------------------------------------------------
> Rajarshi Guha <[EMAIL PROTECTED]>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
> -------------------------------------------------------------------
> After an instrument has been assembled, extra components will
> be found on the bench.
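
For anyone wanting to check the numbers in this thread, here is a rough back-of-the-envelope sketch in Python. It is not part of the CDK or of either poster's code. The function names are made up for illustration, and the 8 bytes per value is an assumption (double-precision distances). It estimates how much memory a 250K x 250K distance matrix would need, both as a full square matrix and as a condensed lower triangle, and works out the speedups from the timings reported above.

```python
def distance_matrix_bytes(n, bytes_per_value=8, condensed=False):
    """Bytes needed to store pairwise distances for n items.

    bytes_per_value=8 assumes double precision (an assumption, not
    something stated in the thread). condensed=True counts only the
    n*(n-1)/2 entries below the diagonal.
    """
    entries = n * (n - 1) // 2 if condensed else n * n
    return entries * bytes_per_value


def speedup(before_seconds, after_seconds):
    """How many times faster the new run is than the old one."""
    return before_seconds / after_seconds


# Memory estimate for ~250k structures.
n = 250_000
full_gb = distance_matrix_bytes(n) / 1e9
condensed_gb = distance_matrix_bytes(n, condensed=True) / 1e9
print(f"full matrix:      {full_gb:.1f} GB")       # 500.0 GB
print(f"condensed matrix: {condensed_gb:.1f} GB")  # 250.0 GB

# Timings reported for the BCUT descriptor, converted to seconds.
small_before = 1 * 60 + 15               # 1 min 15 sec
small_after = 12                         # 12 sec
large_before = 1 * 3600 + 32 * 60 + 44   # 1h 32m 44s
large_after = 16 * 60 + 38               # 16m 38s
print(f"50 structures:   {speedup(small_before, small_after):.2f}x faster")
print(f"5059 structures: {speedup(large_before, large_after):.2f}x faster")
```

Even the condensed form is far beyond a machine with 1 GB of RAM, which is consistent with the suggestion to look at methods that avoid building the full distance matrix.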