Lochana,

On Wed, Sep 18, 2013 at 8:23 PM, Egon Willighagen
<[email protected]> wrote:
> I'll try in a second to process ChEMBL with the iterating reader to
> see how long that takes...

Reading a 2.8 GB ChEMBL SD file (see their FTP site) with the
IteratingMDLReader and the DefaultChemObjectBuilder took:

real    9m58.781s
user    9m14.000s
sys     0m8.528s

I used this Groovy code (carbon copy from my book;
code/IteratingMDLReaderDemo.groovy):

import org.openscience.cdk.interfaces.*;
import org.openscience.cdk.io.*;
import org.openscience.cdk.io.iterator.*;
import org.openscience.cdk.*;
import org.openscience.cdk.tools.manipulator.*;

// Stream the SD file one record at a time instead of loading it whole
iterator = new IteratingMDLReader(
  new File("chembl_17.sdf").newReader(),
  DefaultChemObjectBuilder.getInstance()
)
while (iterator.hasNext()) {
  // Each call parses the next molecule; nothing else is done with it
  IMolecule mol = iterator.next()
}

As the load on the two cores stays below 100%, the run seems to be
largely limited by disk throughput (see the attached screenshot). Also
notice that after some seconds the CPU load drops and the disk
throughput doubles; I guess this is the JIT compiler kicking in.
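
One way to check the disk-throughput guess would be to time a plain read of the same file with no CDK parsing at all. The sketch below is only an illustration in plain Java: the class name SdfReadBaseline and the method countRecords are hypothetical and not part of the CDK. It counts the "$$$$" lines that end each record in an SD file. If this takes nearly as long as the run above, the disk is likely the bottleneck; if it is much faster, the parsing dominates.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not part of the CDK: reads an SD file line by line
// without building any molecules, as a rough I/O-only baseline.
public class SdfReadBaseline {

    // Counts records by counting the "$$$$" delimiter lines that end each one.
    public static long countRecords(Reader source) throws IOException {
        long records = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("$$$$")) {
                    records++;
                }
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name used in the Groovy script;
        // pass another path as the first argument.
        String path = args.length > 0 ? args[0] : "chembl_17.sdf";
        long start = System.nanoTime();
        long records = countRecords(new FileReader(path));
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d records in %.1f s%n", records, seconds);
    }
}
```

Note that a second run over the same file may be served from the operating system's file cache, so the first run is the one to compare against the timings above.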

With the SilentChemObjectBuilder these are the timings:

real    9m46.986s
user    8m47.536s
sys     0m8.708s

How does this compare with your 'extremely slow'?

Grtz,

Egon

-- 
Dr E.L. Willighagen
Postdoctoral Researcher
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286

<<attachment: cpuUse.png>>
