Lochana,

On Wed, Sep 18, 2013 at 8:23 PM, Egon Willighagen <[email protected]> wrote:
> I'll try in a second to process ChEMBL with the iterating reader to
> see how long that takes...

With the IteratingMDLReader and the DefaultChemObjectBuilder, reading a
2.8 GB ChEMBL SD file (see their FTP site) takes:
real 9m58.781s
user 9m14.000s
sys 0m8.528s
I used this Groovy code (carbon copy from my book;
code/IteratingMDLReaderDemo.groovy):
import org.openscience.cdk.interfaces.*;
import org.openscience.cdk.io.*;
import org.openscience.cdk.io.iterator.*;
import org.openscience.cdk.*;
import org.openscience.cdk.tools.manipulator.*;

iterator = new IteratingMDLReader(
  new File("chembl_17.sdf").newReader(),
  DefaultChemObjectBuilder.getInstance()
)
while (iterator.hasNext()) {
  IMolecule mol = iterator.next()
}
As the load on the two cores is not 100%, it seems largely limited by
disk throughput... see the attached screenshot. Also notice that
after some seconds the CPU load drops and the disk throughput
doubles... I guess this is the JIT kicking in...
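
For scale, 2.8 GB in roughly 599 seconds averages out to just under
5 MB/s. One way to check the disk-bound guess would be a raw read of the
same file that builds no molecules at all. The sketch below is only that,
a sketch: countRecords and baseline are hypothetical helper names, it uses
plain JDK/Groovy calls and no CDK, and it relies on SD file records being
terminated by a line starting with $$$$.

```groovy
// Count SD file records by reading lines only; no CDK parsing involved.
// Each record in an SD file ends with a line starting with "$$$$".
def countRecords(File sdf) {
    long records = 0
    sdf.withReader { reader ->
        String line
        while ((line = reader.readLine()) != null) {
            if (line.startsWith('$$$$')) records++
        }
    }
    return records
}

// Time the raw read and report the average throughput in MB/s.
def baseline(File sdf) {
    long start = System.nanoTime()
    long records = countRecords(sdf)
    double seconds = (System.nanoTime() - start) / 1.0e9d
    double megabytes = sdf.length() / (1024.0d * 1024.0d)
    return [records: records, seconds: seconds, mbPerSecond: megabytes / seconds]
}

// Example call (same file name as in the script above):
// println baseline(new File("chembl_17.sdf"))
```

If the raw read finishes much faster than the ten minutes above, the
time is going into parsing rather than into the disk. Note that a second
run may be served from the OS file cache, so the first run is the more
honest number.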
With the SilentChemObjectBuilder these are the timings:
real 9m46.986s
user 8m47.536s
sys 0m8.708s
How does this compare with your 'extremely slow'?
Grtz,
Egon
--
Dr E.L. Willighagen
Postdoctoral Researcher
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286
<<attachment: cpuUse.png>>
_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user

