New hint: I commented out ((LuceneIndexService) indexService).enableCache( "uri", 500000 ); and it started to work. Is this cache setting interfering with commit()?
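
A minimal sketch of the batching pattern discussed in the quoted messages below: count only non-empty lines and commit once the pending count reaches the batch size, using >= so the threshold cannot be stepped over. BatchCommitSketch, TripleSink and CountingSink are hypothetical stand-ins, not Sesame or neo4j-rdf-sail classes; in the real loader the two sink methods would presumably map to rc.add(...) and rc.commit(). It only illustrates the counting logic and makes no claim about what actually caused the OutOfMemoryError.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: none of these types come from Sesame or neo4j-rdf-sail.
class BatchCommitSketch {

    /** Stand-in for the rc.add(...) / rc.commit() calls used in the thread. */
    interface TripleSink {
        void add(String subject, String predicate, String object);
        void commit();
    }

    /** Test double that only counts how often each method is called. */
    static class CountingSink implements TripleSink {
        int adds = 0;
        int commits = 0;
        public void add(String subject, String predicate, String object) { adds++; }
        public void commit() { commits++; }
    }

    /**
     * Reads whitespace-separated triples and commits every batchSize statements.
     * Like the snippet in the thread, it takes the first three tokens of each
     * line, so literals containing spaces are not handled.
     * Returns the number of statements passed to the sink.
     */
    static int load(BufferedReader reader, TripleSink sink, int batchSize) {
        int total = 0;
        int pending = 0; // statements added since the last commit
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // blank lines are not counted
                }
                String[] parts = trimmed.split("\\s+");
                if (parts.length < 3) {
                    continue; // skip lines that are not triples
                }
                sink.add(parts[0], parts[1], parts[2]);
                total++;
                pending++;
                if (pending >= batchSize) { // >= cannot miss the threshold
                    sink.commit();
                    pending = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (pending > 0) {
            sink.commit(); // flush the final partial batch
        }
        return total;
    }

    public static void main(String[] args) {
        String data = "<a> <b> <c> .\n\n<d> <e> <f> .\n<g> <h> <i> .\n";
        CountingSink sink = new CountingSink();
        int total = load(new BufferedReader(new StringReader(data)), sink, 2);
        System.out.println(total + " statements, " + sink.commits + " commits");
    }
}
```

Counting inside the non-empty branch keeps the counter and the commit check in step, which is the situation the "count even if the line is empty" remark in the thread is about.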
> It does not solve the problem. I removed the if (count>=10000) { block
> and now commit after every rc.add, but the program fails on the last file.
> Moreover, I ran the program with rc.add( file, "",
> RDFFormat.NTRIPLES,context); committing only once, after all files are
> loaded, and it finished successfully.
>
> file_: links_uscensus_en.nt
> [WARNING] an additional exception was thrown
> java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:592)
>     at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:283)
>     at java.lang.Thread.run(Thread.java:613)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:172)
>     at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:136)
>     at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:247)
>     at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
>     at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>     at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
>     at org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:144)
>     at org.apache.lucene.search.TermScorer.nextDoc(TermScorer.java:130)
>     at org.apache.lucene.search.TermScorer.score(TermScorer.java:74)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:248)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:173)
>     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:113)
>     at org.apache.lucene.search.Hits.<init>(Hits.java:80)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:52)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:42)
>     at org.neo4j.index.lucene.LuceneIndexService.searchForNodes(LuceneIndexService.java:387)
>     at org.neo4j.index.lucene.LuceneIndexService.getNodes(LuceneIndexService.java:272)
>     at org.neo4j.index.lucene.LuceneIndexService.getNodes(LuceneIndexService.java:228)
>     at org.neo4j.index.lucene.LuceneIndexService.getSingleNode(LuceneIndexService.java:405)
>     at org.neo4j.rdf.store.representation.standard.AbstractUriBasedExecutor.lookupNode(AbstractUriBasedExecutor.java:162)
>     at org.neo4j.rdf.store.representation.standard.AbstractUriBasedExecutor.lookupOrCreateNode(AbstractUriBasedExecutor.java:177)
>     at org.neo4j.rdf.store.representation.standard.VerboseQuadExecutor.handleAddObjectRepresentation(VerboseQuadExecutor.java:262)
>     at org.neo4j.rdf.store.representation.standard.VerboseQuadExecutor.addToNodeSpace(VerboseQuadExecutor.java:70)
>     at org.neo4j.rdf.store.RdfStoreImpl.addStatement(RdfStoreImpl.java:89)
>     at org.neo4j.rdf.store.RdfStoreImpl.addStatements(RdfStoreImpl.java:69)
>     at org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.internalAddStatement(GraphDatabaseSailConnectionImpl.java:623)
>     at org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.innerAddStatement(GraphDatabaseSailConnectionImpl.java:440)
>     at org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.addStatement(GraphDatabaseSailConnectionImpl.java:478)
>     at org.openrdf.repository.sail.SailRepositoryConnection.addWithoutCommit(SailRepositoryConnection.java:228)
>     at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:460)
>     at gov.lanl.memento.core.Test.get(Test.java:229)
>     at gov.lanl.memento.core.Test.main(Test.java:103)
> [INFO] ------------------------------------------------------------------------
> [ERROR] BUILD ERROR
> [INFO] ------------------------------------------------------------------------
> [INFO] An exception occured while executing the Java class.
> Java heap space
>
> [INFO] ------------------------------------------------------------------------
> [INFO] Trace
> org.apache.maven.lifecycle.LifecycleExecutionException: An exception occured while executing the Java class. Java heap space
>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:583)
>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:512)
>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:482)
>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:330)
>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:291)
>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
>     at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
>     at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
>     at org.apache.maven.cli.MavenCli.main(MavenCli.java:287)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:592)
>     at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
>     at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
>     at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
>     at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
> Caused by: org.apache.maven.plugin.MojoExecutionException: An exception occured while executing the Java class. Java heap space
>     at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
>     at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
>     ... 16 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at org.neo4j.kernel.impl.cache.AdaptiveCacheManager.adaptCaches(AdaptiveCacheManager.java:237)
>     at org.neo4j.kernel.impl.cache.AdaptiveCacheManager$AdaptiveCacheWorker.run(AdaptiveCacheManager.java:218)
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 4 minutes 33 seconds
> [INFO] Finished at: Thu Apr 15 09:34:27 MDT 2010
> [INFO] Final Memory: 12M/61M
> [INFO] ------------------------------------------
>
>> You increment the "count" variable even if the line is empty, which
>> means that it could skip past the 10000 mark sometimes. Change the
>> condition to "if(count >=10000)" instead. Also, how much heap have you
>> given the JVM?
>>
>> 2010/4/14 Lyudmila L. Balakireva <lu...@lanl.gov>
>>
>>> Hi,
>>> I was loading on a per-file basis and committing after each file.
>>> The 22 million file finished in 5 hours, but the 66 million one did not
>>> finish in 3 days.
>>> I rewrote the program to read each file and commit after a fixed number
>>> of records, hoping to control memory better,
>>> but the program fails with an "out of memory" error even for a small
>>> dataset (80 000). (With the per-file approach the small dataset loaded
>>> without problem.)
>>> my snippet:
>>> for ( File file : files )
>>> {   SimpleTimer timer = new SimpleTimer();
>>>     FileInputStream in = new FileInputStream(file);
>>>     BufferedReader br = new BufferedReader(new InputStreamReader(in));
>>>     String strLine;
>>>     int count=0;
>>>     while ((strLine = br.readLine()) != null) {
>>>         count = count+1;
>>>
>>>         if (strLine.trim().length() != 0)
>>>         {
>>>             String[] result = strLine.split("\\s");
>>>
>>>             rc.add(f.createURI(stripeN3(result[0])),f.createURI(stripeN3(result[1])),f.createURI(stripeN3(result[2])), context) ;
>>>
>>>             if (count==10000)
>>>             {
>>>                 //rc.add( file, "", RDFFormat.NTRIPLES,context);
>>>
>>>                 rc.commit();
>>>                 count = 0;
>>>             }
>>>         }
>>>     }
>>>
>>>     br.close();
>>>     in.close();
>>>     rc.commit();
>>>
>>>     timer.end();
>>> }
>>>
>>> sumtimer.end();
>>> rc.commit();
>>> rc.close();
>>> }
>>>
>>> What can cause the problem:
>>> [INFO] Trace
>>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception occured while executing the Java class.
>>> Java heap space
>>>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:583)
>>>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:512)
>>>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:482)
>>>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:330)
>>>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:291)
>>>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
>>>     at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
>>>     at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
>>>     at org.apache.maven.cli.MavenCli.main(MavenCli.java:287)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:592)
>>>     at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
>>>     at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
>>>     at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
>>>     at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
>>> Caused by: org.apache.maven.plugin.MojoExecutionException: An exception occured while executing the Java class. Java heap space
>>>     at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
>>>     at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
>>>     at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
>>>     ... 16 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>     at java.nio.ByteBuffer.wrap(ByteBuffer.java:350)
>>>     at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
>>>     at org.neo4j.kernel.impl.transaction.XidImpl.getNewGlobalId(XidImpl.java:55)
>>>     at org.neo4j.kernel.impl.transaction.TransactionImpl.<init>(TransactionImpl.java:67)
>>>     at org.neo4j.kernel.impl.transaction.TxManager.begin(TxManager.java:497)
>>>     at org.neo4j.kernel.EmbeddedGraphDbImpl.beginTx(EmbeddedGraphDbImpl.java:238)
>>>     at org.neo4j.kernel.EmbeddedGraphDatabase.beginTx(EmbeddedGraphDatabase.java:139)
>>>     at org.neo4j.index.impl.GenericIndexService.beginTx(GenericIndexService.java:105)
>>>     at org.neo4j.index.impl.IndexServiceQueue.run(IndexServiceQueue.java:221)
>>> [INFO] ------------------------------------------------------------------------
>>> [INFO] Total time: 5 minutes 14 seconds
>>>
>>> Thank you for the help,
>>> Lyudmila
>>>
>>> > There are some problems at the moment regarding insertion speeds.
>>> >
>>> > o We haven't yet created an rdf store which can use a BatchInserter
>>> >   (which could also be tweaked to skip checking whether statements
>>> >   already exist before it adds each statement and all that).
>>> > o The other one is that the sail layer on top of the neo4j-rdf
>>> >   component contains functionality which allows a thread to have more
>>> >   than one running transaction at the same time. This was added due to
>>> >   some users' requirements, but slows it down by a factor of 2 or
>>> >   something (not sure about this).
>>> >
>>> > I would like to see both these issues resolved soon, and when they are
>>> > fixed insertion speeds will be quite nice!
>>> >
>>> > 2010/4/9 Lyudmila L. Balakireva <lu...@lanl.gov>
>>> >
>>> >> Hi,
>>> >> How can I optimize loading into the VerboseQuadStore?
>>> >> I am running a test similar to the test example from neo rdf sail and
>>> >> it is very slow.
>>> >> The files are 3G - 7G in size.
>>> >> Thanks,
>>> >> Luda
>>> >> _______________________________________________
>>> >> Neo mailing list
>>> >> User@lists.neo4j.org
>>> >> https://lists.neo4j.org/mailman/listinfo/user
>>> >>
>>> >
>>> > --
>>> > Mattias Persson, [matt...@neotechnology.com]
>>> > Hacker, Neo Technology
>>> > www.neotechnology.com
>>
>> --
>> Mattias Persson, [matt...@neotechnology.com]
>> Hacker, Neo Technology
>> www.neotechnology.com
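
The question in the thread about how much heap the JVM was given can be answered from inside the program. Below is a minimal sketch using only java.lang.Runtime; HeapCheck is a hypothetical class name. The stack traces show the program being started through ExecJavaMojo, which runs it inside Maven's own JVM, so the limit reported here would presumably be whatever Maven was started with (for example via MAVEN_OPTS). That is an assumption about this setup, not something the thread confirms.

```java
// Hypothetical helper for reporting the JVM heap limits; uses only java.lang.Runtime.
class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Heap currently reserved by the JVM, in megabytes. */
    static long totalHeapMegabytes() {
        return Runtime.getRuntime().totalMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap:   " + maxHeapMegabytes() + "M");
        System.out.println("total heap: " + totalHeapMegabytes() + "M");
    }
}
```

Calling HeapCheck.maxHeapMegabytes() at the start of the loader would show whether a larger -Xmx setting actually reached the JVM that runs the import.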