New hint:
I commented out the call
((LuceneIndexService) indexService).enableCache( "uri", 500000 );
and it started to work. Is this cache setting somehow mixed up with commit()?
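
For reference, here is a minimal sketch of the batching logic discussed in the
quoted messages below: the counter advances only for statements that were
actually added, and the threshold is tested with >= so it cannot be stepped
over. TripleSink and CountingSink are hypothetical stand-ins for the repository
connection (rc), not part of Sesame or neo4j-rdf, and the whitespace split is
the same simplification as in the quoted snippet (it does not handle literals
that contain spaces).

```java
import java.util.Arrays;
import java.util.List;

public class BatchCommitSketch {

    /** Hypothetical stand-in for the repository connection (rc). */
    interface TripleSink {
        void add(String subject, String predicate, String object);
        void commit();
    }

    /**
     * Feeds whitespace-separated triples to the sink and commits after every
     * batchSize added statements. Returns the number of statements added.
     */
    static int load(Iterable<String> lines, TripleSink sink, int batchSize) {
        int added = 0;    // total statements added
        int pending = 0;  // statements added since the last commit
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // blank lines do not advance the counter
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length < 3) {
                continue; // skip lines that do not carry three terms
            }
            sink.add(parts[0], parts[1], parts[2]);
            added++;
            pending++;
            if (pending >= batchSize) { // >= so the mark cannot be skipped
                sink.commit();
                pending = 0;
            }
        }
        if (pending > 0) {
            sink.commit(); // flush the final partial batch
        }
        return added;
    }

    /** Counting sink used only to demonstrate the control flow. */
    static final class CountingSink implements TripleSink {
        int adds;
        int commits;
        public void add(String s, String p, String o) { adds++; }
        public void commit() { commits++; }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "<a> <p> <b> .", "", "<b> <p> <c> .", "   ", "<c> <p> <d> .");
        CountingSink sink = new CountingSink();
        int added = load(lines, sink, 2);
        // prints "3 statements, 2 commits"
        System.out.println(added + " statements, " + sink.commits + " commits");
    }
}
```

In the real loader, the sink methods would delegate to rc.add(...) and
rc.commit(); the batch size would be the 10000 used in the thread.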


> It does not solve the problem. I removed the if (count>=10000) { block
> and now commit after every rc.add, but the program fails on the last file.
> Moreover, I ran the program with rc.add( file, "",
> RDFFormat.NTRIPLES,context); committing only once, after all files are
> loaded, and it finished successfully.
>
>
>
> file_: links_uscensus_en.nt
> [WARNING] an additional exception was thrown
> java.lang.reflect.InvocationTargetException
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:592)
>       at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:283)
>       at java.lang.Thread.run(Thread.java:613)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>       at
> org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:172)
>       at
> org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:136)
>       at
> org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:247)
>       at
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
>       at
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>       at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
>       at 
> org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:144)
>       at org.apache.lucene.search.TermScorer.nextDoc(TermScorer.java:130)
>       at org.apache.lucene.search.TermScorer.score(TermScorer.java:74)
>       at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:248)
>       at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:173)
>       at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:113)
>       at org.apache.lucene.search.Hits.<init>(Hits.java:80)
>       at org.apache.lucene.search.Searcher.search(Searcher.java:52)
>       at org.apache.lucene.search.Searcher.search(Searcher.java:42)
>       at
> org.neo4j.index.lucene.LuceneIndexService.searchForNodes(LuceneIndexService.java:387)
>       at
> org.neo4j.index.lucene.LuceneIndexService.getNodes(LuceneIndexService.java:272)
>       at
> org.neo4j.index.lucene.LuceneIndexService.getNodes(LuceneIndexService.java:228)
>       at
> org.neo4j.index.lucene.LuceneIndexService.getSingleNode(LuceneIndexService.java:405)
>       at
> org.neo4j.rdf.store.representation.standard.AbstractUriBasedExecutor.lookupNode(AbstractUriBasedExecutor.java:162)
>       at
> org.neo4j.rdf.store.representation.standard.AbstractUriBasedExecutor.lookupOrCreateNode(AbstractUriBasedExecutor.java:177)
>       at
> org.neo4j.rdf.store.representation.standard.VerboseQuadExecutor.handleAddObjectRepresentation(VerboseQuadExecutor.java:262)
>       at
> org.neo4j.rdf.store.representation.standard.VerboseQuadExecutor.addToNodeSpace(VerboseQuadExecutor.java:70)
>       at org.neo4j.rdf.store.RdfStoreImpl.addStatement(RdfStoreImpl.java:89)
>       at org.neo4j.rdf.store.RdfStoreImpl.addStatements(RdfStoreImpl.java:69)
>       at
> org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.internalAddStatement(GraphDatabaseSailConnectionImpl.java:623)
>       at
> org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.innerAddStatement(GraphDatabaseSailConnectionImpl.java:440)
>       at
> org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.addStatement(GraphDatabaseSailConnectionImpl.java:478)
>       at
> org.openrdf.repository.sail.SailRepositoryConnection.addWithoutCommit(SailRepositoryConnection.java:228)
>       at
> org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:460)
>       at gov.lanl.memento.core.Test.get(Test.java:229)
>       at gov.lanl.memento.core.Test.main(Test.java:103)
> [INFO]
> ------------------------------------------------------------------------
> [ERROR] BUILD ERROR
> [INFO]
> ------------------------------------------------------------------------
> [INFO] An exception occured while executing the Java class. Java heap
> space
>
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Trace
> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
> occured while executing the Java class. Java heap space
>       at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:583)
>       at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:512)
>       at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:482)
>       at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:330)
>       at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:291)
>       at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
>       at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
>       at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
>       at org.apache.maven.cli.MavenCli.main(MavenCli.java:287)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:592)
>       at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
>       at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
>       at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
>       at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
> Caused by: org.apache.maven.plugin.MojoExecutionException: An exception
> occured while executing the Java class. Java heap space
>       at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
>       at
> org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
>       at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
>       ... 16 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
>       at
> org.neo4j.kernel.impl.cache.AdaptiveCacheManager.adaptCaches(AdaptiveCacheManager.java:237)
>       at
> org.neo4j.kernel.impl.cache.AdaptiveCacheManager$AdaptiveCacheWorker.run(AdaptiveCacheManager.java:218)
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Total time: 4 minutes 33 seconds
> [INFO] Finished at: Thu Apr 15 09:34:27 MDT 2010
> [INFO] Final Memory: 12M/61M
> [INFO] ------------------------------------------
>
>
>
>
>> You increment the "count" variable even if the line is empty, which means
>> that it could skip the 10000 mark sometimes. Change the condition to
>> "if (count >= 10000)" instead. Also, how much heap have you given the JVM?
>>
>> 2010/4/14 Lyudmila L. Balakireva <lu...@lanl.gov>
>>
>>> Hi,
>>> I was loading file by file and committing after each file. The
>>> 22 mln file finished in 5 hours, but the 66 mln file did not finish
>>> in 3 days.
>>> I rewrote the program to read each file and commit after a fixed number
>>> of records, hoping to control memory better, but the program fails with
>>> an "out of memory" error even for a small dataset (80 000). (With the
>>> per-file approach, the small dataset loaded without problems.)
>>> My snippet:
>>> for ( File file : files )
>>> {
>>>     SimpleTimer timer = new SimpleTimer();
>>>     FileInputStream in = new FileInputStream(file);
>>>     BufferedReader br = new BufferedReader(new InputStreamReader(in));
>>>     String strLine;
>>>     int count=0;
>>>     while ((strLine = br.readLine()) != null) {
>>>         count = count+1;
>>>
>>>         if (strLine.trim().length() != 0) {
>>>             String[] result = strLine.split("\\s");
>>>
>>>             rc.add(f.createURI(stripeN3(result[0])),
>>>                    f.createURI(stripeN3(result[1])),
>>>                    f.createURI(stripeN3(result[2])),
>>>                    context);
>>>
>>>             if (count==10000) {
>>>                 //rc.add( file, "", RDFFormat.NTRIPLES,context);
>>>
>>>                 rc.commit();
>>>                 count = 0;
>>>             }
>>>         }
>>>     }
>>>
>>>     br.close();
>>>     in.close();
>>>     rc.commit();
>>>
>>>     timer.end();
>>> }
>>>
>>> sumtimer.end();
>>> rc.commit();
>>> rc.close();
>>> }
>>>
>>> What can cause the problem?
>>> [INFO] Trace
>>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
>>> occured while executing the Java class. Java heap space
>>>        at
>>>
>>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:583)
>>>        at
>>>
>>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:512)
>>>        at
>>>
>>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:482)
>>>        at
>>>
>>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:330)
>>>        at
>>>
>>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:291)
>>>        at
>>>
>>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
>>>        at
>>> org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
>>>        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
>>>        at org.apache.maven.cli.MavenCli.main(MavenCli.java:287)
>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>        at
>>>
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>        at
>>>
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>        at java.lang.reflect.Method.invoke(Method.java:592)
>>>        at
>>> org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
>>>        at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
>>>        at
>>> org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
>>>        at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
>>> Caused by: org.apache.maven.plugin.MojoExecutionException: An exception
>>> occured while executing the Java class. Java heap space
>>>        at
>>> org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
>>>        at
>>>
>>> org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
>>>        at
>>>
>>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
>>>        ... 16 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>        at java.nio.ByteBuffer.wrap(ByteBuffer.java:350)
>>>        at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
>>>        at
>>> org.neo4j.kernel.impl.transaction.XidImpl.getNewGlobalId(XidImpl.java:55)
>>>        at
>>>
>>> org.neo4j.kernel.impl.transaction.TransactionImpl.<init>(TransactionImpl.java:67)
>>>        at
>>> org.neo4j.kernel.impl.transaction.TxManager.begin(TxManager.java:497)
>>>        at
>>> org.neo4j.kernel.EmbeddedGraphDbImpl.beginTx(EmbeddedGraphDbImpl.java:238)
>>>        at
>>>
>>> org.neo4j.kernel.EmbeddedGraphDatabase.beginTx(EmbeddedGraphDatabase.java:139)
>>>        at
>>>
>>> org.neo4j.index.impl.GenericIndexService.beginTx(GenericIndexService.java:105)
>>>        at
>>> org.neo4j.index.impl.IndexServiceQueue.run(IndexServiceQueue.java:221)
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [INFO] Total time: 5 minutes 14 seconds
>>>
>>>
>>> Thank you for the help,
>>> Lyudmila
>>>
>>>
>>>
>>>
>>> > There are some problems at the moment regarding insertion speeds.
>>> >
>>> > o We haven't yet created an rdf store which can use a BatchInserter
>>> >   (which could also be tweaked to skip checking whether statements
>>> >   already exist before it adds each statement, and so on).
>>> > o The other one is that the sail layer on top of the neo4j-rdf
>>> >   component contains functionality which allows a thread to have more
>>> >   than one running transaction at the same time. This was added due to
>>> >   some users' requirements, but slows it down by a factor of 2 or so
>>> >   (not sure about this).
>>> >
>>> > I would like to see both these issues resolved soon, and when they are
>>> > fixed insertion speeds will be quite nice!
>>> >
>>> > 2010/4/9 Lyudmila L. Balakireva <lu...@lanl.gov>
>>> >
>>> >> Hi,
>>> >> How can I optimize loading into the VerboseQuadStore?
>>> >> I am running a test similar to the test example from neo rdf sail, and
>>> >> it is very slow. The files are 3G - 7G in size.
>>> >> Thanks,
>>> >> Luda
>>> >> _______________________________________________
>>> >> Neo mailing list
>>> >> User@lists.neo4j.org
>>> >> https://lists.neo4j.org/mailman/listinfo/user
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Mattias Persson, [matt...@neotechnology.com]
>>> > Hacker, Neo Technology
>>> > www.neotechnology.com
>>> >
>>>
>>>
>>
>>
>>
>> --
>> Mattias Persson, [matt...@neotechnology.com]
>> Hacker, Neo Technology
>> www.neotechnology.com
>>
>
>

