Thanks for that. Does anyone know how much RAM a 5gb index will need?
With mx set to 27gb, it crashes when it flushes buffer at one point.
"bash-2.03$ Exception in thread "main" java.lang.ExceptionInInitializerError
at TaxonomyFinder.RelatedCatsFinder.<init>(RelatedCatsFinder.java:46)
at
wikipedia.WikipediaAnalyser$ExtractAbstractHandler.endElement(WikipediaAnalyser.java:295)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown
Source)
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown
Source)
at
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown
Source)
at
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at wikipedia.WikipediaAnalyser.parseAbstracts(WikipediaAnalyser.java:184)
at
wikipedia.WikipediaAnalyser.getRelatedCategories(WikipediaAnalyser.java:127)
at TaxonomyFinder.TaxonomyTreeMaker.main(TaxonomyTreeMaker.java:492)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -2097152
at java.util.Vector.elementAt(Unknown Source)
at
org.apache.lucene.store.RAMOutputStream.flushBuffer(RAMOutputStream.java:82)
at
org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:84)
at
org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:52)
at org.apache.lucene.store.RAMDirectory.<init>(RAMDirectory.java:68)
at org.apache.lucene.store.RAMDirectory.<init>(RAMDirectory.java:95)
at
word_coocurrence.WordCooccurrenceFinder.<clinit>(WordCooccurrenceFinder.java:50)
... 13 more"
I build the RAMDirectory from a FSDirectory, using the following:
try {
m_searcher = new IndexSearcher(new
RAMDirectory("index"));
} catch (IOException e) {
e.printStackTrace();
System.exit(1);
}
where "index" is the path for the index.
Any help will be much appreciated.
Michael
On 5/23/06, Daniel Naber <[EMAIL PROTECTED]> wrote:
On Dienstag 23 Mai 2006 08:26, Michael Chan wrote:
> As I have quite a
> bit of RAM (~20gb), is there a way I could store the index in RAM or
> any other way that makes use of it to improve performance?
RAMDirectory has just been fixed (in SVN) to work with indexes > 2 GB.
Regards
Daniel
--
http://www.danielnaber.de
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]