Kevin A. Burton wrote:

I finally had some time to take Doug's advice and reburn our indexes with a larger TermInfosWriter.INDEX_INTERVAL value.

You know... it looks like the problem is that TermInfosReader uses INDEX_INTERVAL during seeks and is probably just jumping RIGHT past the offsets that I need.


If this is going to be a practical way of reducing Lucene memory footprint for HUGE indexes then its going to need a way to change this value based on the current index thats being opened.

Is there anyway to determine the INDEX_INTERVAL from the file? It looks according to:

http://jakarta.apache.org/lucene/docs/fileformats.html

That the .tis file (which according to the docs the .tii file "is very similar to the .tis file" ) should have this data:

So according to this:

TermInfoFile (.tis)--> TIVersion, TermCount, IndexInterval, SkipInterval, TermInfos

The only problem is that the .tii and .tis files I have on disk don't have a constant preamble and doesnt' look like there's an index interval here...

Kevin

--

Use Rojo (RSS/Atom aggregator). Visit http://rojo.com. Ask me for an invite! Also see irc.freenode.net #rojo if you want to chat.

Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html

If you're interested in RSS, Weblogs, Social Networking, etc... then you should work for Rojo! If you recommend someone and we hire them you'll get a free iPod!
Kevin A. Burton, Location - San Francisco, CA
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412



--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to