The WorkspaceInfo class is unnecessary. WorkspaceDetails can be persisted directly if reworked.

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Monday, May 22, 2006 5:48 PM
To: java-dev@lucene.apache.org
Subject: Re: caching term information?

Robert Engels wrote:
> I was amazed at how much time is spent in both readVint and readByte().
> Seems high, but I think it is mainly due to the number of invocations.

Profilers have been known to exaggerate this sort of thing. These are
central routines of Lucene, but they're also pretty simple and hard to
make a lot faster.

> 1) What if BufferedIndexInput had an optimized version of readVint that used
> the buffer and manipulated the position directly?

Give it a try and see if it's much faster. Sun's JVMs are pretty smart
these days, and such micro-optimizations are proving less likely to
improve things than they used to be. Also, we don't want to tune things
too highly for any given JVM, so it would have to be substantially
faster to warrant committing something like this.

> 2) Instead of caching the TermInfo, what if the TermDocs were cached (again
> for the top 20% terms). The memory requirement would be much greater, but
> you could also say "do not cache the TermDocs that had more than X
> documents". The optimized searcher already converts TermQueries similar to
> this to a Filter anyway.

The majority of query time is typically spent processing terms that
occur in lots of documents. Terms that occur in only few documents are
faster to process, so speeding them doesn't affect overall performance
as much as one might hope.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
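
On point 1 of the quoted message, here is a minimal sketch of what reading a VInt straight from the buffer might look like, next to the path that calls readByte() once per byte. The class BufferedVIntSketch and its method names are hypothetical and are not Lucene's BufferedIndexInput. The sketch assumes the whole value is already in the buffer, so it leaves out refill logic, and it assumes the VInt layout of seven payload bits per byte, low-order group first, with the high bit marking continuation.

```java
// Hypothetical illustration only; not Lucene's BufferedIndexInput.
final class BufferedVIntSketch {
    private final byte[] buffer;
    private int position;

    BufferedVIntSketch(byte[] buffer) {
        this.buffer = buffer;
    }

    // Baseline: every byte goes through a method call.
    byte readByte() {
        return buffer[position++];
    }

    // Decodes a VInt using one readByte() call per encoded byte.
    int readVIntViaReadByte() {
        byte b = readByte();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = readByte();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    // Decodes a VInt by indexing the buffer directly and writing the
    // position back once at the end. Assumes no refill is needed.
    int readVIntDirect() {
        final byte[] buf = buffer;
        int pos = position;
        byte b = buf[pos++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = buf[pos++];
            value |= (b & 0x7F) << shift;
        }
        position = pos;
        return value;
    }

    int position() {
        return position;
    }
}
```

Both methods decode the same bytes to the same values, so timing them against each other on a realistic index is the way to find out whether the direct version is actually faster on a given JVM.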