Many thanks, Mike. UNLIMITED is right for me then. Happily, it is a reasonably controlled environment.
-----Original Message----- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: 10 December 2009 10:03 To: java-user@lucene.apache.org Subject: Re: IndexWriter.MaxFieldLength.UNLIMITED at what price? LIMITED is basically an insurance policy, protecting you from accidentally indexing an immense document, leading to OOME. It also protects you in case your analyzer is accidentally letting in bogus terms (say, if you indexed a large exe file, or there was a large base64-encoded attachment on an email message that you didn't decode). But, LIMITED is very bad for the user experience. Users will eventually catch on that your search engine is "buggy", and lose trust. I'd recommend always using UNLIMITED unless you're in a domain where there are risks of getting massive docs. And even then I'd first try to create other mechanisms to try to not index such documents... Mike On Thu, Dec 10, 2009 at 3:15 AM, Rob Staveley (Tom) <rstave...@seseit.com> wrote: > I was wondering where I might read about the cost of using > IndexWriter.MaxFieldLength.UNLIMITED versus > IndexWriter.MaxFieldLength.LIMITED. > > > > Are thee any consequences over and above the obvious one that you are going > to analyse more content in your IndexWriter when you have more than 10,000 > characters in a StringBuffer? > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org