Sure, there are other options. You could decide to index in chunks
rather than entire documents. You could decide many things,
none of which we can recommend unless we have a clue what
you're really trying to accomplish or whether you're encountering
a specific problem.
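
As a rough sketch of the chunking idea, assuming whitespace-separated
tokens and a fixed chunk size: the class name ChunkSketch and the method
chunk are hypothetical, and this is plain Java rather than anything from
the Lucene API. Each returned string would then be indexed as its own
document, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits text into chunks of
// at most maxTokens whitespace-separated tokens.
final class ChunkSketch {

    static List<String> chunk(String text, int maxTokens) {
        if (maxTokens <= 0) {
            throw new IllegalArgumentException("maxTokens must be positive");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks; // nothing to index
        }
        String[] tokens = trimmed.split("\\s+");
        // Walk the token array in steps of maxTokens; the last chunk may be shorter.
        for (int start = 0; start < tokens.length; start += maxTokens) {
            int end = Math.min(start + maxTokens, tokens.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(tokens, start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (String c : chunk("a b c d e f g", 3)) {
            System.out.println(c);
        }
    }
}
```

Whether chunking helps depends on how you need to search and display
results, since a hit then points at a chunk rather than the whole file.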

I can say that we've indexed 7,000 *page* documents by bumping the
MaxFieldLength, and performance was fine. I didn't measure indexing
speed, but it ran acceptably quickly. Search performance seems
unaffected; as far as I can tell, it depends mostly on the overall
index size and the number of unique tokens.

I suggest you just try it and measure. That's the only way to determine
whether *your* situation is adversely affected, since nobody can answer
such a general question without considerably more specifics, and even
then the answer is a qualified guess.
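
A minimal way to "try it and measure" might look like the following.
The class name TimingSketch and the helper timeMillis are hypothetical,
and the Runnable is a stand-in for whatever indexing or search call you
want to time; the loop in main is only placeholder work, not Lucene code.

```java
// Hypothetical timing helper: wraps any piece of work in a wall-clock
// measurement so two configurations can be compared on your own data.
final class TimingSketch {

    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with your own indexing run.
        long elapsed = timeMillis(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        });
        System.out.println("elapsed ms: " + elapsed);
    }
}
```

Run each configuration several times on representative files, since a
single run can be skewed by JVM warm-up and disk caching.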

But if you're *really* asking whether bumping MaxFieldLength does
something like reserve that much space for every document whether
or not it needs to, the answer is "no". A MaxFieldLength of 1,000,000,000
won't use noticeably more resources for a file with 10 tokens than if the
MaxFieldLength were 100. As far as I know.
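
To illustrate why a cutoff need not cost anything for short files, here
is a small sketch of "stop after n tokens" logic. It is an assumption
about the general shape of such a limit, not Lucene's actual
implementation; the class CutoffSketch and the method takeTokens are
hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration of a token cutoff, not Lucene's code.
final class CutoffSketch {

    static List<String> takeTokens(String text, int maxFieldLength) {
        // The list grows on demand; nothing is sized from maxFieldLength.
        List<String> kept = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        // Stop at whichever comes first: end of input or the cutoff.
        while (tokenizer.hasMoreTokens() && kept.size() < maxFieldLength) {
            kept.add(tokenizer.nextToken());
        }
        return kept;
    }

    public static void main(String[] args) {
        String small = "one two three";
        System.out.println(takeTokens(small, 1_000_000_000).size());
        System.out.println(takeTokens(small, 2).size());
    }
}
```

In this sketch the work done is bounded by the smaller of the token
count and the limit, so a very large limit only matters for files that
actually have that many tokens.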

Best
Erick

On Tue, Mar 10, 2009 at 10:47 AM, Amy Zhou <amy.z...@systemware.com> wrote:

> My issue is that a large file is truncated during indexing with the
> default MaxFieldLength of 10,000. The files I index could be 10 MB or larger.
>
> My questions are:
>
> 1) If I choose MaxFieldLength as UNLIMITED instead of 100,000, what would
> the performance be?
> 2) Any other options?
>
>
> -----Original Message-----
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: Tuesday, March 10, 2009 9:37 AM
> To: java-user@lucene.apache.org
> Subject: Re: index large size file
>
> Amy Zhou wrote:
> > Hi,
> >
> > I have a couple of questions about indexing large files. As I
> > understand it, the default MaxFieldLength is 100,000. In Lucene 2.4, we
> > can set the MaxFieldLength in the constructor. My questions are:
> >
> The default is 10,000.
> > 1) How's the performance if MaxFieldLength is set to UNLIMITED?
> >
> It depends on how long your documents are. It's simply a cutoff:
> documents longer than n (10,000 by default) terms will be truncated.
> > 2) Any other options for indexing large files?
> >
> What is the problem you are trying to address? Are you having trouble
> indexing a very large file? Can you share more details?
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
