Re: Indexing large documents

Pieter Berkel Mon, 20 Aug 2007 05:13:41 -0700

You will probably need to increase the value of maxFieldLength in your
solrconfig.xml.  The default value is 10000, which means only the first
10000 terms of each field are indexed.  That might explain why your larger
documents are not being completely indexed.
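
As a rough sketch, assuming your solrconfig.xml follows the layout of the
example config that ships with Solr, the setting appears in both the
<indexDefaults> and <mainIndex> sections.  The value 2147483647 below is
only an illustrative choice (effectively no limit); pick whatever suits
your documents:

```xml
<indexDefaults>
  <!-- other settings in this section stay as they are -->
  <maxFieldLength>2147483647</maxFieldLength>
</indexDefaults>

<mainIndex>
  <!-- other settings in this section stay as they are -->
  <maxFieldLength>2147483647</maxFieldLength>
</mainIndex>
```

After changing it you would need to restart Solr and re-post the affected
documents, since the terms past the old limit were never written to the
index.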


Piete


On 20/08/07, Peter Manis <[EMAIL PROTECTED]> wrote:
>
> That should show some errors if something goes wrong; if not, the
> console usually will.  The errors will look like Java stack trace
> output.  Did increasing the heap do anything for you?  Changing mine
> to 256mb max worked fine for all of our files.
>
> On 8/20/07, Fouad Mardini <[EMAIL PROTECTED]> wrote:
> > Well, I am using the java textmining library to extract text from
> > documents, then i do a post to solr
> > I do not have an error log, i only have *.request.log files in the logs
> > directory
> >
> > Thanks
> >
> > On 8/20/07, Peter Manis <[EMAIL PROTECTED]> wrote:
> > >
> > > Fouad,
> > >
> > > I would check the error log or console for any possible errors first.
> > > They may not show up, it really depends on how you are processing the
> > > word document (custom solr, feeding the text to it, etc).  We are
> > > using a custom version of solr with PDF, DOC, XLS, etc text extraction
> > > and I have successfully indexed 40mb documents.  I did have indexing
> > > problems with a large document or two and simply increasing the heap
> > > size fixed the problem.
> > >
> > > - Pete
> > >
> > > On 8/20/07, Fouad Mardini <[EMAIL PROTECTED]> wrote:
> > > > Hello,
> > > >
> > > > I am using solr to index text extracted from word documents, and it is
> > > > working really well.
> > > > Recently i started noticing that some documents are not indexed, that is
> > > > i know that the word foobar is in a document, but when i search for
> > > > foobar the id of that document is not returned.
> > > > I suspect that this has to do with the size of the document, and that
> > > > documents with a lot of text are not being indexed.
> > > > Please advise.
> > > >
> > > > thanks,
> > > > fmardini
> > > >
> > >
> >
>
