Re: Indexing large documents

Peter Manis Mon, 20 Aug 2007 04:09:58 -0700

That should show some errors if something goes wrong; if not, the
console usually will.  The errors will look like Java stack trace
output.  Did increasing the heap do anything for you?  Changing mine
to a 256 MB max worked fine for all of our files.
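
If it helps, here is a rough sketch for checking what maximum heap a JVM
actually ends up with.  The class name HeapCheck is made up for
illustration, and I am assuming the JVM is started with the standard -Xmx
option (for example "java -Xmx256m HeapCheck"); how you pass that option
to the JVM running solr depends on your servlet container.

```java
public class HeapCheck {
    // Returns the maximum heap the JVM will attempt to use, in whole megabytes.
    // Runtime.maxMemory() reports bytes; the value reflects the -Xmx setting
    // (or the JVM default when no -Xmx is given).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Hypothetical usage: java -Xmx256m HeapCheck
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

The reported number can come out a little below the -Xmx value, since
some JVMs leave part of the heap out of the figure they report.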


On 8/20/07, Fouad Mardini <[EMAIL PROTECTED]> wrote:
> Well, I am using the java textmining library to extract text from documents,
> then i do a post to solr
> I do not have an error log, i only have *.request.log files in the logs
> directory
>
> Thanks
>
> On 8/20/07, Peter Manis <[EMAIL PROTECTED]> wrote:
> >
> > Fouad,
> >
> > I would check the error log or console for any possible errors first.
> > They may not show up, it really depends on how you are processing the
> > word document (custom solr, feeding the text to it, etc).  We are
> > using a custom version of solr with PDF, DOC, XLS, etc text extraction
> > and I have successfully indexed 40mb documents.  I did have indexing
> > problems with a large document or two and simply increasing the heap
> > size fixed the problem.
> >
> > - Pete
> >
> > On 8/20/07, Fouad Mardini <[EMAIL PROTECTED]> wrote:
> > > Hello,
> > >
> > > I am using solr to index text extracted from word documents, and it is
> > > working really well.
> > > Recently i started noticing that some documents are not indexed, that is i
> > > know that the word foobar is in a document, but when i search for foobar the
> > > id of that document is not returned.
> > > I suspect that this has to do with the size of the document, and that
> > > documents with a lot of text are not being indexed.
> > > Please advise.
> > >
> > > thanks,
> > > fmardini
> > >
> >
>
