date:20120426

Re: Changing from Indexing Filter

2012-04-26 Thread Lewis John Mcgibbney

Hi Jim,

On Thu, Apr 26, 2012 at 2:23 PM, Jim Chandler wrote:
> I am in the process of trying to change a plugin from an IndexingFilter to a Parser.

Personally I wouldn't do this, I would pick up an existing parser and edit it into another parser! Do you have any specific reasons for doing this

fields foreach document

2012-04-26 Thread Ing. Eyeris Rodriguez Rueda

Hello, I'm using Nutch with Solr and I need to know, for each type of document crawled by Nutch (pdf, docx, ppt), which fields are recognized in each document. I know that the Tika parser is in charge of parsing the documents found during the crawl process, but I need to know for all documents supported

Generator OOM

2012-04-26 Thread Markus Jelsma

Hi, We sometimes see the generator running OOM. This happens because we have either too high a topN value or too many segments to generate. In any case, a very large number of records are generated with the same (lowest) score and end up in a single reducer. We limit the generator by dom

Changing from Indexing Filter

2012-04-26 Thread Jim Chandler

Greetings, Nutch, Solr, Lucene and everything else are very new to me. I am in the process of trying to change a plugin from an IndexingFilter to a Parser. I am having difficulty understanding where in the Nutch process each one of these is run. I've been searching Google to see if I could fi

Re: Question related to NUCTH 1044 redirected URLS and invalid scores

2012-04-26 Thread Lewis John Mcgibbney

Hi Pravin,

I won't have time until the weekend to get around to this. I'll try my best though when the time comes around.

On Tue, Apr 24, 2012 at 4:19 PM, Pravin Agrawal wrote:
> Hi Lewis, thanks for the reply. Sorry I couldn't get back to you soon as I was on vacation.
>
> I tried out t

5 matches
