Hi Jim,
On Thu, Apr 26, 2012 at 2:23 PM, Jim Chandler wrote:
> I am in the
> process of trying to change a plugin from an IndexingFilter to a Parser.
Personally I wouldn't do this; I would pick up an existing parser and
adapt it into a new one. Do you have any specific reasons for
doing this
Hello, I'm using Nutch with Solr and I need to know, for each type of
document crawled by Nutch (pdf, docx, ppt), which fields are recognized in
each document. I know that the Tika parser is in charge of parsing the
documents found during the crawl process, but I need to know for all
documents supported
Hi,
We sometimes see the generator run out of memory (OOM). This happens when
either the topN value is too high or there are too many segments to
generate. In either case, a very large number of records are generated
with the same (lowest) score and end up in a single reducer. We limit the
generator by dom
Greetings,
Nutch, Solr, Lucene and everything else are very new to me. I am in the
process of trying to change a plugin from an IndexingFilter to a Parser. I
am having difficulty understanding where in the Nutch process each one
of these is run. I've been searching Google to see if I could fi
Hi Pravin,
I won't have time until the weekend to get around to this.
I'll try my best though when the time comes around.
On Tue, Apr 24, 2012 at 4:19 PM, Pravin Agrawal wrote:
> Hi Lewis, thanks for the reply. Sorry I couldn't get back to you soon as I
> was on vacation.
>
> I tried out t