Hi dev@,
I've documented my Tez journey so far at
https://cwiki.apache.org/confluence/display/NUTCH/Running+Nutch+on+Tez
Things are getting quite interesting.
Please share any experiences using Nutch on Tez or improvements to the
documentation, especially any experiments you can document.
Thank y
Hi Markus,
Thanks for chiming in :)
My responses below
On 2020/12/21 21:32:08, Markus Jelsma wrote:
> Hello Lewis,
>
> 1. counters, for me they are a requirement to have as they are key to regular
> inspections of ongoing crawls, finding errors and debugging. I hope you can
> find a work arou
Hello Lewis,
1. counters, for me they are a requirement to have as they are key to regular
inspections of ongoing crawls, finding errors and debugging. I hope you can
find a workaround.
2. Sounds interesting, but I'd like to see the test run with 12M rather than
12k URLs.
A question, are the
Hi dev@,
Short update here. I've documented my initial observations running Nutch on Tez
at https://s.apache.org/viee3
Specific early findings are as follows:
1. Counters don't appear to work... which makes sense as all existing counters
are manifested using the MapReduce framework. I'm not sure if