Yes, this mechanism will work. The parse data of each segment also
contains the parsed outgoing links of every page (URL), so you can
rebuild the complete web graph from the segments.
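
To illustrate the idea only, here is a minimal sketch in Python. It is not Nutch code and does not use the Nutch API: the `segments` structure (a list of dicts mapping a page URL to its outgoing links) and the function name `rebuild_link_graph` are assumptions made up for this example, standing in for the parse data stored in each segment.

```python
from collections import defaultdict


def rebuild_link_graph(segments):
    """Rebuild outlink and inlink maps from per-segment parse data.

    `segments` is a hypothetical stand-in for Nutch segments: each
    element maps a fetched page URL to the list of outgoing links
    that were extracted when the page was parsed.
    """
    outlinks = defaultdict(set)
    inlinks = defaultdict(set)
    for segment in segments:
        for url, links in segment.items():
            # Touch the entry so pages without links are still recorded.
            outlinks[url]
            for target in links:
                outlinks[url].add(target)
                inlinks[target].add(url)
    return outlinks, inlinks


# Two made-up segments; the URLs are placeholders.
segments = [
    {"http://a.example/": ["http://b.example/"]},
    {"http://b.example/": ["http://a.example/", "http://c.example/"]},
]
outlinks, inlinks = rebuild_link_graph(segments)
```

Because every segment carries the outlinks of the pages it fetched, walking all segments once is enough to recover both directions of the link graph, which is roughly what an update from segments has to do.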
On 27.01.2006 at 02:03, Sunnyvale Fl wrote:
I am a bit confused as to recreating the WebDB from segments (using
Nutch 0.7.1). Let's say I have 20 segments, and I think my WebDB is
corrupted, so I create a new db using nutch admin db -create. Then I
do updatedb from the 20 segments. How does updatedb create the link
structure from segments? I am under the impression that segments
contain the actual page content and metatags around a page, but not
the link structure. No?
Thanks!
---------------------------------------------------------------
company: http://www.media-style.com
forum: http://www.text-mining.org
blog: http://www.find23.net