Hello, I am using nutch-1.2 and have encountered a problem. The site is written
with Lotus Domino. When I open it in the browser and click on the links that
appear, the site URL does not change, unlike some sites whose URLs have a lot
of suffixes. The web site is buptoa.bupt.edu.cn /
Hi
If it is a text file, then you can simply associate the extension with the text
parser. But if I understand you correctly and it's a Lotus DB file, then I
suspect you have no other choice than to implement your own parser. I haven't
heard of Lotus file support in Nutch.
Best Regards
Alexander Aristov
Absolutely...
There is a short (old) thread on this topic here [1]. From what I can see,
this issue has not been addressed, so it looks like implementing your own
parser plugin is what's required.
[1]
http://www.lucidimagination.com/search/document/a8d53fac1caa578c/nutch_with_nsf_files