A couple of people mentioned that you can do this because
Nutch is open source, but I'd like to play devil's advocate
and say that #3 is difficult in practice.
Little tweaks, like boosting words in the title or URL, are
pretty easy. Changing the main crawling and/or searching
algorithm, however, requires lots of changes to core code.
Once you change it, it will be difficult to merge future
Nutch changes into your code.
You can definitely do it, though. You should just know
what you're getting into.
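To give a feel for what a "little tweak" like title/URL boosting amounts to, here is a toy sketch in Java. It is not Nutch code and uses no Nutch or Lucene API; the class name, the boost values, and the simple term-counting score are all assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy illustration of field boosting. NOT Nutch code: every name and
// weight here is hypothetical.
class FieldBoostSketch {
    // Assumed weights: a match in the title counts more than one in the
    // URL, which counts more than one in the body text.
    static final double TITLE_BOOST = 2.0;
    static final double URL_BOOST = 1.5;
    static final double BODY_BOOST = 1.0;

    // Count the tokens in text that equal term, after lowercasing and
    // splitting on anything that is not a letter or digit.
    static int countMatches(String text, String term) {
        int count = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // Sum the boosted match counts over all query terms.
    static double score(List<String> queryTerms, String title, String url, String body) {
        double total = 0.0;
        for (String raw : queryTerms) {
            String term = raw.toLowerCase(Locale.ROOT);
            total += TITLE_BOOST * countMatches(title, term);
            total += URL_BOOST * countMatches(url, term);
            total += BODY_BOOST * countMatches(body, term);
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("nutch");
        // Same word, different field: the title match scores higher.
        System.out.println(score(query, "Nutch tutorial", "http://example.org/docs", "An introduction"));
        System.out.println(score(query, "Tutorial", "http://example.org/docs", "Nutch is a crawler"));
    }
}
```

Changing a weight like this is a small, local edit. Replacing the crawling or ranking logic itself is a different kind of change, which is the point above.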
Howie
Dear nutchers,
This is the first time I have asked the Nutch users a question.
I am a researcher working on web retrieval, and I would like to know
whether I can use Nutch for the following:
1- Can I make Nutch start from seed URLs obtained through the Google
API?
2- Can I see the algorithms that do the crawling and match queries to
search results?
3- Can I modify these algorithms and replace them with my own
algorithms?
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general