> Hi,
> I am new to Nutch search; I have been working with it for the past
> month. Can anyone tell me what is meant by vertical search? Can
> anyone suggest how I can do it?

Vertical search [http://en.wikipedia.org/wiki/Vertical_search] is basically "categorized" search. You search within "verticals", e.g. car sales, jobs, vacation rentals, etc. The best vertical search engine (in my opinion) is http://www.vast.com

As for how to do it: it is not easy.

One way is to use the API offered by vast.com (http://www.vast.com/info/stealThisSite). The general idea is that vast performs the crawling and classification, and you get their results via the API. For example, http://www.rentalio.com/ is a site that uses data from vast.com (via the API) and offers results and search without having to crawl and categorize the Internet itself.

To make your own vertical search engine, you have to build a "categorizer" that recognises the content of each crawled page and extracts data from it. There are many ways to build a categorizer, from "rule based" ones, where you have to write special rules for every site you crawl, to fully automated ones based on the Bayes (or some similar) algorithm.

Links that might help:

http://en.wikipedia.org/wiki/Vertical_search [wiki entry on vertical search]
http://www.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html [naïve Bayes implementation - easy way to classify text]
http://www.vast.com [vertical search engine with free API]
http://en.wikipedia.org/wiki/Bayes%27s_theorem [wiki entry on Bayes' theorem]
http://ai.ijs.si/Mezi/pedagosko/markuslang_seminar.doc [naïve Bayes implementation explanation]

Hope this helps

Bogdan Kecman

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
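
To make the Bayes-based categorizer idea from the reply above a bit more concrete, here is a minimal sketch in Python. It is not Nutch code and not how vast.com works; it only illustrates the technique (multinomial naive Bayes with add-one smoothing). The class name NaiveBayesCategorizer, the tokenize helper, the category labels and the training sentences are all made up for illustration, and a real categorizer would be trained on text extracted from many crawled pages.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Hypothetical helper class for illustration only.
    """

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, category):
        words = tokenize(text)
        self.doc_counts[category] += 1
        self.word_counts[category].update(words)
        self.vocabulary.update(words)

    def scores(self, text):
        """Return log(prior * likelihood) for each known category."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            log_prob = math.log(n_docs / total_docs)
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                log_prob += math.log(
                    (counts[word] + 1) / (total_words + vocab_size)
                )
            result[category] = log_prob
        return result

    def classify(self, text):
        """Return the category with the highest score."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Made-up training data standing in for text extracted from crawled pages.
categorizer = NaiveBayesCategorizer()
categorizer.train("used sedan for sale low mileage new tires", "cars")
categorizer.train("diesel truck for sale one owner", "cars")
categorizer.train("software engineer position full time salary", "jobs")
categorizer.train("hiring java developer apply with resume", "jobs")
categorizer.train("beach house rental sleeps six weekly rate", "rentals")
categorizer.train("mountain cabin vacation rental with hot tub", "rentals")

print(categorizer.classify("truck with low mileage for sale"))  # cars
print(categorizer.classify("developer position, send resume"))  # jobs
print(categorizer.classify("cabin rental near the beach"))      # rentals
```

Classifying the page only tells you which vertical it belongs to. Extracting the actual fields (price, location, and so on) is a separate step, and that is where the per-site rules or a more elaborate model come in.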
