Thanks for your valuable comment on subcollection, but I still have some issues:

1. Does enabling subcollection in nutch-site.xml take effect at crawl time? Is it possible to apply it directly on the index (that is, at search time)?
2. In your message, can you explain the comment "subcollection also includes a query plugin"?
I followed the steps you mentioned, but when I run a query like

subcollection:<name of subcollection> <word for search>

I still get 0 hits. Can you explain subcollection in more depth? Our aim is to search within specific URLs. Is there any other way to do this besides subcollection?

Enis Soztutar wrote:
>
> prashant_nutch wrote:
>> Is subcollection useful for searching specific URLs?
>> How do we activate subcollection at indexing and searching time?
>>
>> In conf/subcollection, if we include our URLs in the whitelist, can we
>> then search only on those URLs?
>> Command for searching on a subcollection:
>>
>> Subcollection :< Name of subcollection> < word for specific URL>
>>
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <subcollections>
>>   <subcollection>
>>     <name>nutch</name>
>>     <id>nutch</id>
>>     <whitelist>
>>       http://lucene.apache.org/nutch/
>>       http://wiki.apache.org/nutch/
>>     </whitelist>
>>     <blacklist />
>>   </subcollection>
>> </subcollections>
>>
>> Can anybody explain how the overall thing should work?
>> Is it useful for searching specific URLs? (We are using Nutch 0.8.1.)
>>
>
> Subcollection is a very useful way to group a set of URLs and then
> assign a label to them. You can use it to limit searching to certain
> URLs.
>
> You should first enable subcollection in the nutch-site.xml file.
> Then you should add collections to the conf/subcollection.xml file.
> After indexing, the documents with the matched URLs should have the
> subcollection field in the index.
> After that, since subcollection also includes a query plugin, you can do
> searches like
>
> java subcollection:nutch
>
> to limit the search to the nutch collection. You can consult the readme
> file in the plugin's directory.
>

--
View this message in context: http://www.nabble.com/Help-on-Activation-of-Subcollection-at-Indexing---searching-tf3490590.html#a9752653
Sent from the Nutch - User mailing list archive at Nabble.com.
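
For reference, the first step in the quoted reply (enabling subcollection in nutch-site.xml) might look roughly like the fragment below. This is only a sketch: the `plugin.includes` property is the usual place where Nutch plugins are switched on, but every plugin name in the list other than `subcollection` is an assumed 0.8.x default and should be replaced with whatever your own nutch-default.xml lists.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: sketch only. Every entry except "subcollection"
     is an assumed default for Nutch 0.8.x; keep the list from your own
     nutch-default.xml and just add "subcollection" to it. -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|subcollection|summary-basic|scoring-opic</value>
  </property>
</configuration>
```

As the quoted reply notes, the subcollection field is added to documents at indexing, so an index built before the plugin was enabled would presumably have to be rebuilt before a query such as `java subcollection:nutch` can return hits.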
