Hi,

I would appreciate your help with this. I want to prevent some pages
from being indexed during the crawl if they don't meet specific
criteria.

I can see the URLs of the pages that fail the criteria in the Hadoop log,
but those URLs and all their contents are still indexed. So what I want
is to delete those URLs and their contents from the index, for example
using the Luke tool.
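For reference, one approach I have been looking at is Nutch's URL
filtering, so that the pages never get crawled or indexed in the first
place. A minimal regex-urlfilter.txt fragment might look like the
following (the host and path pattern below are just placeholders for my
real criteria, not my actual configuration):

```
# Exclude URLs under a hypothetical /private/ path
-^http://www\.example\.com/private/

# Accept everything else
+.
```

As I understand it, lines starting with '-' reject matching URLs and
lines starting with '+' accept them, with the first matching rule
winning. This only helps for future crawls, though; it doesn't remove
documents already in the index, which is the second part of my problem.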

Can anybody help me resolve this issue?


Ratnesh V2Solutions India
-- 
View this message in context: 
http://www.nabble.com/How-to-prevent-a-page-from-being-index-during-crawl-or-after-crawl---tf3505149.html#a9788975
Sent from the Nutch - User mailing list archive at Nabble.com.

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general