I am trying to crawl the following URL, which contains the special
characters '?' and '=':
http://answers.yahoo.com/dir/index;_ylt=AmQOyqS3boseCSYsZxA495Xpy6IX;_ylv=3?link=list&sid=396545327
This URL belongs to the Dining Out category of answers.yahoo.com, and I
want to crawl the URLs that fall under this subcategory. However, the URL
seems to be skipped during the crawl. I have attached my urllist.txt,
regex-urlfilter.txt and crawl-urlfilter.txt. Has anyone done a similar
kind of crawl before?
http://old.nabble.com/file/p26197881/regex-urlfilter.txt regex-urlfilter.txt 
http://old.nabble.com/file/p26197881/crawl-urlfilter.txt crawl-urlfilter.txt 
http://old.nabble.com/file/p26197881/urllist.txt urllist.txt 
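
To narrow down where the URL is dropped, here is a minimal sketch of how I understand the regex URL filter to work: rules are read top to bottom, the first pattern found anywhere in the URL decides, '+' accepts and '-' rejects. The RULES list below is an assumption modelled on the stock Nutch filter file (which, as far as I know, ships with a `-[?*!@=]` rule), not a copy of my attached files, and first_matching_rule is a hypothetical helper, not part of Nutch. Python's re module is used here as a stand-in for Java regular expressions, so corner cases may differ.

```python
import re

# Assumed rule list, modelled on the stock Nutch regex-urlfilter.txt.
# The attached files may contain different rules.
RULES = [
    "-^(file|ftp|mailto):",  # skip non-HTTP schemes
    "-[?*!@=]",              # skip URLs that look like queries
    "+.",                    # accept everything else
]

URL = ("http://answers.yahoo.com/dir/index;_ylt=AmQOyqS3boseCSYsZxA495Xpy6IX;"
       "_ylv=3?link=list&sid=396545327")


def first_matching_rule(url, rules):
    """Return the first rule whose pattern is found in url, or None.

    Hypothetical helper: the leading '+' or '-' is the verdict and the
    rest of the rule is the regular expression.
    """
    for rule in rules:
        pattern = rule[1:]
        if re.search(pattern, url):
            return rule
    return None


if __name__ == "__main__":
    # With the assumed stock rules, the query rule rejects the URL.
    print(first_matching_rule(URL, RULES))
    # Without that rule, the catch-all accept rule is reached.
    relaxed = [r for r in RULES if r != "-[?*!@=]"]
    print(first_matching_rule(URL, relaxed))
```

If the attached filter files do contain a rule like `-[?*!@=]`, the URL above would be rejected by it because of the '?' and '=' characters, and removing or narrowing that rule in whichever filter file the crawl actually reads would be the first thing to try.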
-- 
View this message in context: 
http://old.nabble.com/How-to-fetch-URLs-with-special-charaters-%27-%27---%27%3D%27-tp26197881p26197881.html
Sent from the Nutch - User mailing list archive at Nabble.com.
