You'll probably have to resolve each of those URLs yourself to find out what they point at. They're URL shorteners, and they likely won't play nicely with a crawler because of the redirection.
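One workaround, just as a sketch, is to expand the short links up front and feed the resolved targets to the crawler instead. Assuming the seeds live in a file such as urls/seed.txt (hypothetical name) and curl is available:

    # Follow redirects on each short URL with curl and print the final
    # target, building a new seed file of fully expanded URLs.
    while read -r url; do
      curl -s -o /dev/null -L -w '%{url_effective}\n' "$url"
    done < urls/seed.txt > urls/expanded.txt

Pointing the crawl at urls/expanded.txt afterwards lets the fetcher get the real pages directly, with no redirect hops.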
Adam

Sent from my iPhone

On Jan 26, 2011, at 8:02 AM, Arjun Kumar Reddy <charjunkumar.re...@iiitb.net> wrote:

> Hi list,
>
> I have given the following set of URLs:
>
> http://is.gd/Jt32Cf
> http://is.gd/hS3lEJ
> http://is.gd/Jy1Im3
> http://is.gd/QoJ8xy
> http://is.gd/e4ct89
> http://is.gd/WAOVmd
> http://is.gd/lhkA69
> http://is.gd/3OilLD
> ..... 43 such urls
>
> And I have run the crawl command: bin/nutch crawl urls/ -dir crawl -depth 3
>
> arjun@arjun-ninjas:~/nutch$ bin/nutch readdb crawl/crawldb -stats
> CrawlDb statistics start: crawl/crawldb
> Statistics for CrawlDb: crawl/crawldb
> TOTAL urls: 43
> retry 0: 43
> min score: 1.0
> avg score: 1.0
> max score: 1.0
> status 3 (db_gone): 1
> status 4 (db_redir_temp): 1
> status 5 (db_redir_perm): 41
> CrawlDb statistics: done
>
> When I try to read the content from the segments, the content block is empty for every record.
>
> Can you please tell me where I can get the content of these URLs?
>
> Thanks and regards,
> Arjun Kumar Reddy
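A note on why the content blocks come back empty: the stats above show 41 of the 43 seeds ending in status db_redir_perm, meaning the fetcher recorded a redirect rather than a page body, so there is nothing in the segment content for those records. In Nutch 1.x the http.redirect.max property (in conf/nutch-site.xml) defaults to 0, which queues redirect targets for a later fetch round instead of following them immediately; raising it makes the fetcher follow redirects in place. Once a round has actually fetched pages, a rough sketch for inspecting what a segment holds (the segment name varies per run, and segdump is an illustrative output path):

    # Pick the most recent segment and dump its records, including
    # fetched content, to a plain-text file for inspection.
    s=$(ls -d crawl/segments/* | tail -1)
    bin/nutch readseg -dump "$s" segdump
    less segdump/dump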