Hey utsavi, my requirements are almost the same as yours. I am still working on the same issue and will definitely let you know once I have solved it.
Keep in touch.

Cheers,
cha

utsavi wrote:
> Hi,
>
> I am a newbie to Nutch. I have just been able to run the Nutch tutorials.
> My requirement is that I want to crawl only .htm files from my intranet,
> and the crawl should ignore session IDs.
>
> After that I want to put all the crawled URLs in an XML file. I want to write
> the URLs into XML using the sitemap format, which will later be submitted to Google.
>
> Is there any way I can achieve this? If yes, please provide me the solution ASAP.
>
> Please help me.
>
> cheers,
> utsavi

--
View this message in context: http://www.nabble.com/writing-urls-to-xml-files-tf3427891.html#a9569244
Sent from the Nutch - User mailing list archive at Nabble.com.
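
For what it's worth, here is a rough sketch of the post-processing step utsavi describes (keep only .htm URLs, drop session IDs, write a sitemap). It is not a Nutch plugin and does not use any Nutch API: it assumes the crawled URLs have already been exported to a plain list by some other means. The function names, the set of session parameter names, and the sample URLs are all made up for illustration, and the namespace is the one from the sitemaps.org protocol.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Assumed session parameter names; adjust these for the intranet in question.
SESSION_PARAMS = {"jsessionid", "phpsessid", "sessionid", "sid"}


def strip_session_id(url):
    """Return the URL without path-style or query-style session IDs."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    # Path-style IDs, e.g. /page.htm;jsessionid=ABC123
    path = re.sub(r";jsessionid=[^/?#;]*", "", path, flags=re.IGNORECASE)
    # Query-style IDs, e.g. ?sid=9&x=1
    kept = [
        (key, value)
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))


def is_htm(url):
    """True only when the URL path ends in .htm (so .html is excluded)."""
    return urlsplit(url).path.lower().endswith(".htm")


def build_sitemap(urls):
    """Build a sitemap XML string from an iterable of raw URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for raw in urls:
        clean = strip_session_id(raw.strip())
        # Skip non-.htm pages and duplicates left after removing session IDs.
        if not is_htm(clean) or clean in seen:
            continue
        seen.add(clean)
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = clean
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    sample = [
        "http://intranet/a.htm;jsessionid=ABC123",
        "http://intranet/a.htm",
        "http://intranet/b.html",
        "http://intranet/c.htm?sid=9&x=1",
    ]
    print(build_sitemap(sample))
```

Stripping the session ID before de-duplicating matters here, since the same page reached under two sessions would otherwise be listed twice. If you would rather filter during the crawl itself, Nutch's URL filter and normalizer configuration is probably the better place for it; the sketch above only covers cleaning up a URL list afterwards.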
