First of all, thanks for your reply.

I am really confused, pardon me.
I don't know whether I need to put the given code into a new class in the
nutch directory.
Do I have to import other classes or packages? Is there anything else I need
to take care of?

I have tried creating a new, separate class in the nutch directory, but it gives
lots of errors about packages and classes not being found. I am still trying
to figure out what is wrong there.

Secondly, how will I be able to read the URLs from the crawldb once the class
is running? I have no idea how to work that out.

How can I write the URLs out in an XML format, i.e.
<url>
    <loc>http://www.example.com/</loc>
</url>
<url>
    <loc>http://www.example1.com/</loc>
</url>
...
Could you please explain how I should do this?
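
To make the target format concrete, here is a minimal sketch in Java of the
output step only. It assumes the URLs have already been read into a list of
strings; how to get them out of the crawldb is still the open question. The
class name UrlXmlWriter and its two methods are made up for illustration and
are not part of Nutch or Hadoop.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a Nutch class: formats URLs as <url><loc>..</loc></url>.
public class UrlXmlWriter {

    // Escape the five characters that are special in XML text,
    // so that URLs containing '&' still give well-formed output.
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build one <url> element per input URL, in input order.
    static String toXml(List<String> urls) {
        StringBuilder sb = new StringBuilder();
        for (String url : urls) {
            sb.append("<url>\n");
            sb.append("    <loc>").append(escapeXml(url)).append("</loc>\n");
            sb.append("</url>\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder input; in practice this list would come from the crawldb.
        List<String> urls = Arrays.asList(
                "http://www.example.com/",
                "http://www.example1.com/");
        System.out.print(toXml(urls));
    }
}
```

The escaping is there because crawled URLs often contain '&' in the query
string, which would otherwise break the XML.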

Thanks a lot for your time.

Cheers,
Cha

Enis Soztutar wrote:
> 
> cha wrote:
>> Thanks Enis,
>>
>> I am getting some idea from that.
>> Can you tell me in which class I should implement it?
>> I don't have Hadoop installed on my box.
>>
>>   
> Just make a new class in nutch and put the code there : ) As long as 
> you have the hadoop jar in your classpath, you do not need to check out the 
> hadoop codebase.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/extracting-urls-into-text-files-tf3409030.html#a9568050
Sent from the Nutch - User mailing list archive at Nabble.com.

