Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "GoogleSummerOfCode/SitemapCrawler/weeklyreport" page has been changed by 
CihadGuzel:
https://wiki.apache.org/nutch/GoogleSummerOfCode/SitemapCrawler/weeklyreport?action=diff&rev1=4&rev2=5

Comment:
Added week2 report.

  
  || '''Week :''' 2 (1 June 2015 - 7 June 2015) ||
  
- '''Title :''' 
+ '''Title :''' Sitemap detection is done. 
  
+ The robots.txt file is checked while the fetcher job runs. If the robots.txt 
file contains any sitemap URLs, they are written to the database. A column 
called sitemap (stm) has been added to the db schema. The URLs in the stm 
column will be parsed in a later step.
- ----
- Example:
  
+ 
- || '''Week :''' 3 (8 June 2015 - 14 June 2015) ||
+ || '''Week :''' 3 (8 June 2015 - 21 June 2015) ||
  
  '''Title :''' 
  
  ----
  Example:
  
- || '''Week :''' 4 (15 June 2015 - 21 June 2015) ||
+ || '''Week :''' 4 (22 June 2015 - 28 June 2015) ||
  
  '''Title :''' 
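
The week 2 entry describes reading sitemap URLs out of robots.txt during fetching and storing them in a sitemap (stm) column. The following is a minimal sketch of that idea, not the project's actual code: Nutch itself is written in Java, and the function name `extract_sitemap_urls`, the sample robots.txt content, and the `row` dictionary standing in for a database row are all assumptions made for illustration.

```python
from typing import List


def extract_sitemap_urls(robots_txt: str) -> List[str]:
    """Return the URLs from 'Sitemap:' lines in robots.txt content, in order."""
    urls: List[str] = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # Split only on the first colon so the URL's own "://" stays intact.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            url = value.strip()
            if url and url not in urls:
                urls.append(url)
    return urls


# Hypothetical robots.txt content, for illustration only.
sample = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "Sitemap: https://example.com/sitemap.xml\n"
    "sitemap: https://example.com/news.xml # news section\n"
)

# A plain dict stands in for the database row; "stm" mirrors the column name
# mentioned in the report.
row = {"stm": extract_sitemap_urls(sample)}
```

The field name is matched case-insensitively and duplicates are skipped, so the stored list holds each sitemap URL once, ready to be parsed in a later step.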
  
