Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "GoogleSummerOfCode/SitemapCrawler/weeklyreport" page has been changed by 
CihadGuzel:
https://wiki.apache.org/nutch/GoogleSummerOfCode/SitemapCrawler/weeklyreport?action=diff&rev1=11&rev2=12

Comment:
Weekly report has been updated

  
  
  = Week : 5 (22 June 2015 - 28 June 2015) =
- ...
  
+ '''Title :''' DbUpdater is updated 
+ 
+ DbUpdaterJob is updated for sitemaps. Each detected sitemap is written to 
crawldb as a new line, and the sitemaps are then crawled in the next crawl cycle.
+ 
+ = Week : 6 & 7 (29 June 2015 - 12 July 2015) =
+ 
+ '''Title :''' Sitemap parse plugin was abandoned. 
+ 
+ The parser plugin was abandoned after consultation with the mentors. The 
parse process is embedded in the code instead of being provided as a plugin. 
The sitemap parser is activated when the "sitemap" parameter is given.
+ The midterm report was also prepared. Up to this stage, the sitemap life 
cycle has been developed according to the outline, and the sitemap crawler runs 
in a basic form. The work done so far and the remaining work have been evaluated.
+ 
+ 
+ = Week : 8 (13 July 2015 - 19 July 2015) =
+ 
+ '''Title :''' Sitemap file detection 
+ 
+ Sitemap file detection is implemented. The detection is activated according 
to the parameters given at fetch time.
+ 
+ = Week : 9 (20 July 2015 - 26 July 2015) =
+ 
+ '''Title :''' frequency & priority
+ 
+ A processSitemapParse function was created in ParseUtil. The parser process 
is updated for sitemaps. The fetch interval is updated according to the 
frequency value from the sitemap.
+ A priority field is also added to crawldb for the priority value from the sitemap.
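
The frequency-to-interval step above can be illustrated with a minimal sketch. It is not the code from the Nutch patch: the helper name `fetch_interval_for`, the table `CHANGEFREQ_SECONDS`, the 30-day month, the 365-day year, and the fallback to a default interval for "always", "never", and unknown values are all assumptions made for illustration. Only the frequency names themselves come from the sitemap protocol.

```python
# Seconds per sitemap <changefreq> value (names from the sitemap protocol).
# The month and year lengths are approximations assumed for this sketch.
CHANGEFREQ_SECONDS = {
    "hourly": 60 * 60,
    "daily": 24 * 60 * 60,
    "weekly": 7 * 24 * 60 * 60,
    "monthly": 30 * 24 * 60 * 60,
    "yearly": 365 * 24 * 60 * 60,
}


def fetch_interval_for(changefreq, default_interval):
    """Return a fetch interval in seconds for a sitemap changefreq value.

    Hypothetical helper: a missing value, "always", "never", and any
    unrecognized value fall back to default_interval (an assumption).
    """
    if changefreq is None:
        return default_interval
    key = changefreq.strip().lower()
    return CHANGEFREQ_SECONDS.get(key, default_interval)
```

A crawler following this idea would store the returned value as the fetch interval of the URL's crawldb entry, and keep its configured default when the sitemap gives no usable frequency.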
+ 
+ 
+ = Week : 10 & 11 (27 July 2015 - 9 August 2015) =
+ 
+ '''Title :''' Review & code cleaning
+ 
+ Some improvements were made following my mentor's review. Code cleanup is 
done. Sitemap score logic has not been developed, because it would affect the 
current Nutch score logic. It can be done later, depending on an evaluation 
of it.
+ 
+ = Week : 12 (10 August 2015 - 17 August 2015) =
+ 
+ '''Title :''' Testing
+ 
+ Some problems in the Nutch test classes have been fixed. Sitemap tests 
were prepared. Documents for the sitemap crawler were prepared.
+ 
+ = Week : 13 (18 August 2015 - 21 August 2015) =
+ 
+ '''Title :''' Final evaluation
+ 
+ The final document was prepared.
+ 
