Re: nutch slide in lucene presentation

2007-03-24 Thread Sami Siren
Yonik Seeley wrote:
 I'd appreciate any suggestions on what to present for Nutch... here is
 my current slide:
 

Some minor suggestions (addition of some jargon or buzzwords):

Nutch
* Highly customizable open source web search application
* Crawlers
* Link-graph database
* Document parsers (HTML, Word, PDF, etc.)
* Language + charset detection, MicroFormats Rel-Tag,
  OpenSearch 1.0, Creative Commons
* Utilizes Hadoop (DFS + MapReduce) for massive scalability while
  maintaining good performance on a single server

Also, it would be nice to mention the availability of 0.9.0 (if it is
out by then ;)

--
 Sami Siren


Re: Issues pending before 0.9 release

2007-03-24 Thread Andrzej Bialecki

Sami Siren wrote:



Let's make it the best release ever! :)


I have a good feeling about this one. There's some nice marketing
material about crawling efficiency [1]. I should probably extend
benchmarking to indexing and searching too.

[1] http://blog.foofactory.fi/2007/03/twice-speed-half-size.html


Yes, I saw this - great stuff :)


--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com