Grant, I'm not a Java developer but a sysadmin, and I've been struggling for a couple of months now to build a full web search engine stack based on Hadoop + Nutch + Solr.
I don't know much about the documentation for developers, so I trust you if you say it's good. What I do know is that I found good docs for the very first steps (installing, and performing simple "single-run" crawling and indexing with near-default configurations), but I'm now facing a great lack of information about features that are useful in production scenarios.

Now I need to dig deeper into how data is managed, into the workflow, and into all the features that are needed in the real world, and I find that the documentation is scarce, rather confusing, incomplete, often quite old, and spread across too many disconnected pieces. Stumbling around on the Net, I discovered that I'm not alone.

I'm having a lot of difficulty understanding and implementing even quite basic features such as consistent incremental recrawling/reindexing, adding custom fields, data parsing, duplicate detection, automatic removal of old indexed documents based on insertion date, and so on. What I would like is a more coherent set of use cases suited to real-world scenarios (as starting points) and an in-depth explanation of exactly how the data flows from the crawler into the complex structure of Solr and how it is handled by the different components of the stack.

(Nutch documentation is probably even worse, but this is not the right place to complain about that.)

S

----------------------------------
"Anyone proposing to run Windows on servers should be prepared to explain what they know about servers that Google, Yahoo, and Amazon don't."
Paul Graham

"A mathematician is a device for turning coffee into theorems."
Paul Erdos (who obviously never met a sysadmin)

----- Original message -----
> From: Grant Ingersoll <gsing...@apache.org>
> To: solr-user@lucene.apache.org
> Sent: Wed, 24 February 2010, 18:54:32
> Subject: Re: If you could have one feature in Solr...
>
> On Feb 24, 2010, at 11:08 AM, Stefano Cherchi wrote:
>
> > Decent documentation.
>
> What parts do you feel are lacking? Or is it just across the board? Wikis are
> both good and bad for documentation, IMO.
>
> -Grant