Manu Konchady's book on building search applications is out:
Konchady, Manu. 2008. Building Search Applications: Lucene, LingPipe, and Gate. Mustru Publishing.

It's available from Amazon:

http://www.amazon.com/Building-Search-Applications-Lucene-Lingpipe/dp/0615204252/

The book's a gentle introduction to enterprise and web search, focusing on the three tools in the title (disclaimer: I wrote most of LingPipe). The target audience is Java programmers who are new to search and text analytics. As such, it provides step-by-step code-based explanations in the mold of a Manning "in Action" book. I read the draft, have a copy of the book right here, and can vouch for its technical accuracy (disclaimer 2: I didn't actually run the code). It's based on Lucene 2.3.

Here's a chapter-by-chapter overview of topics: After (1) a brief discussion of application issues, the chapters include (2) tokenization in all three frameworks, (3) indexing with Lucene, (4) searching with Lucene, (5) sentence extraction, part-of-speech tagging, interesting/significant phrase extraction, and entity extraction with LingPipe and Gate, (6) clustering with LingPipe, (7) topic and language classification with LingPipe, (8) enterprise and web search, page rank/authority calculation, and crawling with Nutch, (9) tracking news, sentiment analysis with LingPipe, detecting offensive content and plagiarism, and finally, (10) future directions including vertical search, tag-based search, and question-answering.

That may sound like a whole lot of ground to cover in 400 pages, but Konchady pulls the reader along by illustrating everything with working code and not getting bogged down in technicalities. There are pointers to theory, and a bit of math where necessary, but the book never loses sight of its goal of providing a practical introduction. In that way, it's like the Manning "in Action" series.

About the author: Manu Konchady has a home page/blog on Amazon:

http://www.amazon.com/gp/blog/A2TWRNMTU6T9TW/ref=cm_blog_dp_artist_blog

- Bob Carpenter
  Alias-i

PS: If you want a theory book on roughly the same selection of topics, check this out:

Manning, Raghavan and Schuetze. 2008. Introduction to Information Retrieval. Cambridge University Press.

It's due out in July; the content will remain free online in PDF form:

http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html
