I plan to make lucene (and nutch) a key element in an intranet solution, but I only know about lucene what I've read in the last couple of days. Here's what I'd like opinions about.
I would like to build a single point of access to data on intranet web pages and LAN shared documents. I've looked into the file:// vs. http:// issue and it seems that it doesn't make sense securitywise to use file:// so I'd make the shares accessible through a web server and *then* index them. However, if I leverage the web inbuilt lucene search, I'd build two "indices" (please pardon the lack of knowledge of lucene terminology). Can I search over two indices? Do I have to merge them? How would I do the former or the latter? Is there a good explanation somewhere how to set up incremental indexing, rather than e.g. building the whole index over nightly? Will lucene produce usable results, knowing that there'll be no rich interlinking between a major part of the content (shared docs) to base ranking upon? I know lucene uses a number of criteria to rank documents upon: I'm asking about real-world experiences and subjective grading quality impressions. Does anyone have similar experiences with intranet enterprise data searching? Cheers, Tomi --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]