Jian, I disagree that the Google Mini is useless. $5000 is quite inexpensive for a commercial search engine. I know of search engines where the cost is practically 20 cents per document. Heck, a decent server capable of running a heavily loaded search engine costs $3000. Also, don't forget you get two years of hardware replacements in case of failure and software updates for that $5000.
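For a rough sense of scale, here is a back-of-the-envelope sketch in Python of the per-document arithmetic. It uses only figures quoted in this thread (the $5000 price, the 50,000 document limit, and the 20 cents per document cited for other engines). The function name `cost_per_document` is made up for illustration, and the sketch assumes the appliance is filled to its limit, which a small site may not do.

```python
def cost_per_document(price_usd: float, max_documents: int) -> float:
    """Return the purchase price divided by the number of documents indexed."""
    if max_documents <= 0:
        raise ValueError("max_documents must be positive")
    return price_usd / max_documents


# Figures quoted in this thread; nothing here is measured.
MINI_PRICE_USD = 5000.0
MINI_DOC_LIMIT = 50_000
OTHER_ENGINE_COST_PER_DOC = 0.20  # "practically 20 cents per document"

# Per-document cost of the Mini if all 50,000 slots are used.
mini_cost = cost_per_document(MINI_PRICE_USD, MINI_DOC_LIMIT)

# What the same 50,000 documents would cost at 20 cents each.
other_total = OTHER_ENGINE_COST_PER_DOC * MINI_DOC_LIMIT

print(f"Google Mini at full capacity: ${mini_cost:.2f} per document")
print(f"50,000 documents at $0.20 each: ${other_total:,.0f}")
```

At the full 50,000 documents this works out to 10 cents per document. A site indexing fewer documents pays proportionally more per document, so the comparison is only a rough one.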
Lucene is a great concept, a great effort, and a great indexer. But it doesn't have a built-in spider; you need to provide one yourself, use add-on code, or use a system like Nutch, which, according to http://www.nutch.org/docs/en/developers.html, only parses HTML. You can't process PDFs, MS Office-type docs, and other formats without still more code. At this point, for a small company, probably with an overworked IT staff, Lucene would be just too time-consuming to use.

The other thing to consider is how the search engine will be used. Is it for a company's public website? Is it to index every document that the company produces? Will it index databases? Email? Lotus Notes apps? For a simple website search engine, Google Mini and other appliances are hard to beat.

As for the 50,000 document limit: 50,000 documents is a lot for a small company website.

Cheers,
bill

-----Original Message-----
From: jian chen [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 27, 2005 11:06 PM
To: Lucene Users List
Subject: Re: google mini? who needs it when Lucene is there

Overall, even if google mini gives a lot of cool features compared to a
bare-bones lucene project, what good is it with the 50,000 document limit?
It is useless with that limit. That is just their way of trying to turn it
into another cash cow.

Jian

On Thu, 27 Jan 2005 17:45:03 -0800 (PST), Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> 500 times the original data? Not true! :)
>
> Otis
>
> --- "Xiaohong Yang (Sharon)" <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I agree that Google mini is quite expensive. It might be similar to
> > the desktop version in quality. Does anyone know google's ratio of
> > index to text? Is it true that Lucene's index is about 500 times the
> > original text size (not including image size)? I don't have one
> > installed, so I cannot measure.
> >
> > Best,
> >
> > Sharon
> >
> > jian chen <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I was searching using google and just found that there was a new
> > feature called "google mini". Initially I thought it was another
> > free service for small companies. Then I realized that it costs
> > quite some money ($4,995) for the hardware and software. (I guess
> > the proprietary software costs a whole lot more than the actual
> > hardware.)
> >
> > The "nice" feature is that you can only index up to 50,000
> > documents at this price. If you need to index more, sorry, send in
> > the check...
> >
> > It seems to me that any small biz will be ripped off if they install
> > this google mini thing, compared to using Lucene to implement
> > easy-to-use search software, which could search up to whatever
> > number of documents you could imagine.
> >
> > I hope the lucene project could get more exposure in the enterprise
> > so that people know that they have not only cheaper but, more
> > importantly, BETTER alternatives.
> >
> > Jian
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]