Jason Polites wrote:

I think everyone agrees that this would be a very neat application of open source technology like Lucene... however (opens drawer, pulls out devil's advocate hat, places on head)... there are several complexities here not addressed by Lucene (et al.). Not because Lucene isn't damn fantastic, just because it's not its job.

One of the big ones is security. Enterprise search is no good if it doesn't match up with the authentication and authorization paradigms existing in the organisation. How useful is it to return a whole bunch of search results for documents to which you don't have access? Not to mention the issues around whether you are even authorized to know it exists.

I was gonna mention this - you beat me to the punch. I suspect that LDAP/JNDI integration is a start, but you need hooks for an arbitrary auth plugin. And once we address this it might be the case that a user has to *log in* to the search server. We have Verity where I work and this is all the case, along w/ the fact that a sale seems to involve mandatory consulting work (not that that's bad, but if you're trying to ship a shrink-wrapped search engine in a box then this is an issue).
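
To make the "hooks for an arbitrary auth plugin" idea concrete, here is a minimal sketch, not anything Lucene ships. The names `AccessChecker`, `AclAccessChecker` and `SecureResults` are hypothetical, and the in-memory ACL map is only a stand-in for whatever LDAP/JNDI or other backend a real plugin would consult. It assumes the search layer hands back a ranked list of document ids and drops the ones the user may not see.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical plugin hook; this interface is an assumption, not a Lucene API.
interface AccessChecker {
    // True if the user is allowed to learn that the document exists.
    boolean canSee(String user, String docId);
}

// Toy plugin backed by an in-memory ACL (docId -> users allowed to read it).
// A real plugin might ask a directory server instead.
class AclAccessChecker implements AccessChecker {
    private final Map<String, Set<String>> readersByDoc;

    AclAccessChecker(Map<String, Set<String>> readersByDoc) {
        this.readersByDoc = readersByDoc;
    }

    @Override
    public boolean canSee(String user, String docId) {
        Set<String> readers = readersByDoc.get(docId);
        // Deny by default: a document with no ACL entry is hidden.
        return readers != null && readers.contains(user);
    }
}

class SecureResults {
    // Keeps only the hits the user may see, preserving the ranking order.
    static List<String> filter(String user, List<String> rankedDocIds,
                               AccessChecker checker) {
        List<String> visible = new ArrayList<>();
        for (String docId : rankedDocIds) {
            if (checker.canSee(user, docId)) {
                visible.add(docId);
            }
        }
        return visible;
    }
}
```

Filtering after the search is the simplest place to put the hook, but it means hit counts and paging have to be computed on the filtered list, otherwise the totals leak the existence of hidden documents.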



The other prickly one is file types. It's all well and good to index HTML, XML and text but when you start looking at PDF, MS Office (OLE docs, PSTs, Outlook MSG files, MS Project files etc), Lotus Notes databases etc etc, things begin to look less simple and far less elegant than a nice clean Lucene rackmount. Sure there are great projects like Apache POI but they still have a bit of a way to go before they mature to a point of really solving these problems. After which time Microsoft will probably be rolling out Longhorn and everyone may need to start from scratch.
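
One way to keep the file-type mess away from the indexing code is a registry of per-format text extractors. The sketch below is only an illustration under assumptions: `TextExtractor` and `ExtractorRegistry` are hypothetical names, extractors are looked up by file extension (a real system might sniff content instead), and the actual PDF or Office parsing would live behind the interface in whatever library does the work.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical extension point: turns the raw bytes of one format into plain text.
interface TextExtractor {
    String extract(byte[] rawBytes);
}

// Maps file extensions to extractors so new formats can be plugged in
// without touching the code that feeds text to the indexer.
class ExtractorRegistry {
    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    // Returns the extractor for the file's extension, or empty if the
    // format is unknown or the name has no extension.
    Optional<TextExtractor> forFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(byExtension.get(ext));
    }
}
```

A crawler would call `forFile` for each file it finds and skip (or log) anything that comes back empty, so unsupported formats fail quietly instead of breaking the index run.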

You'd also need http://jcifs.samba.org/ so you can spider Windows file shares.


This is not to say that it's not a great idea, but as with most great ideas the challenge is not the formation of the idea, but its implementation.

Indeed.


I think a great first step would be to start developing good, reliable, open source extensions to Lucene which strive to solve some of these issues.


end rant.

----- Original Message -----
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Friday, January 28, 2005 12:40 PM
Subject: Re: rackmount lucene/nutch - Re: google mini? who needs it when Lucene is there



I discuss this with myself a lot.... inside my head... :)
Seriously, I agree with Erik.  I think this is a business opportunity.
How many people are hating me now and going "shhhhhh"?  Raise your
hands!

Otis

--- David Spencer <[EMAIL PROTECTED]> wrote:

This reminds me, has anyone ever discussed something similar:

- rackmount server ( or for coolness factor, that mini mac)
- web i/f for config/control

- of course the server would have the following s/w:
-- web server
-- lucene / nutch

Part of the work here I think is having a decent web i/f to configure
the thing and to customize the L&F of the search results.



jian chen wrote:
> Hi,
>
> I was searching using Google and just found that there was a new
> feature called "google mini". Initially I thought it was another free
> service for small companies. Then I realized that it costs quite some
> money ($4,995) for the hardware and software. (I guess the proprietary
> software costs a whole lot more than the actual hardware.)
>
> The "nice" feature is that you can only index up to 50,000 documents
> at this price. If you need to index more, sorry, send in the
> check...
>
> It seems to me that any small biz will be ripped off if they install
> this google mini thing, compared to using Lucene to implement
> easy-to-use search software, which could search up to whatever number
> of documents you could imagine.
>
> I hope the Lucene project could get more exposure in the enterprise so
> that people know that they have not only cheaper but, more importantly,
> BETTER alternatives.
>
> Jian
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>

