Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface

2008-10-28 Thread Grant Ingersoll


On Oct 28, 2008, at 5:21 PM, markharw00d wrote:



>> It may be a good summer of code project if someone wanted to
>> implement reading and writing data via the GSA feed interface...
>
> Do you mean the Google Summer Of Code initiative? I can't imagine
> Google would be keen to support a project whose goal was to provide
> an open-source, drop-in replacement for one of their commercial
> products :)



Google doesn't vet GSOC applications...  ;-)  Heck, it's even Apache  
licensed...







>>> I can't believe this idea received no replies!
>
> I guess it's just not an itch anyone here feels a particular need to
> scratch right now - and that is always what is needed to get the ball
> rolling.
>
> Maybe another avenue is to approach the commercial providers who
> have already contributed GSA connectors and ask them to consider
> writing a Solr-based consumer endpoint based on the GSA connector
> protocol. They may be commercially incentivised to do this and can
> then claim their products can hook up to either GSA or open-source
> Solr using the same interface.


I think it may be of interest at some point.  I don't have the cycles  
at the moment, but it's definitely something I think makes sense to  
have in Solr.


Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface

2008-10-28 Thread Lukáš Vlček
Hi,
I was thinking about using the GSA Connector infrastructure with Nutch or Solr
some time ago because we were considering alternatives to MS SharePoint search
functionality, including GSA.

IMHO this is something that makes sense, and I think that open source tools
can beat production alternatives in many ways, but I can also see some
issues:

- first and most difficult: try to talk to your management about
replacing MS SharePoint or GSA with open source. This conversation can be
very difficult.

- GSA connectors are buggy... try looking through the Google Groups forums
(maybe this has gotten better by now).

- I found it very hard to rely on open source when it comes to parsing the
Microsoft documents on your network (Word, Excel, PowerPoint). It can handle
98% or 99% of your documents but not 100% (correct me if I am wrong please).
It should be possible to include some MS document server in the loop, but
this makes things more complicated and requires non-open-source
components.

Regards,
Lukas

On Tue, Oct 28, 2008 at 10:21 PM, markharw00d <[EMAIL PROTECTED]> wrote:

>> It may be a good summer of code project if someone wanted to implement
>> reading and writing data via the GSA feed interface...
>
> Do you mean the Google Summer Of Code initiative? I can't imagine Google
> would be keen to support a project whose goal was to provide an open-source,
> drop-in replacement for one of their commercial products :)
>
>>> I can't believe this idea received no replies!
>
> I guess it's just not an itch anyone here feels a particular need to
> scratch right now - and that is always what is needed to get the ball
> rolling.
>
> Maybe another avenue is to approach the commercial providers who have
> already contributed GSA connectors and ask them to consider writing a
> Solr-based consumer endpoint based on the GSA connector protocol. They may
> be commercially incentivised to do this and can then claim their products
> can hook up to either GSA or open-source Solr using the same interface.
>
> Cheers
> Mark
>
>
>
>


-- 
http://blog.lukas-vlcek.com/


Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface

2008-10-28 Thread markharw00d


> It may be a good summer of code project if someone wanted to implement
> reading and writing data via the GSA feed interface...

Do you mean the Google Summer Of Code initiative? I can't imagine Google
would be keen to support a project whose goal was to provide an
open-source, drop-in replacement for one of their commercial products :)

>> I can't believe this idea received no replies!

I guess it's just not an itch anyone here feels a particular need to
scratch right now - and that is always what is needed to get the ball
rolling.


Maybe another avenue is to approach the commercial providers who have 
already contributed GSA connectors and ask them to consider writing a 
Solr-based consumer endpoint based on the GSA connector protocol. They 
may be commercially incentivised to do this and can then claim their 
products can hook up to either GSA or open-source Solr using the same 
interface.


Cheers
Mark





Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface

2008-10-27 Thread Grant Ingersoll


On Oct 27, 2008, at 5:33 PM, Otis Gospodnetic wrote:


> Hi,
>
> I can't believe this idea received no replies!

It did:
http://lucene.markmail.org/message/hb4v3qqbgu5d6dhi?q=documentum  ;-)

> I wonder how many companies are already doing what Mark described
> here. ;)
>
> In any case, I think this is HUGE.  If you believe any of the
> enterprise search pundits, these connectors are essential to get
> Solr to be recognized as "enterprise ready".

Yeah, I think it makes sense.  The more we can do to make it easy for
Solr to get data in, the better IMO.






> ----- Original Message -----
>> From: mark harwood <[EMAIL PROTECTED]>
>> To: solr-dev@lucene.apache.org
>> Sent: Monday, October 13, 2008 8:24:40 AM
>> Subject: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by
>> emulating a Google Search Appliance's feed interface
>>
>> Google open-sourced (Apache license) the framework it uses for getting
>> content from a number of document repositories into a Google Search
>> Appliance (their hardware+software solution for enterprise search).
>>
>> My suggestion is that Solr could also make use of these connectors simply
>> by opening a port that honours the wire-protocol that is used to feed
>> content into a Google Search Appliance (architecture overview is here:
>> http://tinyurl.com/4puke8 ). You can see how connectors push data in the
>> "sendData" method in "GsaFeedConnection" in the connector manager
>> framework (source here: http://tinyurl.com/49cehd ).
>>
>> Before a connector starts pushing content it needs to be configured and
>> the Google Search Appliance admin screens are used to set this up. The GSA
>> appliance has some form of conversation with connectors to understand what
>> properties need setting and to set them but this again could be added with
>> Solr providing an equivalent admin screen driven by the same information
>> provided by connectors.
>>
>> The other form of conversation conducted is around authentication when
>> query results are about to be presented to users.
>>
>> This isn't something I have any time to work on but it seems like an
>> interesting project so I thought I'd mention it in case anyone here has
>> the time or interest in pursuing it.
>> It would open Solr up to some new environments by making use of existing
>> connectors provided by some large commercial organisations.
>>
>> Cheers,
>> Mark




--
Grant Ingersoll
Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
http://www.lucenebootcamp.com


Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ











Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface

2008-10-27 Thread Ryan McKinley

Agreed.

Perhaps we could add it to:
http://wiki.apache.org/solr/TaskList

It may be a good summer of code project if someone wanted to implement  
reading and writing data via the GSA feed interface...



On Oct 27, 2008, at 5:33 PM, Otis Gospodnetic wrote:


> Hi,
>
> I can't believe this idea received no replies!  I wonder how many
> companies are already doing what Mark described here. ;)
>
> In any case, I think this is HUGE.  If you believe any of the
> enterprise search pundits, these connectors are essential to get
> Solr to be recognized as "enterprise ready".
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



> ----- Original Message -----
>> From: mark harwood <[EMAIL PROTECTED]>
>> To: solr-dev@lucene.apache.org
>> Sent: Monday, October 13, 2008 8:24:40 AM
>> Subject: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by
>> emulating a Google Search Appliance's feed interface
>>
>> Google open-sourced (Apache license) the framework it uses for getting
>> content from a number of document repositories into a Google Search
>> Appliance (their hardware+software solution for enterprise search).
>>
>> My suggestion is that Solr could also make use of these connectors simply
>> by opening a port that honours the wire-protocol that is used to feed
>> content into a Google Search Appliance (architecture overview is here:
>> http://tinyurl.com/4puke8 ). You can see how connectors push data in the
>> "sendData" method in "GsaFeedConnection" in the connector manager
>> framework (source here: http://tinyurl.com/49cehd ).
>>
>> Before a connector starts pushing content it needs to be configured and
>> the Google Search Appliance admin screens are used to set this up. The GSA
>> appliance has some form of conversation with connectors to understand what
>> properties need setting and to set them but this again could be added with
>> Solr providing an equivalent admin screen driven by the same information
>> provided by connectors.
>>
>> The other form of conversation conducted is around authentication when
>> query results are about to be presented to users.
>>
>> This isn't something I have any time to work on but it seems like an
>> interesting project so I thought I'd mention it in case anyone here has
>> the time or interest in pursuing it.
>> It would open Solr up to some new environments by making use of existing
>> connectors provided by some large commercial organisations.
>>
>> Cheers,
>> Mark






Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface

2008-10-27 Thread Otis Gospodnetic
Hi,

I can't believe this idea received no replies!  I wonder how many companies are 
already doing what Mark described here. ;)

In any case, I think this is HUGE.  If you believe any of the enterprise search 
pundits, these connectors are essential to get Solr to be recognized as 
"enterprise ready".

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message -----
> From: mark harwood <[EMAIL PROTECTED]>
> To: solr-dev@lucene.apache.org
> Sent: Monday, October 13, 2008 8:24:40 AM
> Subject: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by
> emulating a Google Search Appliance's feed interface
>
> Google open-sourced (Apache license) the framework it uses for getting
> content from a number of document repositories into a Google Search
> Appliance (their hardware+software solution for enterprise search).
>
> My suggestion is that Solr could also make use of these connectors simply
> by opening a port that honours the wire-protocol that is used to feed
> content into a Google Search Appliance (architecture overview is here:
> http://tinyurl.com/4puke8 ). You can see how connectors push data in the
> "sendData" method in "GsaFeedConnection" in the connector manager
> framework (source here: http://tinyurl.com/49cehd ).
>
> Before a connector starts pushing content it needs to be configured and
> the Google Search Appliance admin screens are used to set this up. The GSA
> appliance has some form of conversation with connectors to understand what
> properties need setting and to set them but this again could be added with
> Solr providing an equivalent admin screen driven by the same information
> provided by connectors.
>
> The other form of conversation conducted is around authentication when
> query results are about to be presented to users.
>
> This isn't something I have any time to work on but it seems like an
> interesting project so I thought I'd mention it in case anyone here has
> the time or interest in pursuing it.
> It would open Solr up to some new environments by making use of existing
> connectors provided by some large commercial organisations.
>
> Cheers,
> Mark



Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface

2008-10-13 Thread Grant Ingersoll
Sounds like a good idea to me.  The tricky part is in testing, I  
suspect, but maybe the Google code makes it easy to mock that out.


On Oct 13, 2008, at 8:24 AM, mark harwood wrote:

Google open-sourced (Apache license) the framework it uses for  
getting content from a number of document repositories into a Google  
Search Appliance (their hardware+software solution for enterprise  
search).


My suggestion is that Solr could also make use of these connectors
simply by opening a port that honours the wire-protocol that is used
to feed content into a Google Search Appliance (architecture
overview is here: http://tinyurl.com/4puke8 ). You can see how
connectors push data in the "sendData" method in "GsaFeedConnection"
in the connector manager framework (source here:
http://tinyurl.com/49cehd ).
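
To make the suggestion above a little more concrete, here is a rough
sketch (in Python, for brevity) of the translation step such an endpoint
would need: take the XML a connector pushes and turn it into documents to
add to, or delete from, an index. The element and attribute names (gsafeed,
header, datasource, group, record, url, mimetype, action, meta) are an
assumption based on the published GSA feed format and should be checked
against the connector manager source. The sample payload, the function name
feed_to_solr_ops and the output field names are hypothetical. Real feeds may
also carry base64-encoded content, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample payload; element names are an assumption based on the
# GSA feed format and are not taken from this thread.
SAMPLE_FEED = """<gsafeed>
  <header>
    <datasource>sharepoint</datasource>
    <feedtype>incremental</feedtype>
  </header>
  <group>
    <record url="http://intranet.example.com/docs/1"
            mimetype="text/plain" action="add">
      <metadata>
        <meta name="author" content="mark"/>
      </metadata>
      <content>Hello from a connector</content>
    </record>
    <record url="http://intranet.example.com/docs/2" action="delete"/>
  </group>
</gsafeed>
"""


def feed_to_solr_ops(feed_xml):
    """Split a GSA-style feed into (documents to add, ids to delete)."""
    root = ET.fromstring(feed_xml)
    datasource = root.findtext("header/datasource", default="")
    adds, deletes = [], []
    for record in root.iter("record"):
        url = record.get("url")
        # Records without an explicit action are treated as adds here.
        if record.get("action", "add") == "delete":
            deletes.append(url)
            continue
        doc = {
            "id": url,  # use the record URL as the unique key
            "datasource": datasource,
            "mimetype": record.get("mimetype", ""),
            "content": record.findtext("content", default=""),
        }
        # Copy each <meta name=... content=...> pair into a prefixed field.
        for meta in record.iter("meta"):
            doc["meta_" + meta.get("name")] = meta.get("content")
        adds.append(doc)
    return adds, deletes


if __name__ == "__main__":
    adds, deletes = feed_to_solr_ops(SAMPLE_FEED)
    print(adds)
    print(deletes)
```

The resulting dictionaries would still need to be mapped onto a real Solr
schema and posted to an update handler. The listening port, the connector
configuration conversation and the authentication step described in the rest
of this message are left out entirely.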


Before a connector starts pushing content it needs to be configured  
and the Google Search Appliance admin screens are used to set this  
up. The GSA appliance has some form of conversation with connectors  
to understand what properties need setting and to set them but this  
again could be added with Solr providing an equivalent admin screen  
driven by the same information provided by connectors.


The other form of conversation conducted is around authentication  
when query results are about to be presented to users.



This isn't something I have any time to work on but it seems like an  
interesting project so I thought I'd mention it in case anyone here  
has the time or interest in pursuing it.
It would open Solr up to some new environments by making use of  
existing connectors provided by some large commercial organisations.


Cheers,
Mark