Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
On Oct 28, 2008, at 5:21 PM, markharw00d wrote:

>> It may be a good summer of code project if someone wanted to implement reading and writing data via the GSA feed interface...
>
> Do you mean the Google Summer Of Code initiative? I can't imagine Google would be keen to support a project whose goal was to provide an open-source, drop-in replacement for one of their commercial products :)

Google doesn't vet GSOC applications... ;-) Heck, it's even Apache licensed...

>>> I can't believe this idea received no replies!
>
> I guess it's just not an itch anyone here feels a particular need to scratch right now - and that is always what is needed to get the ball rolling.
>
> Maybe another avenue is to approach the commercial providers who have already contributed GSA connectors and ask them to consider writing a Solr-based consumer endpoint based on the GSA connector protocol. They may be commercially incentivised to do this and can then claim their products can hook up to either GSA or open-source Solr using the same interface.

I think it may be of interest at some point. I don't have the cycles at the moment, but it's definitely something I think makes sense to have in Solr.
Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
Hi,

I was thinking about using the GSA Connector infrastructure with Nutch or Solr some time ago, because we were considering alternatives to MS SharePoint search functionality, including GSA.

IMHO this is something that makes sense, and I think that open source tools can beat production alternatives in many ways, but I can also see some issues:

- First and most difficult: try talking to your management about replacing MS SharePoint or GSA with open source. This conversation can be very difficult.
- GSA connectors are buggy... try browsing through the Google group forums (maybe this has got better by now).
- I found it very hard to rely on open source when it comes to parsing the Microsoft documents on your network (Word, Excel, PowerPoint). It can handle 98% or 99% of your documents but not 100% (correct me if I am wrong, please). It should be possible to include some MS document server in the loop, but this makes things more complicated and requires non-open-source components.

Regards,
Lukas

On Tue, Oct 28, 2008 at 10:21 PM, markharw00d <[EMAIL PROTECTED]> wrote:

>> It may be a good summer of code project if someone wanted to implement reading and writing data via the GSA feed interface...
>
> Do you mean the Google Summer Of Code initiative? I can't imagine Google would be keen to support a project whose goal was to provide an open-source, drop-in replacement for one of their commercial products :)
>
>>> I can't believe this idea received no replies!
>
> I guess it's just not an itch anyone here feels a particular need to scratch right now - and that is always what is needed to get the ball rolling.
>
> Maybe another avenue is to approach the commercial providers who have already contributed GSA connectors and ask them to consider writing a Solr-based consumer endpoint based on the GSA connector protocol. They may be commercially incentivised to do this and can then claim their products can hook up to either GSA or open-source Solr using the same interface.
>
> Cheers
> Mark

-- http://blog.lukas-vlcek.com/
Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
> It may be a good summer of code project if someone wanted to implement reading and writing data via the GSA feed interface...

Do you mean the Google Summer Of Code initiative? I can't imagine Google would be keen to support a project whose goal was to provide an open-source, drop-in replacement for one of their commercial products :)

>> I can't believe this idea received no replies!

I guess it's just not an itch anyone here feels a particular need to scratch right now - and that is always what is needed to get the ball rolling.

Maybe another avenue is to approach the commercial providers who have already contributed GSA connectors and ask them to consider writing a Solr-based consumer endpoint based on the GSA connector protocol. They may be commercially incentivised to do this and can then claim their products can hook up to either GSA or open-source Solr using the same interface.

Cheers
Mark
Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
On Oct 27, 2008, at 5:33 PM, Otis Gospodnetic wrote:

> Hi,
>
> I can't believe this idea received no replies!

It did: http://lucene.markmail.org/message/hb4v3qqbgu5d6dhi?q=documentum ;-)

> I wonder how many companies are already doing what Mark described here. ;)
>
> In any case, I think this is HUGE. If you believe any of the enterprise search pundits, these connectors are essential to get Solr to be recognized as "enterprise ready".

Yeah, I think it makes sense. The more we can do to make it easy for Solr to get data in, the better IMO.

> ----- Original Message -----
>> From: mark harwood <[EMAIL PROTECTED]>
>> To: solr-dev@lucene.apache.org
>> Sent: Monday, October 13, 2008 8:24:40 AM
>> Subject: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
>>
>> Google open-sourced (Apache license) the framework it uses for getting content from a number of document repositories into a Google Search Appliance (their hardware+software solution for enterprise search).
>>
>> My suggestion is that Solr could also make use of these connectors simply by opening a port that honours the wire-protocol that is used to feed content into a Google Search Appliance (architecture overview is here: http://tinyurl.com/4puke8 ). You can see how connectors push data in the "sendData" method in "GsaFeedConnection" in the connector manager framework (source here: http://tinyurl.com/49cehd ).
>>
>> Before a connector starts pushing content it needs to be configured and the Google Search Appliance admin screens are used to set this up. The GSA appliance has some form of conversation with connectors to understand what properties need setting and to set them but this again could be added with Solr providing an equivalent admin screen driven by the same information provided by connectors.
>>
>> The other form of conversation conducted is around authentication when query results are about to be presented to users.
>>
>> This isn't something I have any time to work on but it seems like an interesting project so I thought I'd mention it in case anyone here has the time or interest in pursuing it. It would open Solr up to some new environments by making use of existing connectors provided by some large commercial organisations.
>>
>> Cheers,
>> Mark

--
Grant Ingersoll

Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
http://www.lucenebootcamp.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
Agreed. Perhaps we could add it to: http://wiki.apache.org/solr/TaskList

It may be a good summer of code project if someone wanted to implement reading and writing data via the GSA feed interface...

On Oct 27, 2008, at 5:33 PM, Otis Gospodnetic wrote:

> Hi,
>
> I can't believe this idea received no replies! I wonder how many companies are already doing what Mark described here. ;)
>
> In any case, I think this is HUGE. If you believe any of the enterprise search pundits, these connectors are essential to get Solr to be recognized as "enterprise ready".
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message -----
>> From: mark harwood <[EMAIL PROTECTED]>
>> To: solr-dev@lucene.apache.org
>> Sent: Monday, October 13, 2008 8:24:40 AM
>> Subject: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
>>
>> Google open-sourced (Apache license) the framework it uses for getting content from a number of document repositories into a Google Search Appliance (their hardware+software solution for enterprise search).
>>
>> My suggestion is that Solr could also make use of these connectors simply by opening a port that honours the wire-protocol that is used to feed content into a Google Search Appliance (architecture overview is here: http://tinyurl.com/4puke8 ). You can see how connectors push data in the "sendData" method in "GsaFeedConnection" in the connector manager framework (source here: http://tinyurl.com/49cehd ).
>>
>> Before a connector starts pushing content it needs to be configured and the Google Search Appliance admin screens are used to set this up. The GSA appliance has some form of conversation with connectors to understand what properties need setting and to set them but this again could be added with Solr providing an equivalent admin screen driven by the same information provided by connectors.
>>
>> The other form of conversation conducted is around authentication when query results are about to be presented to users.
>>
>> This isn't something I have any time to work on but it seems like an interesting project so I thought I'd mention it in case anyone here has the time or interest in pursuing it. It would open Solr up to some new environments by making use of existing connectors provided by some large commercial organisations.
>>
>> Cheers,
>> Mark
Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
Hi,

I can't believe this idea received no replies! I wonder how many companies are already doing what Mark described here. ;)

In any case, I think this is HUGE. If you believe any of the enterprise search pundits, these connectors are essential to get Solr to be recognized as "enterprise ready".

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
> From: mark harwood <[EMAIL PROTECTED]>
> To: solr-dev@lucene.apache.org
> Sent: Monday, October 13, 2008 8:24:40 AM
> Subject: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
>
> Google open-sourced (Apache license) the framework it uses for getting content from a number of document repositories into a Google Search Appliance (their hardware+software solution for enterprise search).
>
> My suggestion is that Solr could also make use of these connectors simply by opening a port that honours the wire-protocol that is used to feed content into a Google Search Appliance (architecture overview is here: http://tinyurl.com/4puke8 ). You can see how connectors push data in the "sendData" method in "GsaFeedConnection" in the connector manager framework (source here: http://tinyurl.com/49cehd ).
>
> Before a connector starts pushing content it needs to be configured and the Google Search Appliance admin screens are used to set this up. The GSA appliance has some form of conversation with connectors to understand what properties need setting and to set them but this again could be added with Solr providing an equivalent admin screen driven by the same information provided by connectors.
>
> The other form of conversation conducted is around authentication when query results are about to be presented to users.
>
> This isn't something I have any time to work on but it seems like an interesting project so I thought I'd mention it in case anyone here has the time or interest in pursuing it. It would open Solr up to some new environments by making use of existing connectors provided by some large commercial organisations.
>
> Cheers,
> Mark
Re: Idea: Add Documentum/Sharepoint/FileNet etc connectivity by emulating a Google Search Appliance's feed interface
Sounds like a good idea to me. The tricky part is in testing, I suspect, but maybe the Google code makes it easy to mock that out.

On Oct 13, 2008, at 8:24 AM, mark harwood wrote:

> Google open-sourced (Apache license) the framework it uses for getting content from a number of document repositories into a Google Search Appliance (their hardware+software solution for enterprise search).
>
> My suggestion is that Solr could also make use of these connectors simply by opening a port that honours the wire-protocol that is used to feed content into a Google Search Appliance (architecture overview is here: http://tinyurl.com/4puke8 ). You can see how connectors push data in the "sendData" method in "GsaFeedConnection" in the connector manager framework (source here: http://tinyurl.com/49cehd ).
>
> Before a connector starts pushing content it needs to be configured and the Google Search Appliance admin screens are used to set this up. The GSA appliance has some form of conversation with connectors to understand what properties need setting and to set them but this again could be added with Solr providing an equivalent admin screen driven by the same information provided by connectors.
>
> The other form of conversation conducted is around authentication when query results are about to be presented to users.
>
> This isn't something I have any time to work on but it seems like an interesting project so I thought I'd mention it in case anyone here has the time or interest in pursuing it. It would open Solr up to some new environments by making use of existing connectors provided by some large commercial organisations.
>
> Cheers,
> Mark
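
To make the "port that honours the wire-protocol" idea a little more concrete, here is a rough sketch of what the receiving side might do with a feed once it has arrived. It is only an illustration under stated assumptions, not the connector manager's actual behaviour: the element and attribute names (gsafeed, header, datasource, record, meta, content, encoding="base64binary") are recalled from Google's published feed format rather than taken from this thread, and the function name feed_to_solr_docs and the field names id, mimetype, datasource and content are hypothetical choices for a Solr-like document. The HTTP listener and the actual call into Solr are left out.

```python
import base64
import xml.etree.ElementTree as ET


def feed_to_solr_docs(feed_xml):
    """Turn a GSA-style XML feed into a list of flat dicts, one per record.

    Hypothetical helper: the output field names are illustrative only.
    """
    root = ET.fromstring(feed_xml)
    # Assumed layout: <gsafeed><header><datasource>...</datasource></header>
    datasource = root.findtext("header/datasource", default="")
    docs = []
    for record in root.iter("record"):
        doc = {
            "id": record.get("url"),  # use the record URL as the unique key
            "mimetype": record.get("mimetype", ""),
            "datasource": datasource,
        }
        # Assumed mapping: each <meta name=... content=...> becomes a field.
        for meta in record.iter("meta"):
            doc[meta.get("name")] = meta.get("content")
        content = record.find("content")
        if content is not None and content.text:
            text = content.text
            # Assumed convention: binary payloads are base64-encoded.
            if content.get("encoding") == "base64binary":
                text = base64.b64decode(text).decode("utf-8", errors="replace")
            doc["content"] = text
        docs.append(doc)
    return docs


# Made-up sample feed for illustration; not captured from a real connector.
SAMPLE_FEED = """<gsafeed>
  <header>
    <datasource>sharepoint</datasource>
    <feedtype>incremental</feedtype>
  </header>
  <group>
    <record url="http://example.com/doc1" mimetype="text/plain">
      <metadata>
        <meta name="author" content="mark"/>
      </metadata>
      <content>Hello from a connector</content>
    </record>
    <record url="http://example.com/doc2" mimetype="text/plain">
      <content encoding="base64binary">SGVsbG8=</content>
    </record>
  </group>
</gsafeed>"""

if __name__ == "__main__":
    for doc in feed_to_solr_docs(SAMPLE_FEED):
        print(doc)
```

A real endpoint would also have to accept the HTTP request the connector sends and cover the configuration and authentication conversations mentioned in the original message; none of that is attempted here.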