Hi Jörn,

2010/9/22 Jörn Kottmann <[email protected]>
> The Solr people told me at last years ApacheCon that
> they are interested in a collaboration with UIMA.
>
> Here is an interesting thread about how the integration
> could be done:
> http://www.mail-archive.com/[email protected]/msg24007.html
>
> I think the author is right that posting to a Solr server from the end
> of a analysis pipeline might be the best option to integrate UIMA,
> it will also work nicely with UIMA AS where analysis pipeline can
> run on a number of different machines.
>
> Here is the link to Tommasos post at the Solr ML:
> http://lucene.472066.n3.nabble.com/Solr-UIMA-integration-td1528253.html
>
> Jörn
>

I see very interesting points and suggestions, and it's good to see interest in where Solr and UIMA could meet.

Let me say that in my opinion there are two feasible perspectives on integration: the first from the UIMA point of view and the second from the Solr point of view.

From the UIMA perspective, Solr is a sort of storage platform (let's call it Use Case 1). In terms of communication between the two systems, it's UIMA "asking" Solr to consume one or more CASes (probably by defining a Solr CAS Consumer, as suggested in Jörn's link), and I agree it would also fit very well with UIMA AS.

From the Solr point of view (Use Case 2), a UIMA system can be queried to process a document while Solr is consuming it for indexing (in an update processor chain), so it's Solr asking UIMA to "enrich" its document. Maybe UIMA AS would fit well here too; I would put this point on the TODO list :-)

My proposal addresses this second scenario (Solr making a request to UIMA), but I think it would be great to have both of these use cases implemented, and I would be happy to work on UC1 too (UIMA making an indexing request to Solr via a CAS Consumer).

2010/9/23 Vaijanath Rao <[email protected]>

> Though I am not an expert on either of the two components UIMA/SOLR, I
> would like to to assist you with the integration of UIMA with SOLR.

Thanks Vaijanath, any hint from you, whether ideas or code, will be more than welcome.

I created an issue in the Solr Jira for the UC2 integration proposal [1] and I am preparing the patch to attach, so if you have any objections or suggestions, again, it would be nice to hear them.

Regards,
Tommaso

[1] : https://issues.apache.org/jira/browse/SOLR-2129

p.s.: as a side note, since it may not be a main concern at this time, I think the UC2 integration fits better as a Solr module, whereas UC1 (a SolrCASConsumer) would fit better as a project inside the UIMA Sandbox.
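
As a rough illustration of the two directions of control discussed in this message, here is a minimal sketch. It is not code from the SOLR-2129 patch and it does not use the real UIMA or Solr APIs: `toy_analysis_engine`, `ToyIndex`, `uc1_cas_consumer`, `uc2_update_processor` and `uc2_index` are all hypothetical stand-ins, and the "analysis" is just a capitalised-word tagger. The only point is who calls whom: in UC1 the analysis side pushes results to the index, while in UC2 the indexing side calls out to analysis before storing the document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical stand-ins, not real UIMA or Solr APIs.


@dataclass
class Annotation:
    type: str  # annotation type, e.g. "Entity"
    text: str  # covered text


def toy_analysis_engine(text: str) -> List[Annotation]:
    """Stand-in for a UIMA pipeline: tags capitalised words as 'Entity'."""
    return [
        Annotation("Entity", word.strip(".,"))
        for word in text.split()
        if word[:1].isupper()
    ]


@dataclass
class ToyIndex:
    """Stand-in for a Solr server: it only collects the documents it is given."""
    docs: List[Dict[str, object]] = field(default_factory=list)

    def add(self, doc: Dict[str, object]) -> None:
        self.docs.append(doc)


Analyse = Callable[[str], List[Annotation]]
Processor = Callable[[Dict[str, object]], Dict[str, object]]


def uc1_cas_consumer(text: str, analyse: Analyse, index: ToyIndex) -> None:
    """UC1: the analysis side drives and posts its results to the index."""
    annotations = analyse(text)
    index.add({"text": text, "entities": [a.text for a in annotations]})


def uc2_update_processor(doc: Dict[str, object], analyse: Analyse,
                         field_name: str = "text") -> Dict[str, object]:
    """UC2: one step of an update chain asks analysis to enrich the document."""
    enriched = dict(doc)  # copy so the caller's document is not mutated
    enriched["entities"] = [a.text for a in analyse(str(doc[field_name]))]
    return enriched


def uc2_index(doc: Dict[str, object], processors: List[Processor],
              index: ToyIndex) -> None:
    """UC2: the indexing side drives, running each processor before storing."""
    for process in processors:
        doc = process(doc)
    index.add(doc)
```

Both paths end with the same kind of enriched document in the index; they differ in which system initiates the request, which is what drives the question of whether the code belongs in a Solr module or in the UIMA Sandbox.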
