[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection
[ https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625774#comment-16625774 ]

Aaron LaBella commented on SOLR-12789:
--

Hi Alexandre, thank you for the additional detail and background. While I understand the goal here, I don't agree with how the end result was achieved. I think the real "issue" here was that the examples and documentation are stale. Likewise, UIMA core can (and should) be upgraded to the latest 2.10.2, and the additional unnecessary dependencies should absolutely be removed from the dist. I'm attaching a simple patch (*SOLR-12789-4.patch*) that does just this.

I would like to propose that we reinstate the contrib/uima project and apply my patch instead. I think this is a fair compromise, since 6 Java classes don't quite qualify as "dead weight", especially if those 6 classes provide direct end-user value. While I would certainly agree that UIMA has a steep learning curve, there are folks out there who are using it, and removing it entirely from the Solr dist is likely to do a disservice to those folks who are in fact doing text analytics with it.

All that being said, I think the only thing that really remains is a clean-up of the documentation and examples. I'm happy to do that over the next couple of weeks if we agree to this strategy. Thanks so much.

> UIMA enhancements to allow for dynamic AE detection
> ---
>
> Key: SOLR-12789
> URL: https://issues.apache.org/jira/browse/SOLR-12789
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - UIMA
> Affects Versions: 6.0
> Reporter: Aaron LaBella
> Priority: Major
> Labels: ready-to-commit
> Attachments: SOLR-12789-1.patch, SOLR-12789-2.patch, SOLR-12789-3.patch, SOLR-12789-4.patch
>
>
> I've been sitting on this patch for over 2 years (and likewise it's been running IN production for the same) ... finally got around to contributing it back to the community. This change prepares the UIMAUpdateRequestProcessor to allow subclasses to have additional control over how the analysis engine is selected. In my case, I wrote a sub-class that allows for *dynamic* detection of the UIMA analysis engine based on the document fields, i.e. a field in the document can be used to select different UIMA configurations and rules.
>
> Can someone please commit this as soon as possible? I don't necessarily need it to be back-ported; having it in 7.4.1 would suffice.
> Thanks!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
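For readers following the thread, the "dynamic AE detection" idea in the issue description can be sketched roughly as follows. This is only an illustration and is not taken from any of the attached patches; it deliberately avoids the Solr and UIMA APIs. The class name `DynamicEngineSelector`, the field name `doc_type`, and the descriptor paths are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: choose an analysis engine descriptor per document,
 * based on the value of one document field. Not the patch's actual code.
 */
final class DynamicEngineSelector {
    private final String selectorField;                   // field that drives the choice
    private final Map<String, String> descriptorsByValue; // field value -> descriptor path
    private final String defaultDescriptor;               // used when nothing matches

    DynamicEngineSelector(String selectorField,
                          Map<String, String> descriptorsByValue,
                          String defaultDescriptor) {
        this.selectorField = selectorField;
        this.descriptorsByValue = new LinkedHashMap<>(descriptorsByValue);
        this.defaultDescriptor = defaultDescriptor;
    }

    /** Returns the descriptor for this document, or the default as a fallback. */
    String select(Map<String, ?> document) {
        Object value = document.get(selectorField);
        if (value == null) {
            return defaultDescriptor; // field absent: fall back to the default engine
        }
        return descriptorsByValue.getOrDefault(value.toString(), defaultDescriptor);
    }

    public static void main(String[] args) {
        Map<String, String> descriptors = new LinkedHashMap<>();
        descriptors.put("legal", "/uima/LegalAE.xml");     // assumed paths
        descriptors.put("medical", "/uima/MedicalAE.xml");
        DynamicEngineSelector selector =
            new DynamicEngineSelector("doc_type", descriptors, "/uima/DefaultAE.xml");

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("doc_type", "legal");
        System.out.println(selector.select(doc));
    }
}
```

In a real update processor, the selected descriptor would presumably be used to look up or lazily create the corresponding analysis engine; how the patch does that is not shown in this thread.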
[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection
[ https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623756#comment-16623756 ]

Alexandre Rafalovitch commented on SOLR-12789:
--

Hi Aaron,

It is great to hear that there will be a healthy discussion about this. Please feel free to share the outcome on the developer list; it may spark further developer discussion too. Nothing is set in stone, given enough evidence to the contrary.

Still, just to re-summarize, the issue we were facing was that all the shipped examples were dead (Alchemy API...) and over multiple issues we could not figure out a way to get to the new local maximum of latest version and useful examples (UIMA has a bit of a learning curve). Nor were we able to find anybody to help us push the discussion forward within either the development community (Jira discussions) or the user community (Solr Users mailing list).

Additionally, we are trying to slim Solr down in general and have done several things towards that, including removing Javadoc from the distribution. If you were more closely connected to the community, you would see several of these drives all pointing in the same general direction. So, having a dead weight that we could not figure out what to do with over several years was very much "not cool" for all those users downloading Solr and trying to navigate their way through a very full-featured product.

And then, of course, there is the fact that we now incorporate Apache OpenNLP as well. So, there are trade-offs to keep in mind.

> UIMA enhancements to allow for dynamic AE detection
> ---
>
> Key: SOLR-12789
> URL: https://issues.apache.org/jira/browse/SOLR-12789
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - UIMA
> Affects Versions: 6.0
> Reporter: Aaron LaBella
> Priority: Major
> Labels: ready-to-commit
> Attachments: SOLR-12789-1.patch, SOLR-12789-2.patch, SOLR-12789-3.patch
>
>
> I've been sitting on this patch for over 2 years (and likewise it's been running IN production for the same) ... finally got around to contributing it back to the community. This change prepares the UIMAUpdateRequestProcessor to allow subclasses to have additional control over how the analysis engine is selected. In my case, I wrote a sub-class that allows for *dynamic* detection of the UIMA analysis engine based on the document fields, i.e. a field in the document can be used to select different UIMA configurations and rules.
>
> Can someone please commit this as soon as possible? I don't necessarily need it to be back-ported; having it in 7.4.1 would suffice.
> Thanks!
[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection
[ https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623732#comment-16623732 ]

Aaron LaBella commented on SOLR-12789:
--

Thanks Alexandre ... I added comments in SOLR-11694 (as you already saw). I'll need to discuss with my IBM colleagues and see what we want to do going forward. Like I said, this is a feature we've built production applications on top of, and removing it is just plain "not cool". There were other ways it could've been cleaned up and maintained without removing it.

As a whole, I'm really disappointed in this decision from the Solr community, as I have never heard of pulling out a function that works absolutely fine. I would say the reason it's not being "maintained" is that there's nothing wrong with the current feature. If anything, it could be upgraded to a later UIMAJ, but even that is easily worked around. I really see this as a big step backwards for Solr with regard to text analytics.

> UIMA enhancements to allow for dynamic AE detection
> ---
>
> Key: SOLR-12789
> URL: https://issues.apache.org/jira/browse/SOLR-12789
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - UIMA
> Affects Versions: 6.0
> Reporter: Aaron LaBella
> Priority: Major
> Labels: ready-to-commit
> Attachments: SOLR-12789-1.patch, SOLR-12789-2.patch, SOLR-12789-3.patch
>
>
> I've been sitting on this patch for over 2 years (and likewise it's been running IN production for the same) ... finally got around to contributing it back to the community. This change prepares the UIMAUpdateRequestProcessor to allow subclasses to have additional control over how the analysis engine is selected. In my case, I wrote a sub-class that allows for *dynamic* detection of the UIMA analysis engine based on the document fields, i.e. a field in the document can be used to select different UIMA configurations and rules.
>
> Can someone please commit this as soon as possible? I don't necessarily need it to be back-ported; having it in 7.4.1 would suffice.
> Thanks!
[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection
[ https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622329#comment-16622329 ]

Alexandre Rafalovitch commented on SOLR-12789:
--

UIMA has been removed as of Solr 7.5, as it was incredibly out of date and the UIMA architecture itself has changed significantly. See SOLR-11694.

I am not quite sure there is a viable next step for this, especially not as a blocker for a version that is no longer supported beyond security fixes.

> UIMA enhancements to allow for dynamic AE detection
> ---
>
> Key: SOLR-12789
> URL: https://issues.apache.org/jira/browse/SOLR-12789
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - UIMA
> Affects Versions: 6.0
> Reporter: Aaron LaBella
> Priority: Blocker
> Labels: ready-to-commit
> Attachments: SOLR-12789-1.patch, SOLR-12789-2.patch, SOLR-12789-3.patch
>
>
> I've been sitting on this patch for over 2 years (and likewise it's been running IN production for the same) ... finally got around to contributing it back to the community. This change prepares the UIMAUpdateRequestProcessor to allow subclasses to have additional control over how the analysis engine is selected. In my case, I wrote a sub-class that allows for *dynamic* detection of the UIMA analysis engine based on the document fields, i.e. a field in the document can be used to select different UIMA configurations and rules.
>
> Can someone please commit this as soon as possible? I don't necessarily need it to be back-ported; having it in 7.4.1 would suffice.
> Thanks!
[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection
[ https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622309#comment-16622309 ]

Aaron LaBella commented on SOLR-12789:
--

There is also a patch in here that adds proper support for mapping UIMA feature values that are arrays to multivalued fields.

> UIMA enhancements to allow for dynamic AE detection
> ---
>
> Key: SOLR-12789
> URL: https://issues.apache.org/jira/browse/SOLR-12789
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - UIMA
> Affects Versions: 6.0
> Reporter: Aaron LaBella
> Priority: Blocker
> Labels: ready-to-commit
> Attachments: SOLR-12789-1.patch, SOLR-12789-2.patch, SOLR-12789-3.patch
>
>
> I've been sitting on this patch for 2 years (and likewise it's been running in production for the same) ... finally got around to contributing it back to the community. This change prepares the UIMAUpdateRequestProcessor to allow subclasses to have additional control over how the analysis engine is selected. In my case, I wrote a sub-class that allows for *dynamic* detection of the UIMA analysis engine based on the document fields, i.e. a field in the document can be used to select different UIMA configurations and rules.
>
> Can someone please commit this as soon as possible?
> Thanks!
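The array-to-multivalued mapping mentioned in the comment above might look something like the sketch below. It is only an assumption about the general shape of such a mapping, not the code from the attached patch. It uses plain Java collections in place of the real UIMA feature and Solr document types, and the class name `MultiValuedFieldMapper` and field name `entities` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: copy a feature value into a document field, expanding
 * arrays and collections into one field value per element.
 */
final class MultiValuedFieldMapper {

    /** The document is modeled as field name -> list of values (a multivalued field). */
    static void addFeatureValue(Map<String, List<Object>> doc, String field, Object featureValue) {
        if (featureValue == null) {
            return; // nothing to index
        }
        List<Object> values = doc.computeIfAbsent(field, k -> new ArrayList<>());
        if (featureValue instanceof Object[]) {
            // Array-valued feature: one field value per non-null element.
            for (Object element : (Object[]) featureValue) {
                if (element != null) {
                    values.add(element);
                }
            }
        } else if (featureValue instanceof Collection) {
            for (Object element : (Collection<?>) featureValue) {
                if (element != null) {
                    values.add(element);
                }
            }
        } else {
            // Scalar feature (primitive arrays are not expanded in this sketch).
            values.add(featureValue);
        }
    }

    public static void main(String[] args) {
        Map<String, List<Object>> doc = new LinkedHashMap<>();
        addFeatureValue(doc, "entities", new String[] {"Solr", "UIMA"});
        addFeatureValue(doc, "entities", "Lucene");
        System.out.println(doc);
    }
}
```

The point of the design is that the target field accumulates values across calls, so a scalar feature and an array feature can both feed the same multivalued field.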