[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection

2018-09-24 Thread Aaron LaBella (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625774#comment-16625774
 ] 

Aaron LaBella commented on SOLR-12789:
--

Hi Alexandre, thank you for the additional detail and background.  While I 
understand the goal here, I don't agree with how the end result was achieved.  
I think the real "issue" here was that the examples and documentation are 
stale.  Likewise, UIMA core can (and should) be upgraded to the latest 2.10.2, 
and the unnecessary additional dependencies should absolutely be removed from 
the dist.  I'm attaching a simple patch (*SOLR-12789-4.patch*) that does just 
this.  I would like to propose that we reinstate the contrib/uima project and 
apply my patch instead.  I think this is a fair compromise, since 6 Java classes 
don't quite qualify as "dead weight", especially if those 6 classes 
provide direct end-user value.  While I would certainly agree that UIMA has a 
steep learning curve, there are folks out there who are using it, and removing 
it entirely from the Solr dist is likely to do a disservice to those folks who 
are in fact doing text analytics with it.

 

All that being said, I think the only thing that really remains is a clean-up 
of the documentation and examples.  I'm happy to do that over the next couple 
of weeks if we agree to this strategy.

 

Thanks so much.

 

> UIMA enhancements to allow for dynamic AE detection
> ---
>
> Key: SOLR-12789
> URL: https://issues.apache.org/jira/browse/SOLR-12789
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - UIMA
>Affects Versions: 6.0
>Reporter: Aaron LaBella
>Priority: Major
>  Labels: ready-to-commit
> Attachments: SOLR-12789-1.patch, SOLR-12789-2.patch, 
> SOLR-12789-3.patch, SOLR-12789-4.patch
>
>
> I've been sitting on this patch for over 2 years (and likewise it's been 
> running in production for the same) ... finally got around to contributing it 
> back to the community.  This change prepares the UIMAUpdateRequestProcessor 
> to allow subclasses to have additional control over how the analysis engine 
> is selected.  In my case, I wrote a sub-class that allows for *dynamic* 
> detection of the UIMA analysis engine based on the document fields, i.e., a 
> field in the document can be used to select different UIMA configurations and 
> rules.
>  
> Can someone please commit this as soon as possible?  I don't necessarily need 
> it to be back-ported; having it in 7.4.1 would suffice.
> Thanks!
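
To make the idea in the description concrete, here is a minimal sketch of selecting an analysis engine configuration from a document field. It is not the code from the attached patches and does not use any Solr or UIMA API: the class name `DynamicEngineSelector`, the field name `doc_type`, and the descriptor paths are all hypothetical, chosen only for illustration.

```java
import java.util.Map;

// Hypothetical sketch: picks an analysis engine descriptor path based on the
// value of one document field. None of these names come from Solr or UIMA.
final class DynamicEngineSelector {
    private final String selectorField;
    private final Map<String, String> descriptorsByValue;
    private final String defaultDescriptor;

    DynamicEngineSelector(String selectorField,
                          Map<String, String> descriptorsByValue,
                          String defaultDescriptor) {
        this.selectorField = selectorField;
        this.descriptorsByValue = descriptorsByValue;
        this.defaultDescriptor = defaultDescriptor;
    }

    // Returns the descriptor mapped to the document's selector field value,
    // or the default when the field is missing or has no mapping.
    String select(Map<String, Object> document) {
        Object value = document.get(selectorField);
        if (value == null) {
            return defaultDescriptor;
        }
        return descriptorsByValue.getOrDefault(value.toString(), defaultDescriptor);
    }
}
```

In a real subclass of the processor, the returned descriptor would presumably be used to look up or create the matching analysis engine; how that hook is exposed depends on the patch and is not shown here.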



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection

2018-09-21 Thread Alexandre Rafalovitch (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623756#comment-16623756
 ] 

Alexandre Rafalovitch commented on SOLR-12789:
--

Hi Aaron,

It is great to hear that there would be a healthy discussion about this. Please 
feel free to share the outcome of this on the developer list, and it may spark 
further developer discussion too. Nothing is set in stone, given enough 
evidence to the contrary.

Still, just to re-summarize, the issue we were facing was that all the shipped 
examples were dead (Alchemy API...) and over multiple issues we could not 
figure out a way to get to the new local maximum of latest version and useful 
examples (UIMA has a bit of a learning curve). Nor were we able to find anybody 
to help us push the discussion forward within either the development community 
(Jira discussions) or the user community (Solr Users mailing list).

Additionally, we are trying to slim Solr down in general and have done several 
things towards that, including removing Javadoc from the distribution. If you 
were more closely connected to the community, you would see several of these 
drives all pointing in the same general direction. So, having a dead weight we 
could not figure out what to do with over several years was very much "not 
cool" for all those users downloading Solr and trying to navigate their way 
through a very full-featured product. 

And then, of course, there is the fact that we now incorporate Apache OpenNLP 
as well. So, there are trade-offs to keep in mind.

 







[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection

2018-09-21 Thread Aaron LaBella (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623732#comment-16623732
 ] 

Aaron LaBella commented on SOLR-12789:
--

Thanks Alexandre ... I added comments in SOLR-11694 (as you already saw).  I'll 
need to discuss with my IBM colleagues and see what we want to do going 
forward.  Like I said, this is a feature we've built production applications on 
top of, and removing it is just plain "not cool".  There were other ways it 
could've been cleaned up and maintained without removing it.

As a whole, I'm really disappointed in this decision from the Solr community, 
as I have never heard of pulling out a function that works absolutely fine.  The 
reason that I would say it's not "maintained" is that there's nothing wrong with 
the current feature.  If anything, it could be upgraded to a later UIMAJ, but 
even that is easily worked around.  I really see this as a big step backwards 
for Solr with regard to text analytics.







[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection

2018-09-20 Thread Alexandre Rafalovitch (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622329#comment-16622329
 ] 

Alexandre Rafalovitch commented on SOLR-12789:
--

UIMA has been removed as of Solr 7.5, as it was incredibly out of date and the 
UIMA architecture itself has changed significantly. See SOLR-11694. 

I am not quite sure there is a viable next step for this, especially not as a 
blocker for a version that is no longer supported beyond security fixes.

> UIMA enhancements to allow for dynamic AE detection
> ---
>
> Key: SOLR-12789
> URL: https://issues.apache.org/jira/browse/SOLR-12789
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - UIMA
>Affects Versions: 6.0
>Reporter: Aaron LaBella
>Priority: Blocker
>  Labels: ready-to-commit
> Attachments: SOLR-12789-1.patch, SOLR-12789-2.patch, 
> SOLR-12789-3.patch
>
>
> I've been sitting on this patch for over 2 years (and likewise it's been 
> running IN production for the same) ... finally got around to contributing it 
> back to the community.  This change prepares the UIMAUpdateRequestProcessor 
> to allow subclasses to have additional control over how the analysis engine 
> is selected.  In my case, I wrote a sub-class that allows for *dynamic* 
> detection of the UIMA analysis engine based on the document fields.  ie: a 
> field in the document can be used to select different UIMA configurations and 
> rules.
>  
> Can someone please commit this as soon as possible.  I don't necessarily need 
> it to be back-ported, having in 7.4.1 would suffice.
> Thanks!






[jira] [Commented] (SOLR-12789) UIMA enhancements to allow for dynamic AE detection

2018-09-20 Thread Aaron LaBella (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622309#comment-16622309
 ] 

Aaron LaBella commented on SOLR-12789:
--

There is also a patch in here that adds proper support for mapping UIMA feature 
values that are arrays to multivalued fields.
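
As a rough illustration of what mapping array-valued features to multivalued fields could look like, here is a minimal, self-contained sketch. It is not taken from the patch and uses no Solr or UIMA types: the class `MultiValuedFieldMapper`, the method `addFeatureValue`, and the plain `Map` standing in for a document are all assumptions made for the example.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for a document whose fields may
// hold several values. None of these names come from Solr or UIMA.
final class MultiValuedFieldMapper {

    // Adds a feature value to a field. An array contributes one field value
    // per element; any other non-null value contributes a single field value.
    static void addFeatureValue(Map<String, List<Object>> doc,
                                String field,
                                Object featureValue) {
        if (featureValue == null) {
            return; // nothing to index
        }
        List<Object> values = doc.computeIfAbsent(field, k -> new ArrayList<>());
        if (featureValue.getClass().isArray()) {
            int length = Array.getLength(featureValue);
            for (int i = 0; i < length; i++) {
                values.add(Array.get(featureValue, i));
            }
        } else {
            values.add(featureValue);
        }
    }
}
```

Using `java.lang.reflect.Array` lets the same code path handle both object arrays and primitive arrays, at the cost of boxing primitive elements.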

> UIMA enhancements to allow for dynamic AE detection
> ---
>
> Key: SOLR-12789
> URL: https://issues.apache.org/jira/browse/SOLR-12789
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - UIMA
>Affects Versions: 6.0
>Reporter: Aaron LaBella
>Priority: Blocker
>  Labels: ready-to-commit
> Attachments: SOLR-12789-1.patch, SOLR-12789-2.patch, 
> SOLR-12789-3.patch
>
>
> I've been sitting on this patch for 2 years (and likewise it's been running 
> in production for the same) ... finally got around to contributing it back to 
> the community.  This change prepares the UIMAUpdateRequestProcessor to allow 
> subclasses to have additional control over how the analysis engine is 
> selected.  In my case, I wrote a sub-class that allows for *dynamic* 
> detection of the UIMA analysis engine based on the document fields, i.e., a 
> field in the document can be used to select different UIMA configurations and 
> rules.
>  
> Can someone please commit this as soon as possible.
> Thanks!


