Hello list.

After discussing the thread "LanguageIdentifierUpdateProcessor uses only 
firstValue() on multivalued fields" on solr-user,
I would like to propose a patch that adds the following feature:

LanguageIdentifierUpdateProcessor should use all (String) values of a 
multivalued field for language detection.

Currently, the LUP implicitly retrieves only the first value of a multivalued
field, so all other values of that field are ignored. Furthermore, if for some
reason the first value is not a String but later values are Strings,
there is no language detection at all for that multivalued field.
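
To illustrate the behaviour I have in mind, here is a minimal, self-contained
sketch. It is not the patch itself and does not use the Solr API: the class
and method names (MultiValueConcat, concatStringValues) are hypothetical, and
a plain Collection stands in for the values of one multivalued field. It
skips non-String values and joins the remaining ones into a single text that
could then be handed to the language detector.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical illustration only; not part of Solr or of the proposed patch.
public class MultiValueConcat {

    // Joins all String values of a multivalued field with a single space.
    // Non-String values are skipped instead of aborting detection.
    static String concatStringValues(Collection<?> values) {
        StringBuilder sb = new StringBuilder();
        if (values == null) {
            return "";
        }
        for (Object v : values) {
            if (v instanceof String) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append((String) v);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The first value is not a String; the later values still contribute.
        Collection<Object> values =
            Arrays.<Object>asList(42, "Hallo Welt", "Guten Morgen");
        System.out.println(concatStringValues(values));
    }
}
```

With only the first value considered, the example field above would yield no
text for detection at all; with the sketch, both String values are used.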

I am proposing this patch here first, following your contribution guidelines.
It is unclear to me whether this scenario was simply overlooked or whether it
was a conscious design decision.

So, let me know what you think of this feature. Maybe you are already working
on it.
If not, I'm eager to file my (probably first) feature request and patch on
JIRA.
I have a working trunk checkout set up in IDEA on OS X, and "ant clean install"
reports "SUCCESS".


Looking forward to hearing from you!

Regards,
Stephan - srm

