[ 
https://issues.apache.org/jira/browse/SOLR-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175189#comment-16175189
 ] 

Christine Poerschke commented on SOLR-11386:
--------------------------------------------


Hi Michael,

Thanks for reporting this strange behaviour! Saw your post on the mailing list 
yesterday but was unable to try out the 
http://gss-test-fusion.usersys.redhat.com:8983 links you shared (and perhaps am 
not meant to be able to try them anyhow).

Everyone investigates things differently of course, but here are a few things 
I might try (and perhaps you've already tried them too), in more or less this 
order:

* The gist for the broken scenario shows
{code}
...
... efi.case_description=added couple of fiber channel efi.case_issue= ...
...
{code}
whereas the gist for the working scenario shows
{code}
...
... efi.case_description=couple of fiber channel added  efi.case_issue= ...
...
{code}
with the latter having an extra space before the {{efi.case_issue=}} - just 
curious if that might be relevant or not.

* It seems that perhaps the {{efi.case_description}} being multi-term has 
something to do with it.
** If it's just one term does it work then (I'd guess so) and is it the 
addition of the second or subsequent terms that results in the strange 
behaviour?
** Do the terms matter maybe? I'd guess not but worth just trying out.
** The {{efi.case_description}} is accompanied by {{efi.case_summary}}, 
{{efi.case_issue}} and {{efi.case_environment}}. That should work and the 
order should not matter, but it might be worth exploring whether changing the 
order (or removing the accompanying {{efi}} s) brings any insights.
*** Specifically I'm wondering if the {{efi.case_description}} 'grabbed' the 
subsequent {{efi}} s (but did not grab them when there was the extra space) and 
then the subsequent {{efi}} s being missing causes the strange behavior. _(If 
that is so then obviously it would be helpful to receive a better error message 
or something from Solr.)_

* Could the example be transferred to and reproduced with Solr's techproducts 
example? 
https://lucene.apache.org/solr/guide/6_6/learning-to-rank.html#LearningToRank-QuickStartExample

* Addition of a test case e.g. in TestExternalFeatures.java to reproduce the 
strange behavior.
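
To make the variations above easier to run through systematically, here is a 
minimal sketch that generates the request URLs: one per growing prefix of the 
description terms, one per ordering of the four {{efi}} parameters, and one 
with every {{efi}} value wrapped in single quotes. The host 
({{localhost:8983}}) and the helper names ({{build_rq}}, {{build_url}}, 
{{variants}}) are my assumptions for illustration; the collection, model and 
{{efi}} names are copied from the issue description. Whether quoting actually 
changes the outcome here is untested and is exactly what the last variant is 
meant to find out.

```python
from itertools import permutations
from urllib.parse import urlencode

# Assumptions for illustration: a local Solr host; the collection, model and
# efi names are copied from the issue description.
BASE_URL = "http://localhost:8983/solr/access/query"
MODEL = "redhat_efi_model"


def build_rq(efis, quote=False):
    """Build the {!ltr ...} local-params string from (name, value) pairs.

    With quote=True each value is wrapped in single quotes so that a
    multi-term value stays attached to its efi key.
    """
    parts = ["!ltr", f"model={MODEL}", "reRankDocs=1"]
    for name, value in efis:
        if quote:
            escaped = value.replace("\\", "\\\\").replace("'", "\\'")
            value = f"'{escaped}'"
        parts.append(f"efi.{name}={value}")
    return "{" + " ".join(parts) + "}"


def build_url(q, efis, quote=False):
    """Return a URL-encoded feature-extraction request."""
    params = {"q": q, "rq": build_rq(efis, quote),
              "fl": "id,score,[features]", "rows": 10}
    return BASE_URL + "?" + urlencode(params)


def variants(description):
    """Yield (label, url) pairs covering the experiments listed above."""
    others = [("case_summary", "the"), ("case_issue", "the"),
              ("case_environment", "the")]
    terms = description.split()
    # One term, then two, and so on, to see when the behaviour changes.
    for n in range(1, len(terms) + 1):
        prefix = " ".join(terms[:n])
        efis = [("case_description", prefix)] + others
        yield f"terms={n}", build_url(prefix, efis)
    # Every ordering of the four efi parameters.
    full = [("case_description", description)] + others
    for order in permutations(full):
        label = "order=" + ",".join(name for name, _ in order)
        yield label, build_url(description, list(order))
    # Quoted values, to check whether unquoted spaces are the trigger.
    yield "quoted", build_url(description, full, quote=True)


if __name__ == "__main__":
    for label, url in variants("added couple of fiber channel"):
        print(label, url)
```

Comparing the {{[features]}} values returned for each label should show 
whether the number of terms, the parameter order, or the quoting makes the 
difference.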

> Extracting learning to rank features fails when word ordering of EFI argument 
> changed.
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-11386
>                 URL: https://issues.apache.org/jira/browse/SOLR-11386
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: contrib - LTR
>    Affects Versions: 6.5.1
>            Reporter: Michael A. Alcorn
>
> I'm getting some extremely strange behavior when trying to extract features 
> for a learning to rank model. The following query incorrectly says all 
> features have zero values:
> http://gss-test-fusion.usersys.redhat.com:8983/solr/access/query?q=added 
> couple of fiber channel&rq={!ltr model=redhat_efi_model reRankDocs=1 
> efi.case_summary=the efi.case_description=added couple of fiber channel 
> efi.case_issue=the efi.case_environment=the}&fl=id,score,[features]&rows=10
> But this query, which simply moves the word "added" from the front of the 
> provided text to the back, properly fills in the feature values:
> http://gss-test-fusion.usersys.redhat.com:8983/solr/access/query?q=couple of 
> fiber channel added&rq={!ltr model=redhat_efi_model reRankDocs=1 
> efi.case_summary=the efi.case_description=couple of fiber channel added 
> efi.case_issue=the efi.case_environment=the}&fl=id,score,[features]&rows=10
> The explain output for the failing query can be found here:
> https://gist.github.com/manisnesan/18a8f1804f29b1b62ebfae1211f38cc4
> and the explain output for the properly functioning query can be found here:
> https://gist.github.com/manisnesan/47685a561605e2229434b38aed11cc65



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
