Hello Sergio. I noticed a custom similarity class in the schema excerpt, and I'm not sure how it relates to the problem. Could you run the query with debugQuery=true and check the explanations for the matches whose scores differ from what you expect?
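In case it helps, here is a rough sketch of how such a request could be built and how the per-document explanations could be pulled out of the JSON response. The base URL and the core name ("mycore") are placeholders I made up, and I am assuming PersonIDDoc is a stored field you can list in fl. The helper only builds the URL and reads an already-parsed response; it does not contact Solr.

```python
from urllib.parse import urlencode


def build_debug_url(base_url, core, query, rows=10):
    """Build a /select URL that asks Solr for score explanations."""
    params = {
        "q": query,
        "fl": "PersonIDDoc,score",  # assumption: PersonIDDoc is stored
        "rows": rows,
        "debugQuery": "true",       # adds the "debug" section to the response
        "wt": "json",
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


def explanations(response):
    """Return the explain map (document key -> explanation) from a parsed
    JSON response, or an empty dict if no debug section is present."""
    return response.get("debug", {}).get("explain", {})


if __name__ == "__main__":
    # Hypothetical host and core name; replace with your own.
    print(build_debug_url("http://localhost:8983/solr", "mycore",
                          'SmartSearchS:"main director"'))
```

Comparing the explanations of -730007 and -730008 side by side should show which factor (term frequency, norms, or the custom similarity) produces the difference.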

On Fri, Feb 24, 2023 at 1:31 PM Sergio García Maroto <[email protected]> wrote:

> Hi all,
>
> I am dealing with an scoring challenge I am not fully sure how to solve.
> I have the following field which contains a list of job records in a
> multivalue field. All data is concatenated as below
> Problem is I am getting a smaller record with less jobs first due to
> scoring penalizing the biggest records even they have more times the tokens
> to be matched.
> I used omitnorms=true and then I am getting same score.
>
> Question is?
> Can i get more score for the record with more tokens changing something?
>
> Query:
> ((SmartSearchS:"main director [$CU] [$PRJ] "~100)^5 OR
> ((SmartSearchS:(main) AND (SmartSearchS:(director*))) OR
> ((SmartSearchS:("main director")))^3))
>
> <doc>
> <str name="PersonIDDoc">-730007</str>
> <arr name="JobSearch">
>   <str>[$CU] [$PRJ] main director Ucona corporation Illinois Midwest(USA) United States of America North America</str>
> </arr>
> <doc>
> <str name="PersonIDDoc">-730008</str>
> <arr name="JobSearch">
>   <str>[$CU] [$PRJ] main director Ucona corporation Illinois Midwest(USA) United States of America North America</str>
>   <str>[$CU] [$PRJ] main director Ucona corporation Illinois Midwest(USA) United States of America North America</str>
> </arr>
> </doc>
>
> This the field and type
> <field name="JobSearch" type="JobSearchField" indexed="true" stored="true"
>        multiValued="true" />
>
> <fieldType name="JobSearchField" class="solr.TextField" positionIncrementGap="500">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory" generateWordParts="0"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1"
>             splitOnCaseChange="0" preserveOriginal="0" protected="protwords.txt"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SynonymGraphFilterFactory" synonyms="positionsynonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.SynonymGraphFilterFactory" synonyms="locationsynonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.FlattenGraphFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory" generateWordParts="0"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1"
>             splitOnCaseChange="0" preserveOriginal="0" protected="protwords.txt"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <similarity class="com.spencerstuart.similarities.SpencerStuartNoSimilarity"></similarity>
> </fieldType>
>
>
> Thanks a lot
> Sergio Maroto
>

--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!
