
Thanks a lot for the suggestion. We have now implemented the query as
 q=((+geschichte +rom) OR _query_:{!boost b=0.01}{!join from=expandtype
fromIndex=pages to=id score=avg v='pageno_content:(+geschichte +rom)'})
With a boost factor of 0.01 it seems to work well with our data.
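
In case it is useful to others, here is a minimal Python sketch of how such a request could be assembled and URL-encoded. The helper names (build_boosted_join_query, build_select_url) are hypothetical, the host and core are taken from the example URL in the original mail, and the terms are assumed to be plain words that need no Lucene escaping.

```python
from urllib.parse import urlencode

def build_boosted_join_query(terms, boost=0.01, score_mode="avg"):
    """Build the q parameter: metadata clause OR down-weighted fulltext join."""
    # All terms are mandatory, e.g. "(+geschichte +rom)".
    clause = "(" + " ".join("+" + t for t in terms) + ")"
    # Join from the secondary core (pages) to the primary core via expandtype -> id.
    join = ("{!join from=expandtype fromIndex=pages to=id "
            "score=%s v='pageno_content:%s'}" % (score_mode, clause))
    # Wrap the join in {!boost} so the factor applies to the joined score.
    return "(%s OR _query_:{!boost b=%s}%s)" % (clause, boost, join)

def build_select_url(terms, base="http://server/solr/live/select", **kwargs):
    """Return a URL-encoded select request (base URL is an assumption)."""
    q = build_boosted_join_query(terms, **kwargs)
    return base + "?" + urlencode({"q": q, "fl": "id,score"})

if __name__ == "__main__":
    print(build_select_url(["geschichte", "rom"]))
```

The boost factor is kept as a parameter because 0.01 was chosen by looking at our own data and will probably need tuning for other indexes.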

Best Regards

>>> Mikhail Khludnev <mkhlud...@griddynamics.com> 22.03.2016 12:44 >>>
What if you nest the join into a boost, e.g. q=+foo {!boost ..}{!join ...


if it works, you may vote for

On Tue, Mar 22, 2016 at 12:39 PM, Alena Dengler <
alena.deng...@bsb-muenchen.de> wrote:

> Hello,
> we are currently developing a combined index for book metadata and
> fulltexts. Our primary core contains metadata of ~12Mio. books.
> of them have fulltexts; those fulltexts are indexed in a secondary
> This secondary core has one index document per fulltext page.
> We are joining all matching fulltext pages with the bookwise
> in the primary core. Currently we have the problem that scores for
> with matches from the secondary core are not comparable with matches
> from metadata only. So we are trying to normalize fulltext scores to
> in the same dimension as the metadata scores for non-digitized
> This is a basic query without join using only the primary core
> (metadata):
> http://server/solr/live/select?&q=+geschichte&fl=id,score 
> Top 10 result scores range from 2.0 to 1.7
> For fulltexts, the query is extended with a join:

> Top 10 result scores range from 5.4 to 4.8 (4.7 score points for the
> first hit result from the joined secondary core. We would like to
> this value. See explain output below [1])
> This difference will effectively hide any books without fulltexts
> hitlists, which is not our goal.
> We tried to add lucene boosts to the join subquery, but they do not
> have any effect on the final scores. E.g. we 'down boost' the
> results by a factor of 0.1:
> q=((+geschichte) OR _query_:{!join from=expandtype fromIndex=pages
> to=id score=max v='pageno_content:(+geschichte)^0.1'})
> But the resulting scores are the same as from the join example
> Is this the correct query syntax, or should the boost for the join
> query be put somewhere else?
> Thanks for any suggestions.
> Best Regards
> Alena
> [1] Explain output for the first hit of the join example query
> 5.398742 = sum of:
>   4.816505 = sum of:
>     0.07251295 = max of:
>       0.07251295 = weight(title:geschichte in 10585926)
> [ClassicSimilarity], result of:
>         0.07251295 = score(doc=10585926,freq=1.0), product of:
>           0.037440736 = queryWeight, product of:
>             5.1646385 = idf(docFreq=197504, maxDocs=12713278)
>             0.00724944 = queryNorm
>           1.9367394 = fieldWeight in 10585926, product of:
>             1.0 = tf(freq=1.0), with freq of:
>               1.0 = termFreq=1.0
>             5.1646385 = idf(docFreq=197504, maxDocs=12713278)
>             0.375 = fieldNorm(doc=10585926)
>       0.005904072 = weight(free_search:geschichte in 10585926)
> [ClassicSimilarity], result of:
>         0.005904072 = score(doc=10585926,freq=2.0), product of:
>           0.022005465 = queryWeight, product of:
>             3.035471 = idf(docFreq=1660594, maxDocs=12713278)
>             0.00724944 = queryNorm
>           0.26830027 = fieldWeight in 10585926, product of:
>             1.4142135 = tf(freq=2.0), with freq of:
>               2.0 = termFreq=2.0
>             3.035471 = idf(docFreq=1660594, maxDocs=12713278)
>             0.0625 = fieldNorm(doc=10585926)
>     4.743992 = Score based on join value 957245
>   0.58188105 = weight(statusband:F in 10585926) [ClassicSimilarity],
> result of:
>     0.58188105 = score(doc=10585926,freq=1.0), product of:
>       0.4592555 = queryWeight, product of:
>         50.0 = boost
>         1.2670095 = idf(docFreq=9734121, maxDocs=12713278)
>         0.00724944 = queryNorm
>       1.2670095 = fieldWeight in 10585926, product of:
>         1.0 = tf(freq=1.0), with freq of:
>           1.0 = termFreq=1.0
>         1.2670095 = idf(docFreq=9734121, maxDocs=12713278)
>         1.0 = fieldNorm(doc=10585926)
>   3.5596997E-4 =
> product of:
>     0.00491031 =
>     0.0724944 = boost
>     1.0 = queryNorm

Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

