This is the type of problem that's fun to think about...

: As for RandomSortField + function queries... I'm not sure I understand how I
: can use that to achieve what I need :-/

the RandomSortField was designed for simple sorting, ie...

  sort=random_1234 desc

...but it can also be used as the input to a function, and (as of 
recently) you can sort on functions. so you could do something like...

  sort=product(price,random_3245) desc

  (https://wiki.apache.org/solr/FunctionQuery)

...which would cause the documents to be semi-randomly sorted, with higher 
priced products skewed to be more likely to appear higher up in the 
results.
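
To get a feel for how that skew behaves, here's a rough Python simulation 
(not Solr code). It assumes each doc gets a uniform pseudo-random value per 
seed, which only approximates what RandomSortField does; the doc ids, prices 
and helper names are all made up for illustration.

```python
import random

def biased_order(docs, seed):
    """Sort docs by price * random value, descending, mimicking
    sort=product(price,random_<seed>) desc. Assumption: the random
    value is uniform in [0, 1); Solr's real values differ in scale."""
    rng = random.Random(seed)
    keyed = [(price * rng.random(), doc_id) for doc_id, price in docs]
    return [doc_id for _, doc_id in sorted(keyed, reverse=True)]

# hypothetical docs: (id, price)
DOCS = [("cheap", 10.0), ("mid", 50.0), ("pricey", 100.0)]

def top_counts(docs, trials=2000):
    """Count how often each doc lands in first place across seeds."""
    counts = {doc_id: 0 for doc_id, _ in docs}
    for seed in range(trials):
        counts[biased_order(docs, seed)[0]] += 1
    return counts

if __name__ == "__main__":
    print(top_counts(DOCS))
```

Every doc can still come out on top for some seed, but the higher priced 
ones should win far more often, which is the "skew" described above.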

So in your case, with your classifications of documents (A, B, C, etc...) 
if you can index a numeric value with each document indicating how much 
you want to "bias" the sort in favor of documents of that type (ie: the 
percentages you mentioned) you could use them that way.

but that would require you to index those biases in advance.

another strategy you could use is to take advantage of the "map" function 
... assign simple numeric ids to each of your classifications (A=1, B=2, 
etc..) and index those numeric ids as some field "code", and then at query 
time you can use the map function to translate them to your bias values...

  sort=product(map(map(map(code,1,1,50),2,2,30),3,3,40),random_3245) desc

...that would give "A" docs a bias of 50, "B" docs a bias of 30, "C" 
docs a bias of 40, etc...
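
If it helps to see what the nested maps do, here's a small Python sketch 
that emulates map(x,min,max,target) and builds the sort string from a dict 
of biases. The helper names (solr_map, bias_for, map_expr) are hypothetical, 
this only mirrors the function's semantics and isn't anything Solr ships.

```python
def solr_map(x, lo, hi, target):
    """Emulates Solr's map(x,min,max,target): values in [lo, hi]
    become target, everything else passes through unchanged."""
    return target if lo <= x <= hi else x

def bias_for(code):
    """Same nesting as map(map(map(code,1,1,50),2,2,30),3,3,40)."""
    return solr_map(solr_map(solr_map(code, 1, 1, 50), 2, 2, 30), 3, 3, 40)

def map_expr(field, biases):
    """Build the nested map(...) string from a {code: bias} dict,
    wrapping the innermost map first."""
    expr = field
    for code, bias in biases.items():
        expr = f"map({expr},{code},{code},{bias})"
    return expr

BIASES = {1: 50, 2: 30, 3: 40}  # A=1, B=2, C=3 as in the example above
sort_param = f"product({map_expr('code', BIASES)},random_3245) desc"

if __name__ == "__main__":
    print(sort_param)
```

One thing the emulation makes visible: the maps are applied inner to outer, 
so a bias value that happens to equal a later code (say, a bias of 3 for 
"A") would get remapped again by the outer map. Keeping the bias values 
outside the range of your codes avoids that.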

With Solr 4.x, there will also be functions that let you get the docfreq 
of a term in the index, so you could use inverse functions to make the 
bias multipliers driven directly by how common a doc class is ... but 
based on your description, it sounds like you want this to be more user 
driven anyway...

: > > was thinking I would essentially "boost" types B, C, D, E,
: > > F until all types
: > > are approximately evenly represented in the random
: > > assortment. (Or
: > > alternatively, if the user has an affinity for type B
: > > documents, further
: > > boost type B documents so that they're more likely to be
: > > represented than
: > > other types).

-Hoss
