Re: Random + Boost?

2011-08-17 Thread Chris Hostetter

This is the type of problem that's fun to think about...

: As for RandomSortField + function queries... I'm not sure I understand how I
: can use that to achieve what I need :-/

the RandomSortField was designed for simple sorting, ie...

  sort=random_1234 desc

...but it can also be used as the input to a function, and (as of 
recently) you can sort on functions. so you could do something like...

  sort=product(price,random_3245) desc

  (https://wiki.apache.org/solr/FunctionQuery)

...which would cause the documents to be semi-randomly sorted, with higher 
priced products skewed to be more likely to appear higher up in the 
results.
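
To see why multiplying by a random value skews the order rather than fixing 
it, here is a rough Python simulation (not Solr code). It assumes the random 
field behaves like an independent uniform value in [0, 1) per document, which 
is a simplification of what RandomSortField actually produces; the name 
biased_random_order and the sample weights are made up for illustration.

```python
import random


def biased_random_order(docs, seed):
    """Order docs by weight * pseudo-random value, highest first.

    docs is a list of (doc_id, weight) pairs; weight stands in for `price`.
    seed stands in for the number in a field name like random_3245.
    """
    rng = random.Random(seed)
    # Assumption: one uniform draw in [0, 1) per document.
    scored = [(weight * rng.random(), doc_id) for doc_id, weight in docs]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored]


if __name__ == "__main__":
    docs = [("cheap", 1.0), ("pricey", 10.0)]
    trials = 1000
    wins = sum(biased_random_order(docs, s)[0] == "pricey" for s in range(trials))
    print(f"pricey ranked first in {wins} of {trials} seeds")
```

Under the uniform assumption, a document with ten times the weight wins a 
head-to-head comparison about 95% of the time, so the cheaper document still 
shows up first occasionally.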

So in your case, with your classifications of documents (A, B, C, etc...) 
if you can index a numeric value with each document indicating how much 
you want to bias the sort in favor of documents of that type (ie: the 
percentages you mentioned) you could use them that way.

but that would require you to index those biases in advance.

another strategy you could use is to take advantage of the map function 
... assign simple numeric ids to each of your classifications (A=1, B=2, 
etc..) and index those numeric ids in some field (say, "code"), and then at 
query time you can use the map function to translate them to your bias values...

  sort=product(map(map(map(code,1,1,50),2,2,30),3,3,40),random_3245) desc

...that would give A docs a bias of 50, B docs a bias of 30, C 
docs a bias of 40, etc...
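
A small Python emulation of the nested map calls may make the evaluation 
order easier to follow. It is only a sketch of the documented semantics of 
map(x,min,max,target), where values of x inside [min,max] become target and 
everything else passes through unchanged; solr_map and bias_for are 
hypothetical helper names, not part of Solr.

```python
def solr_map(x, lo, hi, target):
    """Emulate map(x,min,max,target): values in [lo, hi] become target."""
    return target if lo <= x <= hi else x


def bias_for(code):
    """Emulate map(map(map(code,1,1,50),2,2,30),3,3,40) for one document."""
    inner = solr_map(code, 1, 1, 50)    # A (code 1) -> 50
    middle = solr_map(inner, 2, 2, 30)  # B (code 2) -> 30
    return solr_map(middle, 3, 3, 40)   # C (code 3) -> 40


if __name__ == "__main__":
    for code in (1, 2, 3, 4):
        print(code, bias_for(code))
```

Two things fall out of this: a code with no matching map (4 here) keeps its 
raw id as its bias, and a bias value that happens to fall inside a later 
range (for example a bias of 2 or 3) would get remapped again, so the bias 
values should be chosen to stay clear of the id ranges.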

With Solr 4.x, there will also be functions that let you get the docfreq 
of a term in the index, so you could use inverse functions to make the 
bias multipliers driven directly by how common a doc class is ... but 
based on your description, it sounds like you want this to be more user 
driven anyway...

:   was thinking I would essentially boost types B, C, D, E,
:   F until all types
:   are approximately evenly represented in the random
:   assortment. (Or
:   alternatively, if the user has an affinity for type B
:   documents, further
:   boost type B documents so that they're more likely to be
:   represented than
:   other types).

-Hoss


Re: Random + Boost?

2011-08-16 Thread Ahmet Arslan
 This might seem odd, but is it possible to use boost with random ordering?
 That is, documents that get boosted are more likely to appear towards the
 top of the ordering (I only display page 1, say 30 documents). Does that
 make sense? I'm assuming that random ordering is, well, really random - so
 then it's not possible. But I figured I'd ask.
 
 My problem is that I want to display a random assortment of documents, but
 unfortunately certain types of documents far outnumber other types. So a
 random assortment ends up with 50% type A, 50% type B, C, D, E, F. So, I
 was thinking I would essentially boost types B, C, D, E, F until all types
 are approximately evenly represented in the random assortment. (Or
 alternatively, if the user has an affinity for type B documents, further
 boost type B documents so that they're more likely to be represented than
 other types).
 
 Anyone know if there's a way to do something like this in Solr?

Sounds like you want to achieve diversity of results.

Consider using http://wiki.apache.org/solr/FieldCollapsing

Alternatively you can make use of RandomSortField with function queries.
http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html


Re: Random + Boost?

2011-08-16 Thread Alexei Martchenko
To make random results I'd use something related to dates and milliseconds,
not boosting. lemme think about this...

-- 

*Alexei Martchenko* | *CEO* | Superdownloads
ale...@superdownloads.com.br | ale...@martchenko.com.br | (11)
5083.1018/5080.3535/5080.3533


Re: Random + Boost?

2011-08-16 Thread Steven Ou
Thank you for the response! I'm learning much about Solr... So I think
FieldCollapsing might do the trick... So if I understand correctly, I should
be able to group by type A, B, C, D, E, F, sort groups randomly, sort within
groups randomly, display simple format, and get an evenly distributed set of
results across A, B, C, D, E, F...

But I'm not sure it's exactly what I need:

   - If I have, in this case, 6 document types, how do I return 25
     documents? From what I understand, I need to set how many groups (let's
     say all 6) * how many documents per group (closest would be 4) = 24
     documents. Would I simply need to return more documents and just process
     the first 25?
   - Also, this appears to me to return *exactly* evenly distributed
     results. How do I:
      - Make certain types a little more likely to appear, or
      - If, suppose, I have only 2 documents in type F and I query for all 6
        groups, at 5 documents per group (for a total of 30 documents), will
        the fact that type F only has 2 documents result in a query that only
        returns 27 documents? If so, I would want to select the missing 3
        documents evenly from the other 5 types.
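
The arithmetic in the list above (25 slots across 6 types, or a short group 
leaving 3 slots unfilled) can be sketched as a client-side round-robin over 
over-fetched groups. This is only an illustration in Python, not something 
FieldCollapsing does for you; fill_page and the sample data are hypothetical, 
and it assumes each group's documents already arrive in random order.

```python
def fill_page(groups, page_size):
    """Take one doc per group per round until the page is full.

    groups maps a type name to its (already shuffled) list of docs.
    Short groups simply drop out, and later rounds keep drawing from
    the groups that still have documents left.
    """
    queues = [list(docs) for docs in groups.values()]
    page = []
    depth = 0
    while len(page) < page_size and any(depth < len(q) for q in queues):
        for q in queues:
            if depth < len(q) and len(page) < page_size:
                page.append(q[depth])
        depth += 1
    return page


if __name__ == "__main__":
    # Types A-E have plenty of documents, type F has only 2.
    groups = {t: [f"{t}{i}" for i in range(10)] for t in "ABCDE"}
    groups["F"] = ["F0", "F1"]
    print(fill_page(groups, 30))
```

With this approach the page size does not have to be a multiple of the 
number of groups, at the cost of fetching more documents per group than are 
finally displayed.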

As for RandomSortField + function queries... I'm not sure I understand how I
can use that to achieve what I need :-/
--
Steven Ou | 歐偉凡

*ravn.com* | Chief Technology Officer
steve...@gmail.com | +1 909-569-9880

