are you boosting your docs?

2011/8/8 Jason Toy <jason...@gmail.com>
> I am trying to test out and compare different sorts and scoring.
>
> When I use dismax to search for "indie music" with:
> qf=all_lists_text&q="indie+music"&defType=dismax&rows=100
> I see some stuff that seems "irrelevant", meaning in the top results I see
> only 1 or 2 mentions of "indie music", but when I look further down the list
> I do see other docs that have more occurrences of "indie music".
> So I want to test by comparing the different queries versus a list of docs
> ranked specifically by the count of occurrences of the phrase "indie music".
>
> On Mon, Aug 8, 2011 at 2:19 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> > > Dismax queries can. But
> > >
> > > sort=termfreq(all_lists_text,'indie+music')
> > >
> > > is not using dismax. Apparently the termfreq function cannot? I am not
> > > familiar with the termfreq function.
> >
> > It simply returns the TF of the given _term_ as it is indexed for the
> > current document.
> >
> > Sorting on TF like this seems strange, as by default queries are already
> > sorted that way, since TF plays a big role in the final score.
> >
> > > To understand why you'd need to reindex, you might want to read up on how
> > > Lucene actually works, to get a basic understanding of how different
> > > indexing choices affect what is possible at query time. Lucene in Action
> > > is a pretty good book.
> > >
> > > On 8/8/2011 5:02 PM, Jason Toy wrote:
> > > > Are not Dismax queries able to search for phrases using the default
> > > > index (which is what I am using)? If I can already do phrase searches, I
> > > > don't understand why I would need to reindex to be able to access phrases
> > > > from a function.
> > > >
> > > > On Mon, Aug 8, 2011 at 1:49 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> > > >>> Alexei, thank you, that does seem to work.
> > > >>>
> > > >>> My sort results seem to be totally wrong though, I'm not sure if it's
> > > >>> because of my sort function or something else.
> > > >>>
> > > >>> My query consists of:
> > > >>> sort=termfreq(all_lists_text,'indie+music')+desc&q=*:*&rows=100
> > > >>> And I get back 4571232 hits.
> > > >>
> > > >> That's normal, you issued a catch-all query. Sorting should work, but...
> > > >>
> > > >>> None of the results have the phrase "indie music" anywhere in their
> > > >>> data.
> > > >>> Does termfreq not support phrases?
> > > >>
> > > >> No, it is TERM frequency, and "indie music" is not one term. I don't know
> > > >> how this function parses your input, but it might not understand your +
> > > >> escape and think it's one term consisting of exactly that.
> > > >>
> > > >>> If not, how can I sort specifically by termfreq of a phrase?
> > > >>
> > > >> You cannot. What you can do is index multiple terms as one term using
> > > >> the shingle filter. Take care, it can significantly increase your index
> > > >> size and number of unique terms.
> > > >>
> > > >>> On Mon, Aug 8, 2011 at 1:08 PM, Alexei Martchenko
> > > >>> <ale...@superdownloads.com.br> wrote:
> > > >>>> You can use the standard query parser and pass q=*:*
> > > >>>>
> > > >>>> 2011/8/8 Jason Toy <jason...@gmail.com>
> > > >>>>
> > > >>>>> I am trying to list some data based on a function I run,
> > > >>>>> specifically termfreq(post_text,'indie music'), and I am unable to do
> > > >>>>> it without passing in data to the q parameter. Is it possible to get
> > > >>>>> a sorted list without searching for any terms?
> > > >>>>
> > > >>>> --
> > > >>>>
> > > >>>> *Alexei Martchenko* | *CEO* | Superdownloads
> > > >>>> ale...@superdownloads.com.br | ale...@martchenko.com.br | (11)
> > > >>>> 5083.1018/5080.3535/5080.3533
>
> --
> - sent from my mobile
> 6176064373

--
*Alexei Martchenko* | *CEO* | Superdownloads
ale...@superdownloads.com.br | ale...@martchenko.com.br | (11) 5083.1018/5080.3535/5080.3533
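
For what it's worth, here is a rough Python sketch of the point made in the quoted thread: a term-frequency lookup only sees single indexed terms, so the phrase "indie music" counts as zero unless adjacent tokens are also indexed as one term, which is what a shingle filter does. This is not Solr code. The helpers tokenize, shingles and termfreq and the three sample docs are made up for illustration, and the whitespace tokenizer is only a stand-in for whatever analyzer the field really uses.

```python
from collections import Counter

PUNCT = '.,!?"\''

def tokenize(text):
    # Hypothetical analyzer: lowercase, split on whitespace, trim punctuation.
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def shingles(tokens, size=2):
    # Join each run of `size` adjacent tokens into a single term,
    # roughly what a shingle filter emits at index time.
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]

def termfreq(text, term, shingle_size=None):
    # Count `term` among the indexed terms of one document.
    # Without shingles only single tokens are indexed.
    tokens = tokenize(text)
    terms = list(tokens)
    if shingle_size:
        terms += shingles(tokens, shingle_size)
    return Counter(terms)[term]

# Made-up sample documents, not data from the thread.
docs = {
    "a": "Indie music blogs cover indie music and indie music festivals.",
    "b": "I like indie films and pop music.",
    "c": "New indie music this week.",
}

# Plain terms: the two-word phrase is never a single term, so every count is 0.
plain = {doc_id: termfreq(text, "indie music") for doc_id, text in docs.items()}

# With 2-word shingles the phrase is one term and can be counted and sorted on.
shingled = {doc_id: termfreq(text, "indie music", shingle_size=2)
            for doc_id, text in docs.items()}
ranked = sorted(docs, key=lambda doc_id: shingled[doc_id], reverse=True)

print(plain)
print(shingled)
print(ranked)
```

The sketch also shows the cost mentioned in the thread: every document gains one extra term per adjacent token pair, so the number of unique terms in the index grows accordingly.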