I am trying to test and compare different sorts and scoring. When I use dismax to search for "indie music" with:

qf=all_lists_text&q="indie+music"&defType=dismax&rows=100

some of the top results seem irrelevant: they contain only 1 or 2 mentions of "indie music", while docs further down the list contain many more occurrences of the phrase. So I want to compare the results of the different queries against a list of docs ranked specifically by the number of occurrences of the phrase "indie music".
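To make the baseline concrete, the ranking I have in mind could be computed outside Solr roughly like this. This is only a sketch under assumptions: the all_lists_text values are assumed to be already fetched into a Python dict, the doc ids and texts are made up, and the counting is a plain case-insensitive match that ignores whatever analysis (stemming, stopwords) the Solr field applies.

```python
import re

def phrase_count(text, phrase):
    """Count non-overlapping, case-insensitive occurrences of a phrase,
    allowing any run of whitespace between its words."""
    words = [re.escape(w) for w in phrase.split()]
    if not words:
        return 0
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def rank_by_phrase_count(docs, phrase):
    """Return (doc_id, count) pairs sorted by count, highest first."""
    counts = [(doc_id, phrase_count(text, phrase)) for doc_id, text in docs.items()]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)

# Hypothetical sample data standing in for all_lists_text values.
docs = {
    "a": "Indie music blogs and indie  music labels",
    "b": "I like music, especially indie rock",
    "c": "indie music, indie music, INDIE MUSIC",
}
ranking = rank_by_phrase_count(docs, "indie music")
```

The resulting order can then be compared side by side with the order each Solr query returns for the same docs.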
On Mon, Aug 8, 2011 at 2:19 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> > Dismax queries can. But
> >
> > sort=termfreq(all_lists_text,'indie+music')
> >
> > is not using dismax. Apparently termfreq function can not? I am not
> > familiar with the termfreq function.
>
> It simply returns the TF of the given _term_ as it is indexed of the
> current document.
>
> Sorting on TF like this seems strange as by default queries are already
> sorted that way since TF plays a big role in the final score.
>
> > To understand why you'd need to reindex, you might want to read up on how
> > lucene actually works, to get a basic understanding of how different
> > indexing choices affect what is possible at query time. Lucene In Action
> > is a pretty good book.
> >
> > On 8/8/2011 5:02 PM, Jason Toy wrote:
> > > Are not Dismax queries able to search for phrases using the default
> > > index (which is what I am using)? If I can already do phrase searches, I
> > > don't understand why I would need to reindex to be able to access
> > > phrases from a function.
> > >
> > > On Mon, Aug 8, 2011 at 1:49 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> > >>> Alexei, thank you, that does seem to work.
> > >>>
> > >>> My sort results seem to be totally wrong though, I'm not sure if it's
> > >>> because of my sort function or something else.
> > >>>
> > >>> My query consists of:
> > >>> sort=termfreq(all_lists_text,'indie+music')+desc&q=*:*&rows=100
> > >>> And I get back 4571232 hits.
> > >>
> > >> That's normal, you issue a catch all query. Sorting should work but..
> > >>
> > >>> All the results don't have the phrase "indie music" anywhere in their
> > >>> data.
> > >>>
> > >>> Does termfreq not support phrases?
> > >>
> > >> No, it is TERM frequency and indie music is not one term. I don't know
> > >> how this function parses your input but it might not understand your +
> > >> escape and think it's one term consisting of exactly that.
> > >>
> > >>> If not, how can I sort specifically by termfreq of a phrase?
> > >>
> > >> You cannot. What you can do is index multiple terms as one term using
> > >> the shingle filter. Take care, it can significantly increase your index
> > >> size and number of unique terms.
> > >>
> > >>> On Mon, Aug 8, 2011 at 1:08 PM, Alexei Martchenko <ale...@superdownloads.com.br> wrote:
> > >>>> You can use the standard query parser and pass q=*:*
> > >>>>
> > >>>> 2011/8/8 Jason Toy <jason...@gmail.com>
> > >>>>
> > >>>>> I am trying to list some data based on a function I run,
> > >>>>> specifically termfreq(post_text,'indie music') and I am unable to do
> > >>>>> it without passing in data to the q parameter. Is it possible to get
> > >>>>> a sorted list without searching for any terms?
> > >>>>
> > >>>> --
> > >>>>
> > >>>> *Alexei Martchenko* | *CEO* | Superdownloads
> > >>>> ale...@superdownloads.com.br | ale...@martchenko.com.br | (11)
> > >>>> 5083.1018/5080.3535/5080.3533

--
- sent from my mobile
6176064373