RE: Need OR in DisMax Query

2009-10-06 Thread Dean Missikowski (Consultant), CLSA
Hi David,

See this thread for how I use OR with Dismax.
http://www.mail-archive.com/solr-user@lucene.apache.org/msg19375.html

-- Dean

-Original Message-
From: Ingo Renner [mailto:i...@typo3.org] 
Sent: 06 October 2009 05:00
To: solr-user@lucene.apache.org
Subject: Re: Need OR in DisMax Query


On 05.10.2009 at 20:36, David Giffin wrote:

Hi David,

 Maybe I'm missing something, but I can't seem to get the dismax
 request handler to perform an OR query. It appears that OR is removed
 by the stop words.

It's not the stop words. Dismax simply doesn't do any boolean
operations; the only things you can do are use +searchWord and
-searchWord, or change to the standard request handler.
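
For illustration, here is a minimal Python sketch of building a dismax request that relies only on the + and - modifiers mentioned above. The function name is hypothetical, and the handler name "dismax" passed as qt is an assumption about how the handler is registered in solrconfig.xml.

```python
from urllib.parse import urlencode


def plus_minus_query(required, prohibited=(), optional=()):
    """Build dismax parameters using only the + and - term modifiers."""
    terms = ["+" + term for term in required]      # must match
    terms += ["-" + term for term in prohibited]   # must not match
    terms += list(optional)                        # may match
    # Assumption: a request handler named "dismax" exists in solrconfig.xml.
    return {"qt": "dismax", "q": " ".join(terms)}


params = plus_minus_query(required=["solr"], prohibited=["nutch"],
                          optional=["lucene"])
# urlencode takes care of escaping the "+" so it is not read as a space.
print(urlencode(params))
```

Anything that needs real AND/OR/NOT grouping still has to go to the standard request handler.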


best
Ingo

-- 
Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2

Apache Solr for TYPO3: http://www.typo3-solr.com



CLSA CLEAN & GREEN: Please consider our environment before printing this email.
The content of this communication is subject to CLSA Legal and Regulatory
Notices.
These can be viewed at https://www.clsa.com/disclaimer.html or sent to you upon
request.



crazy parentheses

2009-04-02 Thread Dean Missikowski (Consultant), CLSA
I've got a problem with parentheses that's driving me crazy.

I'm using a recent nightly build of Solr 1.4.

My index includes these four docs:

doc #1 has title: saints & sinners
doc #2 has title: (saints and sinners)
doc #3 has title: ( saints & sinners )
doc #4 has title: (saints & sinners)

 

when I try any of these searches:

  title:saints & sinners

  title:saints & sinners

  title:saints and sinners

 

Only docs #1-3 are found, but shouldn't doc #4 match too?

The analyzer shows that the tokenizer and filters should find a match.

I'm guessing this might be a bug in WordDelimiterFilterFactory?

I've worked around it by using a PatternReplaceFilterFactory to strip
off the parentheses:

<filter class="solr.PatternReplaceFilterFactory" pattern="[()]"
replacement="" replace="all"/>
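
As a rough illustration of what that workaround does to each token, here is a small Python sketch that applies the same pattern ([()] replaced with nothing, everywhere) to whitespace-split tokens. It only mimics the effect of the filter; it is not how Solr implements it, and the function name is made up.

```python
import re

PARENS = re.compile(r"[()]")


def strip_parens(tokens):
    """Remove every '(' and ')' from each token, like replace="all"."""
    return [PARENS.sub("", token) for token in tokens]


# Whitespace tokenization of the title of doc #4.
tokens = "(saints & sinners)".split()
print(strip_parens(tokens))
```

With the parentheses gone before WordDelimiterFilterFactory runs, the index-side tokens for doc #4 look like those of doc #1.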

 

Any ideas?

 

Thanks, Dean

 

 

Index Analyzer

org.apache.solr.analysis.WhitespaceTokenizerFactory {}

term position     1        2     3
term text         (saints  &     sinners)
term type         word     word  word
source start,end  0,7      8,9   10,18
payload

org.apache.solr.analysis.WordDelimiterFilterFactory {catenateWords=1,
catenateNumbers=1, catenateAll=0, generateNumberParts=1,
generateWordParts=1}

term position     1       3
term text         saints  sinners
term type         word    word
source start,end  1,7     10,17
payload

org.apache.solr.analysis.LowerCaseFilterFactory {}

term position     1       3
term text         saints  sinners
term type         word    word
source start,end  1,7     10,17
payload

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true}

term position     1       3
term text         saints  sinners
term type         word    word
source start,end  1,7     10,17
payload

org.apache.solr.analysis.EnglishPorterFilterFactory
{protected=protwords.txt}

term position     1      3
term text         saint  sinner
term type         word   word
source start,end  1,7    10,17
payload

org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}

term position     1      3
term text         saint  sinner
term type         word   word
source start,end  1,7    10,17
payload

Query Analyzer

org.apache.solr.analysis.WhitespaceTokenizerFactory {}

term position     1       2     3
term text         saints  &     sinners
term type         word    word  word
source start,end  0,6     7,8   9,16
payload

org.apache.solr.analysis.WordDelimiterFilterFactory {catenateWords=1,
catenateNumbers=1, catenateAll=0, generateNumberParts=1,
generateWordParts=1}

term position     1       2
term text         saints  sinners
term type         word    word
source start,end  0,6     9,16
payload

org.apache.solr.analysis.LowerCaseFilterFactory {}

term position     1       2
term text         saints  sinners
term type         word    word
source start,end  0,6     9,16
payload

org.apache.solr.analysis.SynonymFilterFactory {expand=true,
ignoreCase=true, synonyms=synonyms.txt}

term position     1       2
term text         saints  sinners
term type         word    word
source start,end  0,6     9,16
payload

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true}

term position     1       2
term text         saints  sinners
term type         word    word
source start,end  0,6     9,16
payload

org.apache.solr.analysis.EnglishPorterFilterFactory
{protected=protwords.txt}

term position     1      2
term text         saint  sinner
term type         word   word
source start,end  0,6    9,16
payload

org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}

term position     1      2
term text         saint  sinner
term type         word   word
source start,end  0,6    9,16
payload






RE: Exact Match

2009-03-19 Thread Dean Missikowski (Consultant), CLSA
Hi Laurent,

I use the copy field approach and copy the text fields to a custom type
text_exact that I define in my schema.xml. This allows searching for
exact matches anywhere within the text field, which doesn't use tokens
injected by stemming, synonyms or other index-time filters. 

In my application code, I detect when users are performing an exact
match and set up the underlying solr query to use the text_exact
fields by specifying to use an exact match request handler (a modified
definition of the standard dismax request handler in solrconfig.xml) 

Depending on your needs, you might want to do some sort of minimal
analysis on the field (ignore punctuation, lowercase,...) Here's the
text_exact field that I use:

<fieldtype name="text_exact" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0" generateNumberParts="0" catenateWords="0"
            catenateNumbers="0" catenateAll="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldtype>
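
The application-side switch between handlers could be sketched roughly as follows in Python. This is only an illustration under assumptions: the post does not say how an exact-match request is detected, so the sketch treats a query wrapped in double quotes as the signal, and the handler names "dismax" and "dismax_exact" are hypothetical.

```python
def build_request(user_query):
    """Choose a request handler depending on whether an exact match is wanted."""
    q = user_query.strip()
    # Assumption: wrapping the whole query in double quotes means "exact match".
    is_exact = len(q) >= 2 and q.startswith('"') and q.endswith('"')
    # Hypothetical handler names; "dismax_exact" would be a copy of the dismax
    # handler whose qf lists the text_exact copy fields instead of the text fields.
    handler = "dismax_exact" if is_exact else "dismax"
    return {"qt": handler, "q": q}


print(build_request('"saints and sinners"'))
print(build_request("saints and sinners"))
```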

-- Dean

-Original Message-
From: Vauthrin, Laurent [mailto:laurent.vauth...@disney.com] 
Sent: 20/03/2009 6:13 AM
To: solr-user@lucene.apache.org
Subject: Exact Match

Hello again,

 

I believe that this question has been posed before but I just wanted to
make sure I understood my options.  Here's the situation:

 

We have a few fields that are specified as 'text' and a few fields that
are specified as 'string'.  As far as I understand, 'string' will do
exact matches whereas 'text' will do tokenized/contains matches.
However, we have a need to do exact matches on the 'text' field as well.

 

I believe I've seen two approaches for this problem:

1.   Using a copyField configuration and copy the 'text' field to a
'string' field.  Then use the string field when exact matches are
needed.

2.   Append something like '_start_' and '_end_' to the field at
index and search time for exact matches.

 

Are there any solutions to this problem that don't require creating
another field or modifying the data?  (i.e. some sort of query filter?)

 

Thanks,
Laurent






RE: Operators and Minimum Match with Dismax handler

2009-03-17 Thread Dean Missikowski (Consultant), CLSA
I'm using dismax with the default operator set to AND, and I don't use
Minimum Match (it's commented out in solrconfig.xml), meaning 100% of
the terms must match.  Then in my application logic I use a regex that
checks whether the query contains " OR ", and if it does I add mm=1 to
the Solr request to effectively turn the query into an OR. This trick
doesn't work for complex boolean queries, but it works for a simple
xxx OR yyy.
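
A minimal Python sketch of that trick, under assumptions: the exact regex is not given in the post, so the pattern below (an uppercase OR with whitespace on both sides) is a guess, and the function name and the "dismax" handler name are hypothetical.

```python
import re

# Assumed pattern: uppercase OR surrounded by whitespace.
OR_PATTERN = re.compile(r"\sOR\s")


def with_or_support(user_query):
    """Add mm=1 when the user typed a simple 'xxx OR yyy' query."""
    params = {"qt": "dismax", "q": user_query}
    if OR_PATTERN.search(user_query):
        # mm=1 means only one of the terms has to match.
        params["mm"] = "1"
    return params


print(with_or_support("lucene OR query"))
print(with_or_support("lucene query"))
```

The query string itself is passed through unchanged; only the mm parameter differs between the two requests.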

-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: 18/03/2009 10:45 AM
To: solr-user@lucene.apache.org
Subject: Re: Operators and Minimum Match with Dismax handler

: I have an index which we are setting the default operator to AND.
: Am I right in saying that using the dismax handler, the default
: operator in the schema file is effectively ignored? (This is the
: conclusion I've made from testing myself)

correct.

: The issue I have with this, is that if I want to include an OR in my
: phrase, these are effectively getting ignored. The parser is still
: trying to match 100% of the search terms
:
: e.g. 'lucene OR query' still only finds matches for 'lucene AND query'
: the parsed query is: +(((drug_name:lucen) (drug_name:queri))~2) ()

correct.  dismax isn't designed to be used that way (it's a fluke of
the implementation that using AND works at all)

: Does anyone have any advice as to how I could deal with this kind of
: problem?

i would set your mm to something smaller and let your users use + when
they want to make something required.  if you really want to support the
AND/OR/NOT type sequence ... don't use dismax: that type of syntax is
what the standard parser is for.


-Hoss






RE: How to correctly boost results in Solr Dismax query

2009-03-16 Thread Dean Missikowski (Consultant), CLSA
If you just discovered the omitTf parameter because of this post, please
be aware that I didn't really explain its purpose properly, and note
that using it will prevent phrase queries from working. See this thread
for clarification on its use:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200903.mbox/%3c897559.95769...@web50301.mail.re2.yahoo.com%3e

-- Dean


RE: How to correctly boost results in Solr Dismax query

2009-03-15 Thread Dean Missikowski (Consultant), CLSA
Hi,

My experience is that the BQ parameter can be used with any query type.
You can define boosts on the query fields (qf) that are used with the
query terms (q) in your query, AND you can define additional boosts for
fields that are not used with the query terms through the bq or bf
parameters. 

I think the relative weight that a particular boost assigned to a field
via bq has on the overall scoring depends on the other fields in your
query. If you're searching on titles, you might want to consider
setting omitNorms=true (means don't generate length normalization
vectors) for title in your schema.xml and, if you're using Solr 1.4,
omitTf=true (means don't generate term frequency vectors), so that
results aren't skewed by short and long titles, or by titles that
contain multiple occurrences of the same term (setting these requires
you to reindex). I think this should have the effect of making bq boosts
like bq=media:DVD^2&bq=media:BLU-RAY^1.5 more effective.
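
To make the repeated bq parameter concrete, here is a small Python sketch that builds such a request string. The q and qf values are taken from Pete's example further down; the "dismax" handler name is an assumption.

```python
from urllib.parse import parse_qs, urlencode

# A list of pairs (rather than a dict) lets the same parameter appear twice.
params = [
    ("qt", "dismax"),             # assumed handler name
    ("q", "indiana"),
    ("qf", "title"),
    ("bq", "media:DVD^2"),        # boost DVD records
    ("bq", "media:BLU-RAY^1.5"),  # boost BLU-RAY records a little less
]

query_string = urlencode(params)
print(query_string)

# Round trip: both bq values survive as separate parameters.
print(parse_qs(query_string)["bq"])
```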

-- Dean

-Original Message-
From: Pete Smith [mailto:pete.sm...@lovefilm.com] 
Sent: 13/03/2009 7:11 PM
To: solr-user@lucene.apache.org
Subject: Re: How to correctly boost results in Solr Dismax query

Hi,

On Fri, 2009-03-13 at 03:57 -0700, dabboo wrote:
 bq works only with q.alt query and not with q queries. So, in your
 case you would be using qf parameter for field boosting, you will have
 to give both the fields in qf parameter i.e. both title and media.
 
 try this
 
 <str name="qf">media^1.0 title^100.0</str>

But with that, how will it know to rank media:DVD higher than
media:BLU-RAY?

Cheers,
Pete


 Pete Smith-3 wrote:
  
  Hi Amit,
  
  Thanks again for your reply. I am understanding it a bit better but I
  think it would help if I posted an example. Say I have three records:
  
  <doc>
    <long name="id">1</long>
    <str name="media">BLU-RAY</str>
    <str name="title">Indiana Jones and the Kingdom of the Crystal
    Skull</str>
  </doc>
  <doc>
    <long name="id">2</long>
    <str name="media">DVD</str>
    <str name="title">Indiana Jones and the Kingdom of the Crystal
    Skull</str>
  </doc>
  <doc>
    <long name="id">3</long>
    <str name="media">DVD</str>
    <str name="title">Casino Royale</str>
  </doc>
  
  Now, if I search for indiana: select?q=indiana
  
  I want the first two rows to come back (not the third as it does not
  contain 'indiana'). I would like record 2 to be scored higher than
  record 1 as its media type is DVD.
  
  At the moment I have in my config:
  
  <str name="qf">title</str>
  
  And I was trying to boost by media having a specific value by using
  'bq' but from what you told me that is incorrect.
  
  Cheers,
  Pete
  
  
  On Fri, 2009-03-13 at 03:21 -0700, dabboo wrote:
  Pete,
  
  Sorry if I wasn't clear. Here is the explanation.
  
  Suppose you have 2 records and they have films and media as 2 columns.
  
  Now first record has values like films=Indiana and media=blue ray
  and 2nd record has values like films=Bond and media=Indiana
  
  Values for qf parameters
  
  <str name="qf">media^2.0 films^1.0</str>
  
  Now, search for q=Indiana .. it should display both of the records but
  record #2 will display above the 1st.
  
  Let me know if you still have questions.
  
  Cheers,
  amit
  
  
  Pete Smith-3 wrote:
   
   Hi Amit,
   
   Thanks very much for your reply. What you said makes things a bit
   clearer but I am still a bit confused.
   
   On Thu, 2009-03-12 at 23:14 -0700, dabboo wrote:
   If you want to boost the records with their field value then you
   must use q query parameter instead of q.alt. 'q' parameter actually
   uses qf parameters from solrConfig for field boosting.
   
   From the documentation for Dismax queries, I thought that q is simply
   a keyword parameter:
   
   From http://wiki.apache.org/solr/DisMaxRequestHandler:
   q
   The guts of the search defining the main query. This is designed to
   support raw input strings provided by users with no special escaping.
   '+' and '-' characters are treated as mandatory and prohibited
   modifiers for the subsequent terms. Text wrapped in balanced quote
   characters '"' are treated as phrases, any query containing an odd
   number of quote characters is evaluated as if there were no quote
   characters at all. Wildcards in this q parameter are not supported.
   
   And I thought 'qf' is a list of fields and boost scores:
   
   From http://wiki.apache.org/solr/DisMaxRequestHandler:
   qf (Query Fields)
   List of fields and the boosts to associate with each of them when
   building DisjunctionMaxQueries from the user's query. The format
   supported is fieldOne^2.3 fieldTwo fieldThree^0.4, which indicates
   that fieldOne has a boost of 2.3, fieldTwo has the default boost, and
   fieldThree has a boost of 0.4 ... this indicates that matches in
   fieldOne are much more significant than matches in fieldTwo, which are
   more significant than matches in fieldThree.
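
The qf format quoted above can be read mechanically. The following Python sketch splits such a string into per-field boosts; it is only an illustration of the format, not Solr's own parser, and it assumes the default boost is 1.0.

```python
def parse_qf(qf):
    """Split a qf string such as 'fieldOne^2.3 fieldTwo' into {field: boost}."""
    boosts = {}
    for item in qf.split():
        field, caret, boost = item.partition("^")
        # Assumption: a field listed without ^boost gets the default boost of 1.0.
        boosts[field] = float(boost) if caret else 1.0
    return boosts


print(parse_qf("fieldOne^2.3 fieldTwo fieldThree^0.4"))
```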
   
   But if I want to, say, search for films with 'indiana' in the title,
   with media=DVD scoring higher than 

RE: Query Boosting using both BQ and BF

2009-03-09 Thread Dean Missikowski (Consultant), CLSA
I found similar results when trying to use a negative boost with values < 1.
Chris mentioned in this thread, http://markmail.org/message/d2dc4oocrynx7wj2,
that a way to implement a negative boost is to add a positive boost to
everything except the case you want to demote.
So, <str name="bq">-type:story^2.0</str>.

-Original Message-
From: Peter Wolanin [mailto:peter.wola...@acquia.com] 
Sent: 10/03/2009 12:37 AM
To: solr-user@lucene.apache.org
Subject: Re: Query Boosting using both BQ and BF

This doesn't seem to match what I'm seeing in terms of using bq -
using any value > 0 increases the score.  For example, with no bq:

  <str name="q">solr</str>
  <str name="fl">title,score,type</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="35" start="0" maxScore="1.6885357">
 <doc>
  <float name="score">1.6885357</float>
  <str name="title">Building a killer search for Drupal</str>
  <str name="type">wikipage</str>
 </doc>
 <doc>
  <float name="score">1.5547959</float>
  <str name="title">New Solr module available for testing</str>
  <str name="type">story</str>
 </doc>
 <doc>
  <float name="score">1.408378</float>
  <str name="title">Check out the Solr project!</str>
  <str name="type">story</str>
 </doc>


Versus with a bq < 1, the scores of matching docs are still increased
compared to using no bq:

  <str name="bq">type:story^0.5</str>
  <str name="indent">on</str>
  <str name="q">solr</str>
  <str name="fl">title,score,type</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="35" start="0" maxScore="1.6885297">
 <doc>
  <float name="score">1.6885297</float>
  <str name="title">Building a killer search for Drupal</str>
  <str name="type">wikipage</str>
 </doc>
 <doc>
  <float name="score">1.5585454</float>
  <str name="title">New Solr module available for testing</str>
  <str name="type">story</str>
 </doc>
 <doc>
  <float name="score">1.4121282</float>
  <str name="title">Check out the Solr project!</str>
  <str name="type">story</str>
 </doc>



On Sun, Mar 8, 2009 at 9:48 AM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:

 Also note that the following is not doing what you want:

 <str name="bq">-is_mp_parent_b:true^50.0</str>

 You want something like:

 <str name="bq">is_mp_parent_b:true^0.20</str>

 for negative boosting use a boost that is less than 1.0.

 Otis--
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch




Relevancy and date sorting in same search?

2009-03-08 Thread Dean Missikowski (Consultant), CLSA
Hi,

 

I'm trying to think of a way to use both relevancy and date sorting in
the same search. If documents are recent (say published within the last
2 years), I want to use all of the boost functions, BQ parameters, and
normal Lucene scoring functions, but for documents older than two years,
I don't want any of those scores to apply - only a sort by date.

 

My first thought was to perform two separate searches and merge them
together in the controller/UI (not looking forward to managing the
pagination issues). I'm wondering if I could maybe achieve a similar
effect in a single search with maybe a custom Similarity class? 

 

Any ideas much appreciated.






Query Boosting using both BQ and BF

2009-03-07 Thread Dean Missikowski (Consultant), CLSA
Hi,

 

I have a couple of questions related to query boosting using the dismax
request handler. I'm using a recent 1.4 build (to take advantage of
omitTf), and have also tried all of this with 1.3.

To apply a query-time boost to the previous 3 months of documents in my
index I use:

<str name="bq">published_date_d:[NOW-3MONTHS/DAY TO NOW/DAY+1DAY]^10.2</str>

And, to provide a boost that helps rank recent documents higher I use:

<str name="bf">recip(rord(published_date_d),20,5000,5)^5.5</str>

This seems to be working well.

But I have more boosting requirements.  For example, I need to boost
documents that are tagged as printed. So, I tried to add another bq
parameter:

<str name="bq">is_printed_b:true^4.0</str>

I also tried to append this, space-separated, all in one bq parameter
like this:

<str name="bq">published_date_d:[NOW-3MONTHS/DAY TO NOW/DAY+1DAY]^10.2
is_printed_b:true^4.0</str>

Lastly, I need to apply a negative boost to documents of a certain type,
so I use:

<str name="bq">-is_mp_parent_b:true^50.0</str>

Not sure if it matters, but I have <solrQueryParser
defaultOperator="AND"/> in schema.xml.

None of those variations return expected results (it's like the bq is
being applied as a filter instead of just applying boosts).

Q1. Can anyone confirm whether bf and bq can both be used together in
solrconfig.xml?

Q2. Is there a way I can do this using only bf? How?

Q3. Can I have multiple bq parameters? If so, do I space-separate them
as a single bq, or provide multiple bqs?

Q4. Am I formulating my bqs that use Boolean fields correctly?

 

Any help or insights much appreciated,

Thanks Dean

 






RE: Query Boosting using both BQ and BF

2009-03-07 Thread Dean Missikowski (Consultant), CLSA
Some more experiments have helped me answer my own questions.

 Q1. Can anyone confirm whether bf and bq can both 
 be used together in solrconfig.xml?
yes

 Q3. Can I have multiple bq parameters? If so, do I 
 space-separate them as a single bq, or provide 
 multiple BQs?
Yes, multiple bq parameters work, space-separating multiple query terms
in a single bq also works.

Here's a snippet of my solrconfig.xml:

<str name="bq">published_date_d:[NOW-3MONTHS/DAY TO NOW/DAY+1DAY]^5.2 OR
(published_date_d:[NOW-12MONTHS/DAY TO NOW/DAY+1DAY] AND
report_type_id_i:1004)^10.0 OR (published_date_d:[NOW-6MONTHS/DAY TO
NOW/DAY+1DAY] AND is_printed_b:true)^4.0</str>
<str name="bq">is_mp_parent_b:false^10.0</str>
<str name="bf">recip(rord(published_date_d),20,5000,5)^5.5</str>
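
To get a feel for the bf line in the snippet above, here is a small Python sketch of the arithmetic. It assumes Solr's recip(x,m,a,b) computes a/(m*x+b), that rord gives the newest document the ordinal 1, and that the ^5.5 simply multiplies the function value; the ordinals used are made up for illustration.

```python
def recip(x, m, a, b):
    """Reciprocal function as used in Solr function queries: a / (m * x + b)."""
    return a / (m * x + b)


BOOST = 5.5  # the ^5.5 on the bf expression

# Hypothetical reverse ordinals: 1 is the newest document, larger is older.
for rord in (1, 100, 10000):
    raw = recip(rord, 20, 5000, 5)
    print(rord, round(raw, 4), round(raw * BOOST, 4))
```

The value drops quickly as the ordinal grows, which is why newer documents get most of the benefit.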

