Re: begins with searches

2009-10-31 Thread Avlesh Singh
  It sounds from what you say that I'm going to need to change the field type
  to edgytext. Which won't achieve the result I want, viz. the current "all"
  plus the edgytext. Any way to achieve this?
 
 I guess there is a mismatch of expectations here. A field can be analyzed in
 only ONE way. If your field "all" is of type text, indexing and searching
 go through the analyzers (tokenizers and filters) specified ONLY for the
 text field type. It does not matter if data from an edgytext or any other
 field type is being copied into the field.

 Having said that, converting the "all" field to type edgytext should still
 work fine. All your regular searches on a text field should also work with
 the edgytext field. Ain't it like that?
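
To illustrate the "analyzed in only ONE way" point, here is a minimal Python sketch (not Solr code). The two functions are simplified, hypothetical stand-ins for the text and edgytext field types; the real text chain also applies stop words, stemming and word-delimiter rules. Whatever is copied into a field is analyzed by that field's own chain only:

```python
def analyze_text(value):
    """Simplified stand-in for the 'text' type: whitespace split, strip punctuation, lowercase."""
    tokens = []
    for raw in value.split():
        token = raw.strip(",.").lower()
        if token:
            tokens.append(token)
    return tokens


def analyze_edgytext(value, min_gram=1, max_gram=25):
    """Simplified stand-in for 'edgytext' at index time: the whole value is one
    lowercased token, expanded into its leading-edge n-grams."""
    token = value.lower()
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


author = "Houghton, Bernadette"

# A destination field of type 'text' indexes the copied value with the text chain only.
all_as_text = analyze_text(author)
# The same destination declared as 'edgytext' would index it with the edgy chain only.
all_as_edgy = analyze_edgytext(author)

print(all_as_text)       # individual word tokens
print(all_as_edgy[:10])  # prefixes of the whole value
```

In this sketch the prefix "houghton, b" exists only among the edgytext tokens, and the word token "bernadette" exists only among the text tokens, so one field cannot serve both kinds of search at once.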

 Cheers
 Avlesh


RE: begins with searches

2009-10-29 Thread Bernadette Houghton
G'day Avlesh, converting the "all" field to type edgytext doesn't work as
expected, as the various text analysers etc. don't get to work on that field,
so I get fewer results than expected. And adding the edgy filter into the text
field also yields fewer results. I can work around the issue by setting up a
new "beginswith" edgytext field and using copyField to copy the relevant
fields into it.

But this approach doesn't really suit our parent application's main search
screen, which is a single box labelled "quick search". Users will be puzzled
as to why a search for beginswith:"Houghton, b" yields 20 results, while a
search for "Houghton, b" yields 10. They will also be puzzled as to why
"Houghton, b*" won't work as they expect - people are already familiar with
using wildcards. A way to get around this user perception problem is to get
rid of the single search box and set up a series of drop down boxes for type
of search ("begins with", etc.), along with field names. We might have to go
there, but the ideal solution from our perspective would be for users to be
able to enter terms in the "quick search" box without any field prefix, and
have solr go off and search all field names/types.
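
The mismatch in result counts described above can be reproduced in a small Python sketch. This is only an illustration under stated assumptions: the two analysis functions, the sample documents and the `search` helper are all hypothetical and much simpler than Solr's real behaviour. Each destination field receives the raw value (as copyField does) and applies its own analysis:

```python
def text_tokens(value):
    """Very rough stand-in for the 'text' type: lowercase words without punctuation."""
    return {w.strip(",.").lower() for w in value.split() if w.strip(",.")}


def edge_tokens(value, max_gram=25):
    """Rough stand-in for 'edgytext': every leading prefix of the lowercased value."""
    v = value.lower()
    return {v[:n] for n in range(1, min(max_gram, len(v)) + 1)}


# Hypothetical sample data: document id -> author value.
docs = {1: "Houghton, Bernadette", 2: "Smith, B. Houghton"}

# copyField-style fan-out: both fields get the raw value, each analyzes it differently.
index = {
    "all": {doc_id: text_tokens(v) for doc_id, v in docs.items()},
    "beginswith": {doc_id: edge_tokens(v) for doc_id, v in docs.items()},
}


def search(field, query):
    """Match a query against one field using that field's own analysis."""
    if field == "all":
        wanted = text_tokens(query)  # every word must be present (AND)
        return {d for d, toks in index[field].items() if wanted <= toks}
    return {d for d, toks in index[field].items() if query.lower() in toks}


query = "Houghton, b"
print(sorted(search("all", query)))         # documents containing both words
print(sorted(search("beginswith", query)))  # documents whose value starts with the query
```

The same query string selects different documents in each field, which is the source of the confusion for "quick search" users. Searching both fields and merging the results is one possible direction, but whether that suits a given application is a design decision this sketch does not settle.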

By the way, our text field type config is currently set as -

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
            ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Bern



RE: begins with searches

2009-10-28 Thread Bernadette Houghton
Here are the "all" code snippets -

<!-- catchall field, containing all other searchable text fields (implemented
     via copyField further on in this schema) -->
<field name="all" type="text" indexed="true" stored="false"
       multiValued="true"/>
.
.
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>all</defaultSearchField>
.
.
<!-- Copy for ALL search -->
<copyField source="*_t" dest="*_t_ft"/>
<copyField source="*_mt" dest="*_mft"/>
<copyField source="content" dest="all"/>
<copyField source="*_t" dest="all"/>
<copyField source="*_mt" dest="all"/>

It sounds from what you say that I'm going to need to change the field type to
edgytext. Which won't achieve the result I want, viz. the current "all" plus
the edgytext. Any way to achieve this?

Thanks!
bern

Re: begins with searches

2009-10-28 Thread Avlesh Singh

 It sounds from what you say that I'm going to need to change the field type
 to edgytext. Which won't achieve the result I want, viz. the current "all"
 plus the edgytext. Any way to achieve this?

I guess there is a mismatch of expectations here. A field can be analyzed in
only ONE way. If your field "all" is of type text, indexing and searching
go through the analyzers (tokenizers and filters) specified ONLY for the
text field type. It does not matter if data from an edgytext or any other
field type is being copied into the field.

Having said that, converting the "all" field to type edgytext should still
work fine. All your regular searches on a text field should also work with
the edgytext field. Ain't it like that?

Cheers
Avlesh


Re: begins with searches

2009-10-27 Thread Avlesh Singh
You are right about the parsing of query terms without a double quote
(solrQueryParser's defaultOperator has to be AND in your case). For the
problem at hand, two things -

   1. Do you have any reason for not doing a PhraseQuery (query terms
   enclosed in double quotes) on your edgytext field? If not, then you can
   always enclose your query in double quotes to get the expected "begins
   with" matches.
   2. You can always escape your query string before passing it to Solr; then
   you wouldn't need to pass your query term in double quotes. For example,
   the query string "surname, fre" when escaped would be converted into
   "surname,\+fre", thereby asking Solr to treat this as a single query term.
   For more details -
   http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Escaping%20Special%20Characters
   If you use SolrJ, there is a ClientUtils class somewhere in the package
   which has helper functions to achieve query escaping.
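
The escaping idea in point 2 can be sketched in a few lines of Python. This is a hypothetical helper written for illustration, not the SolrJ ClientUtils code; the character list follows the Lucene query syntax page linked above, and escaping whitespace as well is an assumption made here so that the whole string stays a single term:

```python
# Characters with special meaning in the Lucene query syntax.
# '&' and '|' are only special when doubled; escaping single ones is harmless.
LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&')


def escape_query(text):
    """Prefix every special character (and, by assumption, whitespace) with a backslash."""
    out = []
    for ch in text:
        if ch in LUCENE_SPECIAL or ch.isspace():
            out.append("\\")
        out.append(ch)
    return "".join(out)


print(escape_query("surname, fre"))  # prints: surname,\ fre
```

With the space escaped, the query parser no longer splits the input into two terms, so a KeywordTokenizer-based field receives "surname, fre" as one term without needing double quotes.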

Cheers
Avlesh

On Tue, Oct 27, 2009 at 9:22 AM, Bernadette Houghton 
bernadette.hough...@deakin.edu.au wrote:

 Thanks for this suggestion (thanks Gerald also: no, we're not using
 BlackLight-type prefixes).

 I've set up an edgytext fieldType in schema.xml thus -

 <fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
             maxGramSize="25"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 And defined a field name thus -

 <dynamicField name="*author_mt" type="edgytext" indexed="true"
               stored="true" multiValued="true"/>

 The results are mixed -

 * searches such as "surname, f" and "surname, fre" (with quotations and
 commas) work well, retrieving "surname, f", "surname, Fred", "surname,
 Frederick" etc.
 * searches such as the above but without quotations don't work too well, as
 they get parsed as author_mt:surname + author_mt:firstname, with solr
 reading the query as "author beginning with surname AND author beginning
 with firstname", which yields nil results.
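
The two outcomes above can be reproduced with a small Python sketch. It is a simplified model, not Solr: `edge_ngrams` imitates the index-time chain of the edgytext type (keyword tokenizer, lowercase, edge n-grams from 1 to 25), and the two match functions are hypothetical stand-ins for how a quoted and an unquoted query reach the field:

```python
def edge_ngrams(value, min_gram=1, max_gram=25):
    """Index-time tokens: all leading prefixes of the lowercased whole value."""
    token = value.lower()
    return {token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)}


def matches_quoted(query, indexed_value):
    """Quoted query: the whole string reaches the query analyzer as one term."""
    return query.lower() in edge_ngrams(indexed_value)


def matches_unquoted(query, indexed_value):
    """Unquoted query: the parser splits on whitespace first; with AND as the
    default operator, every piece must match on its own."""
    grams = edge_ngrams(indexed_value)
    return all(part.lower() in grams for part in query.split())


authors = ["surname, f", "surname, Fred", "surname, Frederick", "other, Fred"]

print([a for a in authors if matches_quoted("surname, fre", a)])
print([a for a in authors if matches_unquoted("surname, fre", a)])
```

In this model the quoted form matches "surname, Fred" and "surname, Frederick", while the unquoted form matches nothing, because no indexed value begins with "fre".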

 Is there an analyser that will strip the whitespace out altogether? Or
 another alternative?

 bern

 -Original Message-
 From: Avlesh Singh [mailto:avl...@gmail.com]
 Sent: Monday, 26 October 2009 6:32 PM
 To: solr-user@lucene.apache.org
 Subject: Re: begins with searches

 Read up on setting up these kinds of searches here -

 http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

 Cheers
 Avlesh

 On Mon, Oct 26, 2009 at 7:43 AM, Bernadette Houghton 
 bernadette.hough...@deakin.edu.au wrote:

  We need to offer "begins with" type searches, e.g. a search for "surname,
  f" will retrieve "surname, firstname", "surname, f", "surname fm" etc.

  Ideally, the user would be able to enter something like "surname f*".

  However, wildcards don't work on phrase searches, nor do range searches.

  Any suggestions as to how best to search for "begins with" phrases; or, how
  to best configure solr to support such searches?
 
  TIA
  Bernadette Houghton, Library Business Applications Developer
  Deakin University Geelong Victoria 3217 Australia.
  Phone: 03 5227 8230 International: +61 3 5227 8230
  Fax: 03 5227 8000 International: +61 3 5227 8000
  MSN: bern_hough...@hotmail.com
  Email: bernadette.hough...@deakin.edu.au
  Website: http://www.deakin.edu.au
  Deakin University CRICOS Provider Code 00113B (Vic)
 

RE: begins with searches

2009-10-27 Thread Bernadette Houghton
Thanks Avlesh. The issue with not doing a phrase query on my edgytext field
was that my parent application was adding an escape character to the quotation
marks, and I was hoping to fix (or rather, work around) this at the solr end
to save maintenance overhead. But I've done a hack in the parent application
to remove those escape chars, and all is working well in that respect.

My next issue relates to how to get the results of the author field to come up
in a search across all fields. For example, a search on author:"Houghton, B"
(which uses the edgytext) yields 16 documents, but a search on
all:"Houghton, B" (which doesn't) yields only 9. I thought the solution should
be <copyField source="*author_mt" dest="all"/> but that doesn't do the trick.

Thanks!

bern

Re: begins with searches

2009-10-27 Thread Avlesh Singh

 My next issue relates to how to get the results of the author field to come
 up in a search across all fields. For example, a search on
 author:"Houghton, B" (which uses the edgytext) yields 16 documents, but a
 search on all:"Houghton, B" (which doesn't) yields only 9. I thought the
 solution should be <copyField source="*author_mt" dest="all"/> but that
 doesn't do the trick.


Do you have a field called "all"? How is it set up? Can you post the
schema.xml snippet relating to this field here?
copyField is supported for a dynamic field source.
<copyField source="*author_mt" dest="all"/> should work for you as long as you
have a field called "all" defined in your schema. Moreover, for your specific
use case, the "all" field needs to be of type edgytext.

Cheers
Avlesh

On Wed, Oct 28, 2009 at 9:35 AM, Bernadette Houghton 
bernadette.hough...@deakin.edu.au wrote:

 Thanks Avlesh. The issue with not doing a phrase query on my edgytext
 field was that my parent application was adding an escape character to the
 quotation marks, and I was hoping to fix (or rather, work around) at the
 solr end to save maintenance overhead. But I've done a hack in the parent
 application to remove those escape chars, and all is working well in that
 respect.

 My next issue relates to how to get the results of the author field come up
 in a search across all fields. For example, a search on author:Houghton, B
 (which uses the edgytext) yields 16 documents, but a search on
 all:Houghton, B (which doesn't) yields only 9. I thought the solution
 should be copyfield source=*author_mt dest=all/ but that doesn't do
 the trick.

 Thanks!

 bern
 -Original Message-
 From: Avlesh Singh [mailto:avl...@gmail.com]
 Sent: Tuesday, 27 October 2009 5:54 PM
 To: solr-user@lucene.apache.org
 Subject: Re: begins with searches

 You are right about the parsing of query terms without a double quote
 (solrQueryParser's defaultOperator has to be AND in your case). For the
 problem at hand, two things -

1. Do you have any reason for not doing a PhraseQuery (query terms
enclosed in double quotes) on your edgytext field? If not then you can
   always enclose your query in double quotes to get expected begins with
   matches.
2. You can always escape your query string before passing to Solr; and
you wouldn't need to pass your query term in double quotes. For exapmle,
   search for the query string - surname, fre when escaped would be
 converted
   into surname,\+fre thereby asking Solr to treat this as a single query
 term.
   For more details -

 http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Escaping%20Special%20Characters
 .
   If you use SolrJ, there is a ClientUtils class somewhere in the package
   which has helper functions to achieve query escaping.

 Cheers
 Avlesh

 On Tue, Oct 27, 2009 at 9:22 AM, Bernadette Houghton 
 bernadette.hough...@deakin.edu.au wrote:

  Thanks for this suggestion (thanks Gerald also: no, we're not using
  BlackLight-type prefixes).
 
  I've set up an edgytext fieldType in schema.xml thus -
 
  <fieldType name="edgytext" class="solr.TextField"
  positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
  maxGramSize="25"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
  </fieldType>
 
  And defined a field name thus -
 
  <dynamicField name="*author_mt" type="edgytext" indexed="true"
   stored="true" multiValued="true"/>
 
  The results are mixed -
 
  * searches such as "surname, f" and "surname, fre" (with quotations and
  commas) work well, retrieving "surname, f", "surname, Fred", "surname,
  Frederick" etc etc
  * searches such as the above but without quotations don't work too well
  as they get parsed as author_mt:surname + author_mt:firstname, with solr
  reading the query as "author beginning with surname AND author beginning
  with firstname", which yields nil results.
 
  Is there an analyser that will strip the whitespace out altogether? Or
  another alternative?
 
  bern
 
  -Original Message-
  From: Avlesh Singh [mailto:avl...@gmail.com]
  Sent: Monday, 26 October 2009 6:32 PM
  To: solr-user@lucene.apache.org
  Subject: Re: begins with searches
 
  Read up on setting up these kinds of searches here -
 
 
 http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
 
  Cheers
  Avlesh
 
  On Mon, Oct 26, 2009 at 7:43 AM, Bernadette Houghton 
  bernadette.hough...@deakin.edu.au wrote:
 
   We need to offer "begins with" type searches, e.g. a search for
   "surname, f" will retrieve "surname, firstname", "surname, f",
   "surname fm" etc.
  
   Ideally, the user would be able to enter something like surname f*.
  
   However, wildcards don't work on phrase searches, nor do range
 searches.
  
   Any suggestions as to how best to search for begins

Re: begins with searches

2009-10-26 Thread Avlesh Singh
Read up on setting up these kinds of searches here -
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

Cheers
Avlesh

On Mon, Oct 26, 2009 at 7:43 AM, Bernadette Houghton 
bernadette.hough...@deakin.edu.au wrote:

 We need to offer "begins with" type searches, e.g. a search for "surname,
 f" will retrieve "surname, firstname", "surname, f", "surname fm" etc.

 Ideally, the user would be able to enter something like surname f*.

 However, wildcards don't work on phrase searches, nor do range searches.

 Any suggestions as to how best to search for begins with phrases; or, how
 to best configure solr to support such searches?

 TIA
 Bernadette Houghton, Library Business Applications Developer
 Deakin University Geelong Victoria 3217 Australia.
 Phone: 03 5227 8230 International: +61 3 5227 8230
 Fax: 03 5227 8000 International: +61 3 5227 8000
 MSN: bern_hough...@hotmail.com
 Email: bernadette.hough...@deakin.edu.au
 Website: http://www.deakin.edu.au
 Deakin University CRICOS Provider Code 00113B (Vic)

 Important Notice: The contents of this email are intended solely for the
 named addressee and are confidential; any unauthorised use, reproduction or
 storage of the contents is expressly prohibited. If you have received this
 email in error, please delete it and any attachments immediately and advise
 the sender by return email or telephone.
 Deakin University does not warrant that this email and any attachments are
 error or virus free




Re: begins with searches

2009-10-26 Thread Gerald Snyder
Are you using field name suffixes like Blacklight's (xxx_text, xxx_facet,
xxx_string)? With the xxx_string field you can request a "begins with"
search, but you may need different search term normalization than with a
_text search.



Gerald Snyder
Florida Center for Library Automation


Bernadette Houghton wrote:

We need to offer "begins with" type searches, e.g. a search for "surname, f"
will retrieve "surname, firstname", "surname, f", "surname fm" etc.

Ideally, the user would be able to enter something like surname f*.

However, wildcards don't work on phrase searches, nor do range searches.

Any suggestions as to how best to search for begins with phrases; or, how to 
best configure solr to support such searches?

TIA


  


RE: begins with searches

2009-10-26 Thread Bernadette Houghton
Thanks for this suggestion (thanks Gerald also: no, we're not using 
BlackLight-type prefixes).

I've set up an edgytext fieldType in schema.xml thus -

<fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100">
 <analyzer type="index">
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
 </analyzer>
 <analyzer type="query">
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

And defined a field name thus -

<dynamicField name="*author_mt" type="edgytext" indexed="true"
stored="true" multiValued="true"/>

The results are mixed -

* searches such as "surname, f" and "surname, fre" (with quotations and commas)
work well, retrieving "surname, f", "surname, Fred", "surname, Frederick" etc
etc
* searches such as the above but without quotations don't work too well as they
get parsed as author_mt:surname + author_mt:firstname, with solr reading the
query as "author beginning with surname AND author beginning with firstname",
which yields nil results.

Is there an analyser that will strip the whitespace out altogether? Or another 
alternative?

bern
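The mixed results described above can be reproduced with a small Python simulation of the edgytext analyzer chain. This is a sketch under assumptions, not Solr code: it models the index side as keyword tokenizer, lowercase, then front edge n-grams from 1 to 25 characters, models the unquoted case as the query parser splitting on whitespace and AND-ing the parts, and uses made-up author values.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    # Front-edge n-grams of one token: "abc" -> {"a", "ab", "abc"}.
    upper = min(max_gram, len(token))
    return {token[:n] for n in range(min_gram, upper + 1)}

def index_terms(value):
    # Index analyzer: the whole value is a single token (keyword
    # tokenizer), lowercased, then expanded into edge n-grams.
    return edge_ngrams(value.lower())

def matches(query_term, stored_value):
    # Query analyzer: keyword tokenizer + lowercase, no n-grams.
    return query_term.lower() in index_terms(stored_value)

# Made-up sample values for illustration only.
authors = ["Surname, F", "Surname, Fred", "Surname, Frederick", "Other, Fred"]

# Quoted query: the whole string reaches the analyzer as one term.
quoted_hits = [a for a in authors if matches("surname, fre", a)]

# Unquoted query: split on whitespace first, every part must match.
parts = "surname, fre".split()
unquoted_hits = [a for a in authors if all(matches(p, a) for p in parts)]

print(quoted_hits)
print(unquoted_hits)
```

In this model the second part, "fre", is never the start of a stored value, which is why the unquoted form returns nothing.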

-Original Message-
From: Avlesh Singh [mailto:avl...@gmail.com] 
Sent: Monday, 26 October 2009 6:32 PM
To: solr-user@lucene.apache.org
Subject: Re: begins with searches

Read up on setting up these kinds of searches here -
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

Cheers
Avlesh

On Mon, Oct 26, 2009 at 7:43 AM, Bernadette Houghton 
bernadette.hough...@deakin.edu.au wrote:

 We need to offer "begins with" type searches, e.g. a search for "surname,
 f" will retrieve "surname, firstname", "surname, f", "surname fm" etc.

 Ideally, the user would be able to enter something like surname f*.

 However, wildcards don't work on phrase searches, nor do range searches.

 Any suggestions as to how best to search for begins with phrases; or, how
 to best configure solr to support such searches?

 TIA